Wednesday, December 28, 2011

Elementology

Question: What do the words bamboo, crunchy, finance, genius, and tenacious have in common? I'll give you a hint: it's the same thing they have in common with the words who, what, when, where, and how.

Stumped? Well, it's not that these are all English words.

Answer: All of these words can be spelled using elements from the periodic table!

I was recently inspired by the Periodic GeNiUS T-shirt from ThinkGeek and a website that can "make any words out of elements in the periodic table". I thought it would be fun to use Factor to see how many other words can be spelled using the symbols for chemical elements.

First, we need a list of elements:

: elements ( -- assoc )
    H{
        { "H" "Hydrogen" }
        { "He" "Helium" }
        { "Li" "Lithium" }
        { "Be" "Beryllium" }
        { "B" "Boron" }
        { "C" "Carbon" }
        ...
        { "Uut" "Ununtrium" }
        { "Uuq" "Ununquadium" }
        { "Uup" "Ununpentium" }
        { "Uuh" "Ununhexium" }
        { "Uus" "Ununseptium" }
        { "Uuo" "Ununoctium" }
    } [ [ >lower ] dip ] assoc-map ;

Next, a word that checks if a particular substring is the symbol of an element:

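! t if the characters of word from index `from` up to (not including) `to`
! spell an element symbol; f otherwise, including when `to` runs past the end.
: element? ( from to word -- ? )
    2dup length > [ 3drop f ] [ subseq elements key? ] if ;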
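As a quick sanity check, here is how this word behaves on slices of "genius". This is only a sketch: it assumes the full elements table (elided above) has been loaded, and the results in the comments follow from reading the definition rather than from a recorded session.

```factor
! Assumes the complete `elements` table from above is defined.
0 2 "genius" element? .   ! "ge" is Germanium: prints t
2 3 "genius" element? .   ! "n" is Nitrogen: prints t
0 1 "genius" element? .   ! "g" is not a symbol: prints f
4 7 "genius" element? .   ! 7 is past the end of a 6-letter word: prints f
```

The last case is why the length check comes first: subseq is never called with an out-of-range index.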

We know that symbols are only ever one, two, or three characters. A word is considered "periodic" if it can be composed of any number of (possibly repeating) element symbols. We build a recursive solution that starts at the first character, tries each one-, two-, and three-character symbol at the current position, and succeeds as soon as some chain of matching symbols reaches the end of the word:

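: (periodic?) ( word from -- ? )
    {
        ! Success: the index has reached the end of the word.
        [ swap length = ]
        ! Otherwise try a symbol of length 1, 2, or 3 starting at `from`.
        [
            { 1 2 3 } [
                dupd + [ pick element? ] keep
                '[ dup _ (periodic?) ] [ f ] if
            ] with any? nip
        ]
    } 2|| ;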

: periodic? ( word -- ? )
    >lower 0 (periodic?) ;
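A few spot checks, again as a sketch that assumes the full elements table is loaded. The decompositions in the comments were worked out by hand from the periodic table, so treat them as illustrations of the search rather than recorded output.

```factor
! Assumes `elements`, `element?`, and `periodic?` from above are defined.
"genius" periodic? .   ! Ge-N-I-U-S is one valid spelling: prints t
"Bamboo" periodic? .   ! lowercased first, then B-Am-B-O-O: prints t
"factor" periodic? .   ! F-Ac, but no symbol is "t", "to", or "tor": prints f
```

The "factor" case shows the backtracking: the only way to start is F followed by Ac, and once every candidate at the fourth letter fails, any? runs out of lengths to try and the whole word is rejected.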

It's easy to get a list of dictionary words from most Unix systems:

: dict-words ( -- words )
    "/usr/share/dict/words" ascii file-lines ;

And then a list of all "periodic words":

: periodic-words ( -- words )
    dict-words [ periodic? ] filter ;
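Saved to a source file rather than typed into the listener, the words above need a USING: line. The following header is a sketch, not a copy of the published code: the vocabulary name elementology is hypothetical, and the vocabulary locations are assumptions that vary between Factor releases (for example, >lower lived in unicode.case at the time of writing and is assumed here to be in unicode).

```factor
! Hypothetical header for a source file holding the words above.
! Vocabulary locations are assumptions and differ between Factor releases.
USING: assocs combinators.short-circuit fry io.encodings.ascii
io.files kernel math sequences unicode ;
IN: elementology
```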

So, how many words are "periodic words"? About 13.7% of them.

IN: scratchpad dict-words length .
235886

IN: scratchpad periodic-words length .
32407
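The percentage quoted above can be checked from these two counts. A minimal sketch using /f, the floating-point division word:

```factor
! 32407 periodic words out of 235886 dictionary words, as a percentage.
32407 235886 /f 100 * .   ! prints a value of roughly 13.74
```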

The code for this is on my GitHub.
