Re: Factor: January 2012

Wednesday, January 25, 2012

Colored Timestamps

I noticed a fun post in early December that maps the current time to a "unique" RGBA color, and I thought it might be fun to use Factor to implement a colored clock.

The basic concept is to map the 4,294,967,296 unique RGBA colors to seconds, which gives just over 136 years of unique colors.
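
As a quick sanity check of that figure, here is a small sketch of the arithmetic. Python is used only as a calculator here, and the 365.2425-day average Gregorian year is my assumption rather than something the post specifies:

```python
# Back-of-the-envelope check: how many years do 2**32 seconds cover?
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60  # average Gregorian year (assumption)

unique_colors = 2 ** 32  # one color per 32-bit RGBA value
years = unique_colors / SECONDS_PER_YEAR

print(unique_colors)    # 4294967296
print(round(years, 1))  # 136.1
```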

timestamp>rgba

We calculate timestamps as an offset from Dennis Ritchie's birthday:

: start-date ( -- timestamp )
    1941 9 9 <date> ; inline

The offset is an elapsed number of seconds from the start date:

: elapsed ( timestamp -- seconds )
    start-date time- duration>seconds >integer ;

The conversion from a timestamp into a unique RGBA color does successive divmod operations to map into Red, Green, Blue, and Alpha values:

: timestamp>rgba ( timestamp -- color/f )
    elapsed dup 0 32 2^ 1 - between? [
        24 2^ /mod 16 2^ /mod 8 2^ /mod
        [ 255 /f ] 4 napply <rgba>
    ] [ drop f ] if ;

You can try it for yourself, showing how the values change over time:

IN: scratchpad start-date timestamp>rgba .
T{ rgba
    { red 0.0 }
    { green 0.0 }
    { blue 0.0 }
    { alpha 0.0 }
}

IN: scratchpad now timestamp>rgba .
T{ rgba
    { red 0.5176470588235295 }
    { green 0.3803921568627451 }
    { blue 0.4313725490196079 }
    { alpha 0.3333333333333333 }
}
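
To see that the mapping is reversible, here is a rough Python sketch of the same divmod scheme together with its inverse. It is an illustration under stated assumptions, not a port of the Factor vocabulary: `seconds_to_rgba` and `rgba_to_seconds` are hypothetical names, and time zones are ignored. Decoding the sample color printed above lands on the date of this post:

```python
from datetime import datetime, timedelta

START = datetime(1941, 9, 9)  # Dennis Ritchie's birthday, as in start-date

def seconds_to_rgba(seconds):
    """Split a 32-bit second count into four bytes, scaled to 0.0-1.0."""
    if not 0 <= seconds < 2 ** 32:
        return None  # outside the representable range
    red, rest = divmod(seconds, 2 ** 24)
    green, rest = divmod(rest, 2 ** 16)
    blue, alpha = divmod(rest, 2 ** 8)
    return tuple(channel / 255 for channel in (red, green, blue, alpha))

def rgba_to_seconds(rgba):
    """Inverse mapping (hypothetical helper): recover the second count."""
    red, green, blue, alpha = (round(channel * 255) for channel in rgba)
    return (red << 24) | (green << 16) | (blue << 8) | alpha

# The color printed by the listener session above.
sample = (0.5176470588235295, 0.3803921568627451,
          0.4313725490196079, 0.3333333333333333)
seconds = rgba_to_seconds(sample)
print(seconds)                             # 2220977749
print(START + timedelta(seconds=seconds))  # 2012-01-25 18:15:49
```

The four channels are simply the four bytes of the elapsed-seconds counter, so the alpha channel changes every second while the red channel changes roughly every 194 days (2^24 seconds).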

<rgba-clock>

Let's use the timestamp>rgba word to make an updating "colored clock". Specifically, we can use an arrow model to update a label every second to create an RGBA clock:

: update-colors ( color label -- )
    [ font>> background<< ]
    [ [ <solid> ] dip [ interior<< ] [ boundary<< ] 2bi ]
    2bi ;

: <rgba-clock> ( -- gadget )
    f <label-control>
        time get over '[
            [ timestamp>rgba _ update-colors ]
            [ timestamp>hms ] bi
        ] <arrow> >>model
        "HH:MM:SS" >>string
        monospace-font >>font ;

Use the gadget. word to try it in your listener, and watch it update:

IN: scratchpad <rgba-clock> gadget.

The code for this is on my GitHub.

Friday, January 13, 2012

Friday the 13th

In honor of January 13, 2012, a Friday the 13th, I thought it might be fun to use Factor to explore similar dates in past and future history. According to Wikipedia, such a day "occurs at least once, but at most three times a year".

friday-13th?

A day is "Friday the 13th" if it is both (a) Friday and (b) the 13th:

: friday-13th? ( timestamp -- ? )
    [ day>> 13 = ] [ friday? ] bi and ;

Trying it for today and tomorrow, to make sure it works:

IN: scratchpad now friday-13th? .
t

IN: scratchpad : tomorrow ( -- timestamp )
                   now 1 days time+ ;

               tomorrow friday-13th? .
f

friday-13ths

Getting all of the Friday the 13ths in a given year:

: friday-13ths ( year -- seq )
    12 [1,b] [
        13 <date> dup friday? [ drop f ] unless
    ] with map sift ;

Or, for a range of years:

: all-friday-13ths ( start-year end-year -- seq )
    [a,b] [ friday-13ths ] map concat ;

Trying it for 2012:

IN: scratchpad 2012 friday-13ths .
{
    T{ timestamp
        { year 2012 }
        { month 1 }
        { day 13 }
    }
    T{ timestamp
        { year 2012 }
        { month 4 }
        { day 13 }
    }
    T{ timestamp
        { year 2012 }
        { month 7 }
        { day 13 }
    }
}
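
As an independent cross-check of that result, and of the "at least once, at most three times" claim, here is a minimal Python sketch using only the standard `datetime` module. The function name `friday_13ths` and the 1900 to 2100 range are my own choices for illustration:

```python
from datetime import date

FRIDAY = 4  # date.weekday() counts Monday as 0, so Friday is 4

def friday_13ths(year):
    """Return every Friday the 13th in the given year."""
    return [d for d in (date(year, month, 13) for month in range(1, 13))
            if d.weekday() == FRIDAY]

print([d.isoformat() for d in friday_13ths(2012)])
# ['2012-01-13', '2012-04-13', '2012-07-13']

# Every year from 1900 through 2100 has between one and three of them.
counts = {len(friday_13ths(year)) for year in range(1900, 2101)}
print(sorted(counts))  # [1, 2, 3]
```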

next-friday-13th

We can iterate, looking for the next Friday the 13th:

: next-friday-13th ( timestamp -- date )
    dup day>> 13 >= [ 1 months time+ ] when 13 >>day
    [ dup friday? not ] [ 1 months time+ ] while ;

Trying it for today shows that the next Friday the 13th is April 13, 2012:

IN: scratchpad now next-friday-13th .
T{ timestamp
    { year 2012 }
    { month 4 }
    { day 13 }
}
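
The same search can be sketched in Python to confirm the answer. This is an illustration rather than a translation of the Factor word, and `add_month` is a hypothetical helper introduced for this example:

```python
from datetime import date

def add_month(d):
    """Move a 13th-of-the-month date forward by one month."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 13)

def next_friday_13th(start):
    """Find the next Friday the 13th, skipping this month's 13th
    when the start date is already on or past it."""
    candidate = date(start.year, start.month, 13)
    if start.day >= 13:
        candidate = add_month(candidate)
    while candidate.weekday() != 4:  # 4 is Friday (Monday is 0)
        candidate = add_month(candidate)
    return candidate

print(next_friday_13th(date(2012, 1, 13)))  # 2012-04-13
```

Pinning the candidate to the 13th before stepping through months means this sketch never has to deal with end-of-month overflow, such as adding a month to January 31.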

The code (and some tests) for this is on my GitHub.

Tuesday, January 3, 2012

Duplicate Files

A few months ago, Jon Cooper wrote a duplicate file checker in Go and Ruby.

Below, I contribute a simple version in Factor that runs faster than both Go and Ruby solutions. In the spirit of the original article, I have separated the logic into steps.

Argument Parsing

The command-line vocabulary gives us the arguments passed to the script. We check for the verbose flag and the root directory to traverse:

: arg? ( name args -- args' ? )
    2dup member? [ remove t ] [ nip f ] if ;

: parse-args ( -- verbose? root )
    "--verbose" command-line get arg? swap first ;

Filesystem Traversal

We can traverse the filesystem with the each-file word (choosing breadth-first instead of depth-first). In our case, we want to collect these files into a map of all paths that share a common filename:

: collect-files ( path -- assoc )
    t H{ } clone [
        '[ dup file-name _ push-at ] each-file
    ] keep ;

Our duplicate files are those files that share a common filename:

: duplicate-files ( path -- dupes )
    collect-files [ nip length 1 > ] assoc-filter! ;
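
For comparison, the group-by-filename idea can be sketched in a few lines of Python. This is a rough equivalent under my own assumptions, not one of the benchmarked programs: it uses `os.walk`, whose traversal order differs from breadth-first `each-file`, although the order does not affect the grouping:

```python
import os
import tempfile
from collections import defaultdict

def collect_files(root):
    """Map each file name to the list of full paths that carry it."""
    by_name = defaultdict(list)
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            by_name[name].append(os.path.join(directory, name))
    return by_name

def duplicate_files(root):
    """Keep only the names that occur at more than one path."""
    return {name: paths for name, paths in collect_files(root).items()
            if len(paths) > 1}

# Tiny demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    for relative in ("a/notes.txt", "b/notes.txt", "b/unique.txt"):
        path = os.path.join(root, relative)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    dupes = duplicate_files(root)
    print(sorted(dupes))            # ['notes.txt']
    print(len(dupes["notes.txt"]))  # 2
```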

MD5 Hashing Files

Using the checksums.md5 vocabulary, hashing a file is quite simple:

: md5-file ( path -- string )
    md5 checksum-file hex-string ;
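
A hex digest produced this way should match what other tools report for the same bytes. As a hedged illustration, here is the equivalent step in Python with the standard `hashlib` module; the `chunk_size` parameter is my own addition so that large files are not read into memory at once:

```python
import hashlib
import os
import tempfile

def md5_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demonstration on a temporary file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
result = md5_file(handle.name)
os.remove(handle.name)
print(result)  # 5d41402abc4b2a76b9719d911017c592
```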

Printing Results

If verbose is selected, then we print each filename and the MD5 checksum for each full path:

: print-md5 ( name paths -- )
    [ "%s:\n" printf ] [
        [ dup md5-file "  %s\n    %s\n" printf ] each
    ] bi* ;

We put this all together by calculating the possible duplicate files, optionally printing verbose MD5 checksums, and then printing the total number of duplicates detected:

: run-dupe ( -- )
    parse-args duplicate-files swap
    [ dup [ print-md5 ] assoc-each ] when
    assoc-size "Total duped files found: %d\n" printf ;

Performance

I tested performance using two directory trees, one with over 500 files and another with almost 36,000 files. While the original article focuses more on syntax than speed, it is nice to see that the Factor solution is faster than the Go and Ruby versions.

Duplicates    Factor    Go        Ruby
583           1.453     2.298     3.861
35,953        19.084    24.452    30.597

The times above are in seconds, measured on my laptop.

The code for this is on my GitHub.