Wednesday, August 29, 2012

Literate Programming

Donald Knuth pioneered literate programming as a technique for writing structured programs. Literate programs typically contain more descriptive text than code, so rather than "commenting out" the descriptive text, authors "comment in" the code.

It is popular in some communities, for example Haskell, where even many blog posts are written in Literate Haskell style. This is done by creating a .lhs file instead of a .hs file. In it, every line starting with > is interpreted as code, and everything else is considered a comment.

Graham Telfer asked on the mailing list whether Factor supports literate programming. I thought it might be fun to implement some of the ideas quickly, so below I'll share how it works.

The Lexer

Factor uses a lexer object to turn a stream of text into tokens, which the parser then turns into Factor objects and definitions. We can extend the lexer system to create a version that skips over any line that is not prefixed by a > character (in this prototype, a > followed by a space).
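
As an aside, the selection rule on its own can be sketched with ordinary sequence words, without involving the lexer. This is only an illustration and not how the literate vocabulary works: code-lines is a hypothetical name, and string-lines is assumed to be the same word from the splitting vocabulary that the syntax word later in this post relies on.

```factor
! Hypothetical helper for illustration; not part of the literate vocabulary.
USING: sequences splitting ;
IN: scratchpad

! Keep only the lines that start with "> " and strip that
! two-character marker, leaving just the code.
: code-lines ( text -- seq )
    string-lines [ "> " head? ] filter [ 2 tail ] map ;
```

Given text whose lines are some prose, then "> 8675309 .", then more prose, code-lines should leave a sequence holding just "8675309 .". The lexer-based version below applies the same test, but does so while parsing, so definitions can span several marked lines with prose in between.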

First, we define our literate-lexer sub-class:

TUPLE: literate-lexer < lexer ;

: <literate-lexer> ( text -- lexer ) literate-lexer new-lexer ;

Second, we implement the skip-blank word to skip over all lines that are just comments:

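M: literate-lexer skip-blank
    ! Only inspect the marker at the start of a line.
    dup column>> zero? [
        dup line-text>> [
            "> " head?
            ! Code line: step past the "> " marker, then skip blanks as usual.
            [ [ 2 + ] change-column call-next-method ]
            ! Prose line: move the column to the end of the line to skip it.
            [ [ nip length ] change-lexer-column ]
            if
        ] [ drop ] if*
    ] [ call-next-method ] if ;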

We can then create a quick syntax word that looks for an end token and parses all the lines between:

SYNTAX: <LITERATE
    "LITERATE>" parse-multiline-string string-lines [
        <literate-lexer> (parse-lines) append!
    ] with-nested-compilation-unit ;

Try It

Now, you can do something like this:

IN: scratchpad <LITERATE
               This is a big wall of text with no Factor code...
               Does this work?
               1 1 + .

               I bet it didn't... maybe the following works:
               > 8675309 .
               Yay!

               You can create functions that span multiple lines
               > : foo ( -- x )
               We interrupt this program to bring you this:
               > 12 ;

               Now, we can run foo:
               > foo .
               LITERATE>
8675309
12

It might be nice to automatically support .lfactor files, but this is a quick prototype to see if it makes sense. Not bad for ten or so lines of code?

It is available now in the development branch of Factor, in the literate vocabulary.
