Thursday, October 28, 2010

Syntax Highlighting

Update: I modified the code examples below to use the new colors.hex vocabulary.

A sometimes overlooked feature of Factor is its syntax highlighting vocabulary, which supports a decent number of programming languages. The vocabulary is called xmode, and it is used to highlight entries submitted to Factor's pastebin.

I thought it would be fun to use xmode to implement syntax highlighting in the listener. When we're done, a single word will print a source file in the listener with its tokens colored.

First, we will need to use several vocabularies:

USING: accessors assocs colors colors.hex io io.encodings.utf8
io.files io.styles kernel literals locals math math.parser
sequences xmode.catalog xmode.marker ;

We define named styles that will apply to the tokens that are parsed by the syntax highlighter. For convenience, we will use the colors.hex vocabulary to convert hexadecimal values to color tuples to build the stylesheet. In the stylesheet below, we use the same colors that are used in the pastebin.

CONSTANT: STYLES H{
    { "NULL"     H{ { foreground HEXCOLOR: 000000 } } }
    { "COMMENT1" H{ { foreground HEXCOLOR: cc0000 } } }
    { "COMMENT2" H{ { foreground HEXCOLOR: ff8400 } } }
    { "COMMENT3" H{ { foreground HEXCOLOR: 6600cc } } }
    { "COMMENT4" H{ { foreground HEXCOLOR: cc6600 } } }
    { "DIGIT"    H{ { foreground HEXCOLOR: ff0000 } } }
    { "FUNCTION" H{ { foreground HEXCOLOR: 9966ff } } }
    { "INVALID"  H{ { background HEXCOLOR: ffffcc }
                    { foreground HEXCOLOR: ff0066 } } }
    { "KEYWORD1" H{ { foreground HEXCOLOR: 006699 }
                    { font-style bold } } }
    { "KEYWORD2" H{ { foreground HEXCOLOR: 009966 }
                    { font-style bold } } }
    { "KEYWORD3" H{ { foreground HEXCOLOR: 0099ff }
                    { font-style bold } } }
    { "KEYWORD4" H{ { foreground HEXCOLOR: 66ccff }
                    { font-style bold } } }
    { "LABEL"    H{ { foreground HEXCOLOR: 02b902 } } }
    { "LITERAL1" H{ { foreground HEXCOLOR: ff00cc } } }
    { "LITERAL2" H{ { foreground HEXCOLOR: cc00cc } } }
    { "LITERAL3" H{ { foreground HEXCOLOR: 9900cc } } }
    { "LITERAL4" H{ { foreground HEXCOLOR: 6600cc } } }
    { "MARKUP"   H{ { foreground HEXCOLOR: 0000ff } } }
    { "OPERATOR" H{ { foreground HEXCOLOR: 000000 }
                    { font-style bold } } }
}
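To see how one entry of the stylesheet is used, here is a minimal sketch that looks up a style by name and prints a string with it. DEMO-STYLES is a hypothetical one-entry stand-in for the table above, so the snippet can be pasted on its own. It assumes a Factor version that ships the colors.hex vocabulary.

```factor
USING: assocs colors.hex io io.styles ;

! Hypothetical one-entry stand-in for the STYLES table above.
CONSTANT: DEMO-STYLES H{
    { "KEYWORD1" H{ { foreground HEXCOLOR: 006699 }
                    { font-style bold } } }
}

! Look up a style by name and print a string using it.
"USING:" "KEYWORD1" DEMO-STYLES at format nl

! A name that is not in the table yields f, and the string is
! printed without any styling.
"plain text" "NO-SUCH-STYLE" DEMO-STYLES at format nl
```

Passing f as the style is the same fallback the highlighter relies on for tokens that have no style.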

The xmode.catalog vocabulary provides support for looking up the type (or "mode") of the file. The xmode.marker vocabulary then provides support for converting each line into a sequence of tokens. Each token carries its text and an optional style identifier. When the identifier is present, its name is the key we look up in the stylesheet to format the output.
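As a rough sketch of that pipeline for a single line, the word below picks a mode, tokenizes the line, and prints each token's text next to its style identifier. The word show-tokens and the file name "demo.factor" are made up for illustration. I am assuming find-mode picks the Factor mode for that name; if it does not, it should fall back to a plain text mode.

```factor
USING: accessors io kernel locals prettyprint sequences
xmode.catalog xmode.marker ;

! show-tokens is a hypothetical helper, not part of xmode.
:: show-tokens ( file-name line -- )
    ! f is the initial line context. tokenize-line returns an
    ! updated context and the tokens; nip discards the context
    ! because there is only one line here.
    f line file-name line find-mode load-mode tokenize-line nip
    ! Print the token text, a space, then its id (f when absent).
    [ [ str>> pprint bl ] [ id>> . ] bi ] each ;

"demo.factor" ": square ( x -- y ) dup * ;" show-tokens
```

Each printed id is the word whose name is used as a key in the stylesheet.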

Putting that all together, we can implement a word named highlight. (the trailing period is part of the name):

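! Print each token using the style named by its id, or with no
! style when the token has no id, then end the line.
: highlight-tokens ( tokens -- )
    [
        [ str>> ] [ id>> ] bi
        [ name>> STYLES at ] [ f ] if* format
    ] each nl ;

! Tokenize and print every line. The f pushed beneath the lines is
! the initial line context, which tokenize-line threads from one
! line to the next; the final context is dropped.
: highlight-lines ( lines mode -- )
    [ f ] 2dip load-mode [
        tokenize-line highlight-tokens
    ] curry each drop ;

! Read the file, pick a mode from its path and first line, and
! print it highlighted. An empty file prints nothing.
:: highlight. ( path -- )
    path utf8 file-lines [
        path over first find-mode highlight-lines
    ] unless-empty ;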

You should be able to paste the code examples above into your listener and try it for yourself.

2 comments:

Unknown said...

This code gives an error on the last word. To begin with you have two colons beginning the definition and then the word 'path' is used both in the stack effect statement and as a word (undefined) in the main section of the definition.

I'm not sure if I'm missing something here.

mrjbq7 said...

Hi Alan,

Sorry, I forgot to "USE: locals".

I'm using "locals" which is a way to use names to refer to stack variables inside of a word definition. I thought the code would be a little easier to understand.

I updated the example. Thanks for the feedback!