A few hours ago, Zed Shaw tweeted about an experiment creating typed netstrings:
An experiment in tagging netstrings with the types of their data as an alternative to JSON: http://codepad.org/xct0E5ac
I thought that an implementation in Factor would contrast nicely with Zed's Python version. The basic idea is to support four kinds of data types:
- Text
- Numbers
- Lists
- Dictionaries (e.g., maps or associations)
First, some imports:
USING: arrays combinators formatting hashtables kernel math.parser sequences splitting ;
An encoded payload value looks something like "{LENGTH}:{PAYLOAD}{TYPE}
". We can write a simple word to parse a string that looks like that into its parts:
: parse-payload ( data -- remain payload payload-type ) ":" split1 swap string>number cut unclip swapd ;
We can build a simple parser for these "typed netstrings" (deferring for the moment, how we parse lists and dictionaries):
DEFER: parse-dict DEFER: parse-list : parse-tnetstring ( data -- remain value ) parse-payload { { CHAR: # [ string>number ] } { CHAR: " [ ] } { CHAR: } [ parse-dict ] } { CHAR: ] [ parse-list ] } [ "Invalid payload type: %c" sprintf throw ] } case ;
Parsing lists is just repeatedly parsing values until the remainder is exhausted:
: parse-list ( data -- value ) [ { } ] [ [ dup empty? not ] [ parse-tnetstring ] produce nip ] if-empty ;
Parsing dictionaries is only a little more involved. We need a way to parse successive key/value pairs, checking some simple error conditions:
: parse-pair ( data -- extra value key ) parse-tnetstring [ dup [ "Unbalanced dictionary store" throw ] unless parse-tnetstring dup [ "Invalid value, null not allowed" throw ] unless ] dip ;
Then we can build the dictionary, repeatedly parsing key/value pairs:
: parse-dict ( data -- value ) [ H{ } ] [ [ dup empty? not ] [ parse-pair swap 2array ] produce nip >hashtable ] if-empty ;
And, to make the interface easy to use, we wrap our parse-tnetstring
word, checking that there was no un-parsed remainder:
: tnetstring ( data -- value ) parse-tnetstring swap [ "Had trailing junk: %s" sprintf throw ] unless-empty ;
We can show that it works on one of Zed's more complex examples:
( scratchpad ) "34:5:hello\"22:11:12345678901#4:this\"]}" tnetstring . H{ { "hello" { 12345678901 "this" } } }
The code (and some tests) for this is on Github.
Update: I added support for booleans and implemented writer words to match the reader words above. Everything is on my Github.
Hi John,
ReplyDeleteVery new to Factor and really enjoying your blog.
Reading through this code, shouldn't
dup [ "Unbalanced dictionary store" throw ] unless
be
dup empty? [ "Unbalanced dictionary store" throw ] when
in parse-pair? Also, it seems the null check which follows immediately afterwards would only be relevant in the version of the parser extended to support nulls.
Hi Justin!
ReplyDeleteI think you're right on both counts. On the first, the remain would be empty (not false), so the check should be "when-empty".
Best,
John.