A few hours ago, Zed Shaw tweeted about an experiment creating typed netstrings:
An experiment in tagging netstrings with the types of their data as an alternative to JSON: http://codepad.org/xct0E5ac
I thought that an implementation in Factor would contrast nicely with Zed's Python version. The basic idea is to support four kinds of data types:
- Text
- Numbers
- Lists
- Dictionaries (i.e., maps or associations)
First, some imports:
USING: arrays combinators formatting hashtables kernel math.parser sequences splitting ;
An encoded payload value looks something like "{LENGTH}:{PAYLOAD}{TYPE}". We can write a simple word to parse a string that looks like that into its parts:
: parse-payload ( data -- remain payload payload-type )
    ":" split1 swap string>number cut unclip swapd ;
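To make the stack shuffling concrete, here is a small sketch that runs parse-payload on a made-up input holding two encoded strings. The input string is an assumption chosen for illustration, and the IN: line is only there so the snippet also loads from a file:

```factor
USING: arrays kernel math.parser prettyprint sequences splitting ;
IN: scratchpad

: parse-payload ( data -- remain payload payload-type )
    ":" split1 swap string>number cut unclip swapd ;

! "5" is the length, "hello" the payload, and the quote
! character that follows is the type tag (code point 34).
! Everything after the tag is left as the remainder.
"5:hello\"4:this\"" parse-payload 3array .
```

The three results are collected into an array only so they can be printed together: the remainder comes first, then the payload, then the type tag as a character code.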
We can build a simple parser for these "typed netstrings" (deferring, for the moment, how we parse lists and dictionaries):
DEFER: parse-dict
DEFER: parse-list

: parse-tnetstring ( data -- remain value )
    parse-payload {
        { CHAR: # [ string>number ] }
        { CHAR: " [ ] }
        { CHAR: } [ parse-dict ] }
        { CHAR: ] [ parse-list ] }
        [ "Invalid payload type: %c" sprintf throw ]
    } case ;
Parsing lists is just repeatedly parsing values until the remainder is exhausted:
: parse-list ( data -- value )
    [ { } ] [
        [ dup empty? not ] [ parse-tnetstring ] produce nip
    ] if-empty ;
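The "parse until the remainder is exhausted" loop is the produce combinator at work. As a sketch of the same pattern in isolation, the hypothetical word chunk-pairs below (not part of the parser) peels two characters at a time off a string until nothing is left:

```factor
USING: kernel prettyprint sequences ;
IN: scratchpad

! Hypothetical word, for illustration only.
! The first quotation decides whether to keep going; the second
! leaves the new remainder on the stack with the item to collect
! on top of it. Assumes the input has an even length.
: chunk-pairs ( str -- seq )
    [ dup empty? not ] [ 2 cut swap ] produce nip ;

"aabbcc" chunk-pairs .
```

Here `2 cut swap` plays the role that parse-tnetstring plays in parse-list: it consumes a prefix and returns the rest along with one value. The final nip drops the (now empty) remainder.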
Parsing dictionaries is only a little more involved. We need a way to parse successive key/value pairs, checking some simple error conditions:
: parse-pair ( data -- extra value key )
    parse-tnetstring [
        dup [ "Unbalanced dictionary store" throw ] unless
        parse-tnetstring
        dup [ "Invalid value, null not allowed" throw ] unless
    ] dip ;
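One thing to keep in mind with checks like these: in Factor only f is false, so any string, even an empty one, passes a bare truth test. A minimal sketch of an explicit emptiness check follows, using a hypothetical helper word check-remain that is not part of the code above:

```factor
USING: kernel prettyprint sequences ;
IN: scratchpad

! Hypothetical helper: throw if nothing is left to parse
! after a key, otherwise pass the remainder through.
: check-remain ( remain -- remain )
    dup empty? [ "Unbalanced dictionary store" throw ] when ;

! An empty string still takes the "true" branch.
"" [ "true" ] [ "false" ] if .

! A non-empty remainder is returned unchanged.
"4:this\"" check-remain .
```

Calling check-remain on an empty string throws, which is the behavior the "Unbalanced dictionary store" message suggests is wanted.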
Then we can build the dictionary, repeatedly parsing key/value pairs:
: parse-dict ( data -- value )
    [ H{ } ] [
        [ dup empty? not ] [ parse-pair swap 2array ] produce
        nip >hashtable
    ] if-empty ;
And, to make the interface easy to use, we wrap our parse-tnetstring word, checking that there was no un-parsed remainder:
: tnetstring ( data -- value )
    parse-tnetstring swap
    [ "Had trailing junk: %s" sprintf throw ] unless-empty ;
We can show that it works on one of Zed's more complex examples:
( scratchpad ) "34:5:hello\"22:11:12345678901#4:this\"]}" tnetstring .
H{ { "hello" { 12345678901 "this" } } }
The code (and some tests) for this is on GitHub.
Update: I added support for booleans and implemented writer words to match the reader words above. Everything is on my GitHub.
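For a rough idea of what writer words could look like, here is a minimal sketch covering only text and numbers. The names write-text and write-number are hypothetical and this is not the implementation in the repository. It also assumes ASCII payloads, since length counts characters rather than bytes:

```factor
USING: formatting kernel math.parser prettyprint sequences ;
IN: scratchpad

! Hypothetical writer words, for illustration only.
! Each one emits {LENGTH}:{PAYLOAD}{TYPE} for its payload.
: write-text ( str -- tnet )
    [ length ] keep "%d:%s\"" sprintf ;

: write-number ( n -- tnet )
    number>string [ length ] keep "%d:%s#" sprintf ;

"hello" write-text .
12345678901 write-number .
```

The two results correspond to the `5:hello"` and `11:12345678901#` fragments in the example parsed above.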
2 comments:
Hi John,
Very new to Factor and really enjoying your blog.
Reading through this code, shouldn't
dup [ "Unbalanced dictionary store" throw ] unless
be
dup empty? [ "Unbalanced dictionary store" throw ] when
in parse-pair? Also, it seems the null check which follows immediately afterwards would only be relevant in the version of the parser extended to support nulls.
Hi Justin!
I think you're right on both counts. On the first, the remain would be empty (not false), so the check should be "when-empty".
Best,
John.