A few days ago, Ciprian Dorin Craciun wrote a blog post on binary-to-text encoding, covering the "state of the art and missed opportunities" in various encoding schemes. That post introduced me to the Proquint encoding, whose name stands for "PRO-nouncable QUINT-uplets".
In the Factor programming language, we have enjoyed implementing many encoding/decoding methods including: base16, base24, base32, base32hex, base32-crockford, base36, base58, base62, base64, base85, base91, uu, and many others. I thought it would be fun to add a quick implementation of Proquint.
Like other encodings, it makes use of an alphabet — grouped as consonants and vowels:
CONSTANT: consonant "bdfghjklmnprstvz"
CONSTANT: vowel "aiou"
Each 16-bit number is encoded as a 5-character block of alternating consonants and vowels, where each of the three consonants represents 4 bits and each of the two vowels represents 2 bits:
: >quint16 ( m -- str )
    5 [
        even? [
            [ -4 shift ] [ 4 bits consonant nth ] bi
        ] [
            [ -2 shift ] [ 2 bits vowel nth ] bi
        ] if
    ] "" map-integers-as reverse nip ;
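To make the bit layout concrete, here is a small listener sketch. It assumes the definitions above have already been entered in the listener, and the sample values are just illustrative inputs worked out by hand from the two alphabets.

```factor
USING: prettyprint ;

! 0x7f00 is 0111 11 1100 00 0000 when split into 4-2-4-2-4 bit fields.
! Those fields are 7, 3, 12, 0, 0, which index the alphabets as l, u, s, a, b.
0x7f00 >quint16 .    ! "lusab"

! All bits clear selects the first consonant and vowel every time.
0x0000 >quint16 .    ! "babab"

! All bits set selects the last consonant and vowel every time.
0xffff >quint16 .    ! "zuzuz"
```

Because the word peels fields off the low end of the number first, the characters are collected in reverse order, which is why the definition ends with a reverse.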
A 32-bit number is encoded by joining two 16-bit blocks with a hyphen:
: >quint32 ( m -- str )
    [ -16 shift ] keep [ 16 bits >quint16 ] bi@ "-" glue ;
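As a rough illustration of how the two halves are split and joined, the following listener sketch encodes each half separately and then the whole number. It assumes the definitions above are in scope, and uses 0x7f000001 (the address 127.0.0.1 packed into 32 bits) as the sample value.

```factor
USING: math math.bitwise prettyprint ;

! High 16 bits: shifting right by 16 leaves 0x7f00.
0x7f000001 -16 shift >quint16 .    ! "lusab"

! Low 16 bits: masking with 16 bits leaves 0x0001.
0x7f000001 16 bits >quint16 .      ! "babad"

! The full word encodes both halves and glues them with "-".
0x7f000001 >quint32 .              ! "lusab-babad"
```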
Decoding looks up each consonant or vowel, shifting its bits into the result, and skips the separators:
: quint> ( str -- m )
    0 [
        dup $[ consonant alphabet-inverse ] nth [
            nip [ 4 shift ] [ + ] bi*
        ] [
            dup $[ vowel alphabet-inverse ] nth [
                nip [ 2 shift ] [ + ] bi*
            ] [
                CHAR: - assert=
            ] if*
        ] if*
    ] reduce ;
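A short listener sketch of decoding, again assuming the definitions above are in scope. Since the word simply folds over the characters, it should accept a single 16-bit block as well as a hyphenated pair.

```factor
USING: prettyprint ;

! One block: l=7, u=3, s=12, a=0, b=0 accumulate to 0x7f00.
"lusab" quint> .          ! 32512

! Two blocks: the "-" contributes no bits, so the result is 0x7f000001.
"lusab-babad" quint> .    ! 2130706433
```

Any character that is neither a consonant, a vowel, nor "-" trips the assert= in the final branch, so malformed input fails instead of decoding to a wrong number.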
We can use this to make a random password that might be more memorable, although 32 bits is not much entropy, so a real password would be more secure with more random bits:
: quint-password ( -- quint )
    32 random-bits >quint32 ;
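One way to get more bits, sketched here under the assumption that the definitions above are in scope, is to join several 32-bit groups. The word name quint-password64 is hypothetical and not part of the post's code.

```factor
USING: kernel random sequences ;

! Hypothetical helper: two independent 32-bit groups joined with "-",
! giving 64 random bits in four pronounceable blocks.
: quint-password64 ( -- quint )
    2 [ 32 random-bits >quint32 ] replicate "-" join ;
```

The result is always 23 characters long: four 5-letter blocks and three hyphens. For real passwords it is probably worth running this inside with-secure-random from the random vocabulary, so that the bits come from the system's secure generator instead of the default one.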
And we could use our ip-parser vocabulary to make IPv4 addresses more memorable:
: ipv4>quint ( ipv4 -- str )
    ipv4-aton >quint32 ;

: quint>ipv4 ( str -- ipv4 )
    quint> ipv4-ntoa ;
We can check that this works with a test suite showing that the conversions round-trip in both directions:
{ t } [
    {
        { "127.0.0.1" "lusab-babad" }
        { "63.84.220.193" "gutih-tugad" }
        { "63.118.7.35" "gutuk-bisog" }
        { "140.98.193.141" "mudof-sakat" }
        { "64.255.6.200" "haguz-biram" }
        { "128.30.52.45" "mabiv-gibot" }
        { "147.67.119.2" "natag-lisaf" }
        { "212.58.253.68" "tibup-zujah" }
        { "216.35.68.215" "tobog-higil" }
        { "216.68.232.21" "todah-vobij" }
        { "198.81.129.136" "sinid-makam" }
        { "12.110.110.204" "budov-kuras" }
    } [
        [ quint>ipv4 = ] [ swap ipv4>quint = ] 2bi and
    ] assoc-all?
] unit-test
This is now available as the proquint vocabulary in a recent nightly build.