Friday, March 3, 2023

Short UUID

The shortuuid project is a “simple python library that generates concise, unambiguous, URL-safe UUIDs”. I thought it would be a fun exercise to implement this in Factor.

What is a “short UUID”?

You can read the original announcement, but basically it is a string representation of a number using a reduced alphabet that can be used in places like URLs where conciseness is desirable. The author mentions that it provides security by “not divulging information (such as how many rows there are in that particular table, the time difference between one item and the next, etc.)”. However, I think it is more security through obscurity than real security.

In any event, the alphabet consists of these 57 characters:

CONSTANT: alphabet
"23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
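The "unambiguous" part of the description comes from what the alphabet leaves out: the look-alike characters 0, 1, I, O and l. A quick sanity check in the listener might look like the sketch below. The vocabulary name shortuuid-sketch is only a placeholder for illustration, not the name used in the actual implementation.

```factor
USING: prettyprint sequences sets ;
IN: shortuuid-sketch

CONSTANT: alphabet
"23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

! The alphabet has 57 distinct characters.
alphabet length .                        ! 57
alphabet all-unique? .                   ! t

! None of the easily confused characters appear in it.
"01IOl" [ alphabet member? ] any? .      ! f
```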

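We encode a numeric input by repeatedly applying /mod (“divmod”) with the alphabet length, using each remainder as an index into the alphabet, until the number reaches zero. The characters come out least significant first, so we reverse the result at the end.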

: encode-uuid ( uuid -- shortuuid )
    [ dup 0 > ] [
        alphabet [ length /mod ] [ nth ] bi
    ] "" produce-as nip reverse ;

We decode using the reverse process: for each character in the shortuuid, we look up its position in the alphabet, multiply the running total by the alphabet length, and add the position.

: decode-uuid ( shortuuid -- uuid )
    0 [
        alphabet index [ alphabet length * ] dip +
    ] reduce ;
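To see the two words working together, here is a small self-contained sketch that repeats the definitions and tries a few values. The vocabulary name shortuuid-sketch and the helper round-trips? are made up for this example. Note that these simple versions do no padding, so 0 encodes to the empty string, while the largest 128-bit value needs 22 characters.

```factor
USING: kernel math prettyprint sequences ;
IN: shortuuid-sketch

CONSTANT: alphabet
"23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

: encode-uuid ( uuid -- shortuuid )
    [ dup 0 > ] [
        alphabet [ length /mod ] [ nth ] bi
    ] "" produce-as nip reverse ;

: decode-uuid ( shortuuid -- uuid )
    0 [
        alphabet index [ alphabet length * ] dip +
    ] reduce ;

! Hypothetical helper: does a number survive encode then decode?
: round-trips? ( n -- ? )
    dup encode-uuid decode-uuid = ;

! 57 = 1 * 57 + 0, so the digits are index 1 ("3"), then index 0 ("2").
57 encode-uuid .                     ! "32"
"32" decode-uuid .                   ! 57

! Zero never enters the loop, so nothing is produced.
0 encode-uuid .                      ! ""

! The largest 128-bit value, 2^128 - 1.
128 2^ 1 - encode-uuid length .      ! 22
128 2^ 1 - round-trips? .            ! t
```

Since the first alphabet character "2" stands for the digit zero, a leading "2" does not change the decoded value ("232" also decodes to 57).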

This is available on my GitHub, including features to deal with legacy values generated before version 1.0.0 as well as supporting different alphabets being used.
