Brad Fitzpatrick wrote a Go package called
latlong which efficiently maps a
latitude/longitude to a timezone. The original
post
describing it was on Google+ and is likely lost
forever — unless it made it into the Google+
archive before Google+
joined the Google Graveyard.
It tries to have a small binary size (~360 KB), low memory footprint
(~1 MB), and incredibly fast lookups (~0.5 microseconds). It does not
try to be perfectly accurate when very close to borders.
It’s available in other languages, too!
Huon Wilson ported the library to the Rust
Programming Language, making the code available
on GitHub and installable via
Cargo. There is even a wrapper made for
NodeJs that is installable via
NPM that uses a command-line
executable written in Go.
When it was announced in 2015, I had ported the library to
Factor, but missed the opportunity to blog about
it. Below we discuss some details about the implementation, starting with
its use of a shapefile of the TZ timezones of the
world to divide the world into zones that
are assigned timezone values — looking something like this:
The world is divided into 6 zoom levels of tiles (represented by a key and
an index value) that allow us to search from a very large area first, then
down to the more specific geographic area. Note: we represent the struct as
a big endian struct with
structure packing to minimize
wasted space in the files.
The zoom levels are then cached using literal
syntax into a
zoom-levels
constant.
BE-PACKED-STRUCT: tile
{ key uint }
{ idx ushort } ;
SPECIALIZED-ARRAY: tile
CONSTANT: zoom-levels $[
6 <iota> [
number>string
"vocab:geo-tz/zoom" ".dat" surround
binary file-contents tile cast-array
] map
]
Each of the zoom levels reference indexes into a leaves data structure
that contains 14,110 items — each represented by one of three data types:
- Type
S
is a string.
- Type
2
is a one bit tile.
- Type
P
is a pixmap thats 128 bytes long.
These we load and cache into a unique-leaves
constant.
CONSTANT: #leaves 14110
BE-PACKED-STRUCT: one-bit-tile
{ idx0 ushort }
{ idx1 ushort }
{ bits ulonglong } ;
CONSTANT: unique-leaves $[
"vocab:geo-tz/leaves.dat" binary [
#leaves [
read1 {
{ CHAR: S [ { 0 } read-until drop utf8 decode ] }
{ CHAR: 2 [ one-bit-tile read-struct ] }
{ CHAR: P [ 128 read ] }
} case
] replicate
] with-file-reader
]
The core logic involves looking up a leaf (which is one of three types,
loaded above), given an (x, y)
coordinate. If it is a string
type,
we are done. If it is a one-bit-tile
, we defer to the appropriate leaf
specified by idx0
or idx1
. And if it is pixmap, we have a smidge
more logic to detect oceans or defer again to a different leaf.
CONSTANT: ocean-index 0xffff
GENERIC#: lookup-leaf 2 ( leaf x y -- zone/f )
M: string lookup-leaf 2drop ;
M:: one-bit-tile lookup-leaf ( leaf x y -- zone/f )
leaf bits>> y 3 bits 3 shift x 3 bits bitor bit?
[ leaf idx1>> ] [ leaf idx0>> ] if
unique-leaves nth x y lookup-leaf ;
M:: byte-array lookup-leaf ( leaf x y -- zone/f )
y 3 bits 3 shift x 3 bits bitor 2 * :> i
i leaf nth 8 shift i 1 + leaf nth +
dup ocean-index = [ drop f ] [
unique-leaves nth x y lookup-leaf
] if ;
We’re almost done! Given a zoom level, a tile-key
helps us find a
specific tile that we then can lookup the leaf for, hopefully finding the
timezone
associated with the coordinate.
:: lookup-zoom-level ( zoom-level x y tile-key -- zone/f )
zoom-level [ key>> tile-key >=< ] search swap [
dup key>> tile-key = [
idx>> unique-leaves nth x y lookup-leaf
] [ drop f ] if
] [ drop f ] if ;
Each coordinate is effectively a
pixel in the image, so our logic
searches from the outermost zoom level to the innermost, trying to lookup
a timezone in each one using the coordinate and level as a tile-key
.
:: tile-key ( x y level -- tile-key )
level dup 3 + neg :> n
y x [ n shift 14 bits ] bi@
{ 0 14 28 } bitfield ;
:: lookup-pixel ( x y -- zone )
6 <iota> [| level |
level zoom-levels nth
x y 2dup level tile-key
lookup-zoom-level
] map-find-last drop ;
Finally, we have enough to implement our public API, converting a given
latitude/longitude coordinate to a pixel value, deferring to the word we
just defined above.
CONSTANT: deg-pixels 32
:: lookup-zone ( lat lon -- zone )
lon 180 + deg-pixels * 0 360 deg-pixels * 1 - clamp
90 lat - deg-pixels * 0 180 deg-pixels * 1 - clamp
[ >integer ] bi@ lookup-pixel ;
And then a couple of test cases to show it’s working:
{ "America/Los_Angeles" } [ 37.7833 -122.4167 lookup-zone ] unit-test
{ "Australia/Sydney" } [ -33.8885 151.1908 lookup-zone ] unit-test
Performance is pretty good, we can generate over 3 million lookups per
second, putting our cost per lookup around 0.33 microseconds. And all of
that in less than 70 lines of code.
This is available on my
GitHub.