Saturday, February 28, 2015

Morse Code

A couple days ago, Verizon posted a press release complaining about the FCC's recent changes to Internet regulations. Normally, I wouldn't really bother with things like this, but they posted their statement using morse code. While it would be easy enough to read their English version, I thought it would be fun to decode it using Factor.

This feels a little wonky and a little fragile, but part of that is probably not having higher level words that we can use for moving between parsed HTML and its TEXT representation.

USING: html.parser html.parser.analyzer html.parser.printer
http.client io io.streams.string kernel morse sequences
splitting wrap.strings ;

Step 1. Download the blog post and parse the HTML.

"http://publicpolicy.verizon.com/blog/entry/fccs-throwback-thursday-move-imposes-1930s-rules-on-the-internet"
http-get nip parse-html

Step 2. Extract the morse code text from the post.

"blog" find-by-class-between
"p" find-between-first
[ html-text. ] with-string-writer

Step 3. Split the morse code into words.

"  " split-subseq

Step 4. Parse each word's morse code, joining and wrapping the text.

[ morse> ] map " " join 60 wrap-string print

The result is:

today's decision by the fcc to encumber broadband internet
services with badly antiquated regulations is a radical
step that presages a time of uncertainty for consumers,
innovators and investors. over the past two decades
a bipartisan, light-touch policy approach unleashed
unprecedented investment and enabled the broadband internet
age consumers now enjoy. the fcc today chose to change
the way the commercial internet has operated since its
creation. changing a platform that has been so successful
should be done, if at all, only after careful policy
analysis, full transparency, and by the legislature, which
is constitutionally charged with determining policy. as a
result, it is likely that history will judge today's actions
as misguided. the fcc's move is especially regrettable
because it is wholly unnecessary. the fcc had targeted tools
available to preserve an open internet, but instead chose to
use this order as an excuse to adopt 300-plus pages of broad
and open-ended regulatory arcana that will have unintended
negative consequences for consumers and various parts of the
internet ecosystem for years to come. what has been and will
remain constant before, during and after the existence of
any regulations is verizon's commitment to an open internet
that provides consumers with competitive broadband choices
and internet access when, where, and how they want.

Friday, January 16, 2015

Gödel Numbering

The Programming Praxis blog has a post about Gödel Numbering which I thought would be fun to implement in Factor, showing off our support for working with prime numbers.

Spoiler alert: my solution is below.

Gödel numbering is a system that assigns a natural number to each symbol and expression in a logical system, invented in 1931 by Kurt Gödel for the proofs of his incompleteness theorems. Consider the logical system where symbols are letters and expressions are words; for instance, the word PRAXIS consists of six symbols P, R, A, X, I, and S. Gödel numbering would assign numbers to the letters, say A=1 … Z=26, then assign each letter as the exponent of the next prime, so PRAXIS would be numbered 216 × 318 × 51 × 724 × 119 × 1319 =

83838469305478699942290773046821079668485703984645720358854000640

The process is reversible; factor the Gödel number and decode the exponents.

Your task is to write functions that encode strings as Gödel numbers and decode Gödel numbers as strings.

The math.primes vocabulary provides words for checking if numbers are prime or not and working with various sequences of prime numbers. The math.primes.factors vocabulary provides ways to find the prime factors for a number. Using these, we can simply implement our conversion words:

: >gödel ( str -- n )
    [ length nprimes ] keep [ 64 - ^ ] 2map product ;

: gödel> ( n -- str )
    group-factors values [ 64 + ] "" map-as ;

Thursday, December 18, 2014

Gopher

The Gopher protocol is relatively dated now, but when it was first released in 1991, it had a number of modern features that we would later enjoy in the World Wide Web. In particular, in RFC 1436, it lists these features:

  • A file-like hierarchical arrangement that would be familiar to users.
  • A simple syntax.
  • A system that can be created quickly and inexpensively.
  • Extending the file system metaphor, such as searches.

We're going to build a simple word to let us look through Gopherspace using Factor.

Using the URLs vocabulary, we will build a tool to fetch documents from a Gopher server using a URL that looks like this:

gopher://gopher.floodgap.com/0/gopher/proxy

This specifies a host, an optional port (defaulting to 70 if not specified), and a path which includes an item type and a selector identifying the document to obtain.

Once a network connection is opened, we can retrieve the specified document by sending the selector followed by a CRLF (carriage return and line feed, ASCII bytes 13 and 10 respectively), and then reading the response:

: get-selector ( selector -- document )
   "/" split1 "" or nip write "\r\n" write flush contents ;

: gopher ( url -- document )
    >url {
        [ host>> ]
        [ port>> 70 or <inet> ascii ]
        [ path>> rest ]
    } cleave '[ _ get-selector ] with-client ;

The item type, which we are ignoring in the code above, can be used a bit like a filename extension to handle documents of different types in particular ways. Some common types that you might see:

  • 0 - plain text
  • 1 - menus
  • 9 - binary
  • s - sound
  • g - GIF images

Right now, our code assumes that all the documents we will fetch are ASCII, and it doesn't have any special handling for menus, or support for a query string that would allow using Gopher "search servers". I added some basic support for those items in the new gopher vocabulary that I committed yesterday. In addition, I built a simple Gopher browser complete with history support and ability to view GIF images in the gopher-ui vocabulary.

Here's how you would use it:

IN: scratchpad USE: gopher-ui

IN: scratchpad "gopher://gopher.floodgap.com/1"
               open-gopher-window

That will pop up a window that looks like this, with clickable links and everything:

It's neat using some of these early protocols, both because they tend to be simpler, but especially when you see that they have a passionate following. As of December 2014, Veronica-2 has indexed 150 gopher servers with over 3 million unique selectors. You can see current stats by going to:

gopher://gopher.floodgap.com/0/v2/vstat

Check it out!

Wednesday, December 3, 2014

Binary Puzzle

I've enjoyed being a subscriber to The ListServe, a mailing list where each day one subscriber wins a lottery to write to the entire list of over 24,000 subscribers. There has been a lot of life advice, stories, recipes, music and book recommendations, and even puzzles posted to the list. You can see some of the past posts on The ListServe Blog.

In today's post, someone includes a quick puzzle, which (semi-spoiler alert!) I wanted to show how to solve using Factor:

One preachy puzzle (easy to solve with the aid of the Internet, and therein lies the irony): 01001000 01110101 01101101 01100001 01101110 01110011 00100000 01100001 01110010 01100101 00100000 01101101 01101111 01110010 01100101 00100000 01110100 01101000 01100001 01101110 00100000 01100100 01100001 01110100 01100001

At first glance, it looks like binary numbers separated by spaces in a sequence with some meaning, probably some kind of sentence, probably in English, and probably ASCII encoded.

Let's try and solve it with that in mind:

"""
01001000 01110101 01101101 01100001 01101110 01110011
00100000 01100001 01110010 01100101 00100000 01101101
01101111 01110010 01100101 00100000 01110100 01101000
01100001 01101110 00100000 01100100 01100001 01110100
01100001
"""
[ blank? ] split-when harvest [ bin> ] "" map-as .

It's a neat message, but I won't spoil the answer for you.

Tuesday, December 2, 2014

Heaps

Yesterday, I committed a performance improvement to the heap implementation in Factor.

There's an interesting comment on the pypy implementation of the "heapq" module that discusses a performance optimization that takes advantage of the fact that sub-trees of the heap satisfy the heap invariant. The strategy is to reduce the number of comparisons that take place when sifting items into their proper place in the heap.

Below, I demonstrate the time it takes to run our heaps benchmark and to sort 1 million random numbers using heapsort, before and after making the change.

Before:

IN: scratchpad gc [ heaps-benchmark ] time
Running time: 0.224253523 seconds

IN: scratchpad 1,000,000 random-units gc [ heapsort drop ] time
Running time: 2.210408992 seconds

After:

IN: scratchpad gc [ heaps-benchmark ] time
Running time: 0.172660576 seconds

IN: scratchpad 1,000,000 random-units gc [ heapsort drop ] time
Running time: 1.688299185 seconds

Not a bad improvement!

Friday, November 28, 2014

Prime Sextuplets

A couple of days ago, the Riecoin project (a virtual currency and distributed computing platform) posted a press release announcing they have quietly broken the record for the largest prime number sextuplet:

A prime sextuplet consists of six prime numbers packed together as tightly as possible. For sextuplets, "as tightly as possible" means that the largest is 16 plus the smallest of the numbers.

The smallest prime sextuplet is {7, 11, 13, 17, 19, 23} and generally they take the form of a prime number N such that these six numbers are all prime: {N+0, N+4, N+6, N+10, N+12, N+16}.

It's kind of neat that you can use Factor to confirm their result:

USE: math.primes
689702036532655186685581028503873005405874329363269153979622096014346785019088707220301256048568366498602811964467654774670820091972463194208186476882699386082393716593309811371422836387527549653095824492750394092045532275098135652952423078356472379653908988713872759020566218763497459878106775183203857648413997381256598543877696056491021898353604500233203798629403923570165634119564742536549584121471881689569379964364152289494693118199337926886001843460903637314310532482306798517536171711379098711480663572269535063407688377687623951196977582998449120940358830276897328119483620011984713125859631603652231485340570118364685553782567043880668996080767
{ 0 4 6 10 12 16 } [ + ] with map [ prime? ] all? .

Factor uses an implementation of the probabilistic Miller-Rabin primality test in the math.primes.miller-rabin vocabulary, which on my laptop takes just over 3 seconds.

Saturday, November 22, 2014

Factor Tutorial

Andrea Ferretti has posted a great tutorial about Factor!

From the announcement on the mailing list:

Factor has a lot of documentation in the listener, but I have
tried to cover some topics that are present in the official
docs, but scattered throughout it, so that they were not clear
to me at the beginning.

These include for instance:

- the central concept is function composition, the stack is more
 of a detail
- how simple is to deploy program and scripts
- what tools are there: linter, inspector, unit testing support,
 reverse lookup of function uses...
- what model of multithreading and async I/O are used
- how to make use of multiple cores
- in what sense Factor has an object system
 and more

Check it out!