Re: Factor: February 2011

Monday, February 21, 2011

Gestalt

Sometimes it is useful to lookup the operating system version that your code is running on. On the Mac OS X, this information can be retrieved in several ways. We will be adding Factor support for using the Gestalt function (available as part of the Gestalt Manager in the CoreServices.framework since Mac OS X 10.0).

First, we create a namespace and list of vocabularies we will be using:

USING: alien.data alien.syntax combinators core-foundation
formatting io.binary kernel math ;

IN: gestalt

Using Factor's C library interface, we can declare the function and its types:

TYPEDEF: SInt16 OSErr

TYPEDEF: UInt32 OSType

FUNCTION: OSErr Gestalt ( OSType selector, SInt32* response ) ;

Using this, we can create the gestalt word, which calls the function, checks for errors, and returns the result:

: gestalt ( selector -- response )
    { SInt32 } [ Gestalt 0 assert= ] with-out-parameters ;

Looking at Gestalt.h, we can see several enum definitions that can be used to retrieve system version information:

enum {
  gestaltSystemVersion          = 'sysv',
  gestaltSystemVersionMajor     = 'sys1',
  gestaltSystemVersionMinor     = 'sys2',
  gestaltSystemVersionBugFix    = 'sys3'
};

Knowing these, we can create words for accessing the various system versions exposed by Gestalt.

: system-version ( -- n ) "sysv" be> gestalt ;

: system-version-major ( -- n ) "sys1" be> gestalt ;

: system-version-minor ( -- n ) "sys2" be> gestalt ;

: system-version-bugfix ( -- n ) "sys3" be> gestalt ;

However, we see a comment in Gestalt.h which says:

If the values of the minor or bug fix revision are larger than 9, then gestaltSystemVersion will substitute the value 9 for them. For example, Mac OS X 10.3.15 will be returned as 0x1039, and Mac OS X 10.10.5 will return 0x1095. A better way to get version information on Mac OS X would be to use the new gestaltSystemVersionMajor, gestaltSystemVersionMinor, and gestaltSystemVersionBugFix selectors, which don't have arbitrary limits on the values returned.

With this limitation, we could still use the system-version to retrieve Apple's "code name" for the version of Mac OS X you are running (since it works for all of the released Mac OS X versions):

: system-code-name ( -- str )
    system-version HEX: FFF0 bitand {
        { HEX: 1070 [ "Lion"         ] }
        { HEX: 1060 [ "Snow Leopard" ] }
        { HEX: 1050 [ "Leopard"      ] }
        { HEX: 1040 [ "Tiger"        ] }
        { HEX: 1030 [ "Panther"      ] }
        { HEX: 1020 [ "Jaguar"       ] }
        { HEX: 1010 [ "Puma"         ] }
        { HEX: 1000 [ "Cheetah"      ] }
        [ drop "Unknown" ]
    } case ;

And then, we can make a version string using each of the major, minor, and bugfix values:

: system-version-string ( -- str )
    system-version-major
    system-version-minor
    system-version-bugfix
    "%s.%s.%s" sprintf ;

Using this, we can see which version of Mac OS X is running on my laptop:

( scratchpad ) system-version-string system-code-name 
              "%s (%s)\n" printf
10.6.6 (Snow Leopard)

By the way, another comment in Gestalt.h says:

If you want to know the product build version string, product name, or the user visible version string you should read in the system version information from the file /System/Library/CoreServices/SystemVersion.plist.

Using the cocoa.plist vocabulary, we can do just that:

( scratchpad ) USE: cocoa.plist

( scratchpad ) "/System/Library/CoreServices/SystemVersion.plist"
               read-plist .
H{
    { "ProductVersion" "10.6.6" }
    { "ProductName" "Mac OS X" }
    { "ProductBuildVersion" "10J567" }
    { "ProductUserVisibleVersion" "10.6.6" }
    { "ProductCopyright" "1983-2011 Apple Inc." }
}

The code for this is on my Github.

Thursday, February 17, 2011

Port Knocking

As a followup to my last article, I wanted to show how to implement port knocking functionality in Factor. This was particularly interesting to me since I ran across a recent blog post about the "inventor" of port knocking.

"port knocking is a method of externally opening ports on a firewall by generating a connection attempt on a set of prespecified closed ports"

There are several types of port knocking, with variations using any combination of TCP, UDP, ICMP, or similar network protocols. For this implementation, we will choose to use TCP. In particular, using the open-port? word from our port-scan vocabulary. As a reminder, it looks like this:

USING: continuations io.encodings.binary io.sockets kernel ;

: open-port? ( host port -- ? )
    <inet> [ binary [ t ] with-client ] [ 2drop f ] recover ;

The idea is to knock on a sequence of ports, in order. In its simplest form, we attempt a connection to each port, ignoring whether it is open or closed (not relevant).

: knock-ports ( host ports -- )
    [ dupd open-port? drop ] each drop ;

You can imagine using it something like this (knocking on ports 10000, 10001, 10002 and then write a "Hello" message to port 12345):

( scratchpad ) "localhost"
               [ { 10000 10001 10002 } knock-ports ]
               [ 12345 <inet> ascii [ "Hello" print ] with-client ]
               bi

Some ways to improve this might be to allow "knock delay" between successive knocks, support UDP or ICMP knocks, or perhaps to encompass this functionality transparently into a reusable with-knocking-client word.

Tuesday, February 15, 2011

Port Scanner

For fun, I thought I'd show how to build a simple port scanner using Factor

We start by listing vocabularies that we will be using and a namespace for our work:

USING: continuations formatting kernel math.ranges
io.encodings.binary io.sockets make sequences ;

IN: port-scan

We need to decide how to check for an open port. In this case, we are going to attempt to establish a TCP connection. If it succeeds, then the port is open, otherwise it will throw an error connecting and we will assume the port was not open.

: open-port? ( host port -- ? )
    <inet> [ binary [ t ] with-client ] [ 2drop f ] recover ;

Note: we should be setting a connection timeout, so that we do not let a connection attempt last forever. However, I'm not quite sure how to do that -- the documentation for io.sockets and io.timeouts didn't make it obvious.

Next, we will make a word that returns an array of all open ports (checking ports from 1 to 1024). We use the make vocabulary to build the sequence dynamically.

: open-ports ( host -- seq )
    1024 [1,b] [
        [ 2dup open-port? [ , ] [ drop ] if ] each drop
    ] { } make ;

Finally, we can make a word that provides some visual output back to the user:

: scan-ports ( host -- )
    [ "Scanning %s...\n" printf ]
    [ open-ports [ "%d is open\n" printf ] each ]
    bi ;

Using this on my laptop returns the following:

( scratchpad ) "127.0.0.1" scan-ports
Scanning 127.0.0.1...
631 is open

It is quite simple and functional as is. However, some obvious improvements could be made:

adding the connection timeout as mentioned above
providing the output of scan-ports to the user as open ports are found
using the concurrent combinators to test ports in parallel
using a list of port numbers to identify services that might be on open ports

The code for this is on my Github.

Saturday, February 12, 2011

Simple RPG

There was a post a few days ago that discussed "fluent interfaces" using a simple role-playing game as an example. Since Factor, as a concatenative language, can be quite "fluent", I decided to port the original C# version into Factor to see how it looks.

The basic "simple RPG" game starts with the creation of a hero, who encounters and does battle with various randomly generated enemies. It is text-based and looks something like this:

= Starting Battle =
Name: Valient
Class: fighter
HP: 20/20
Age: 22
Str 18 / Agi 14 / Int 12
Gold: 0

vs.

Name: Destroicous
Class: fighter
HP: 8/8
Age: 107
Str 7 / Agi 6 / Int 18
Gold: 42

An enemy approaches> 


Valient (20/20) / Destroicous (8/8)
Valient poisons Destroicous for 6 damage!
Destroicous incinerates Valient for 3 damage!

Some vocabularies we will be using:

USING: accessors combinators formatting io kernel locals math
math.ranges random sequences ;

Object Creation

The original article focuses mainly on object creation, which I'll only mention in passing. The "fluent" version in C# looks like this:

characterBuilder.Create("King")
   .As(ClassType.Fighter)
   .WithAge(49)
   .HP(50)
   .Strength(17)
   .Agility(12)
   .Intelligence(15)
   .Gold(9999999);

Translating that directly to Factor (using slot accessors to construct objects), would look something like this :

character new
    "King" >>name
    "fighter" >>class
    49 >>age
    50 >>hp
    17 >>strength
    12 >>agility
    15 >>intelligence
    9999999 >>gold

The Character

We define a character type with some basic fields (using short names that are common among RPG programmers) to represent either our hero, or his enemies:

TUPLE: character name class age str agi int gold hp max-hp ;

We will list several possible character classes (although the main logic will not take these into account):

CONSTANT: classes {
    "fighter"
    "mage"
    "cleric"
    "rogue"
}

A word to check if a character is alive:

: alive? ( character -- ? ) hp>> 0 > ;

We can print out our character's full stats:

: full-stats ( character -- )
    {
        [ name>> "Name: %s\n" printf ]
        [ class>> "Class: %s\n" printf ]
        [ [ hp>> ] [ max-hp>> ] bi "HP: %d/%d\n" printf ]
        [ age>> "Age: %d\n" printf ]
        [
            [ str>> ] [ agi>> ] [ int>> ] tri
            "Str %d / Agi %d / Int %d\n" printf
        ]
        [ gold>> "Gold: %d\n" printf ]
    } cleave ;

Or print just a "quick" version:

: quick-stats ( character -- )
    [ name>> ] [ hp>> ] [ max-hp>> ] tri "%s (%d/%d)" printf ;

The Battle

Note: Our battle logic is implemented with locals to try and match the C# version closely.

We support some random attack types:

CONSTANT: attack-verbs {
    "slashes"
    "stabs"
    "smashes"
    "impales"
    "poisons"
    "shoots"
    "incinerates"
    "destroys"
}

The attack logic is very simple - a random amount of damage using a random attack type:

:: attack ( attacker defender -- )
    10 random :> damage
    attacker name>>
    attack-verbs random
    defender [ damage - ] change-hp name>>
    damage
    "%s %s %s for %d damage!\n" printf ;

The main battle logic starts a battle, loops performing a "fight to the death", and then declares our hero as victor or victim:

:: battle ( hero enemy -- )
    "= Starting Battle =" print
    hero full-stats nl
    "vs." print nl
    enemy full-stats nl
    "An enemy approaches> " write read1 drop nl nl

    [ hero alive? enemy alive? and ] [
        hero quick-stats " / " write enemy quick-stats nl
        hero enemy attack
        enemy hero attack
        hero alive? [ "> " write read1 drop ] when nl
    ] while

    hero alive? [
        "Our hero survives to fight another battle! " write
        enemy gold>> "Won %d gold!\n" printf
        hero [ enemy gold>> + ] change-gold drop
    ] [
        hero gold>> "Our hero has fallen with %d gold! " printf
        "The world is covered in darkness once again." print
    ] if nl ;

The Game

We create "Valient", our hero:

: <hero> ( -- character )
    character new
        "Valient" >>name
        "fighter" >>class
        22 >>age
        20 [ >>hp ] [ >>max-hp ] bi
        18 >>str
        14 >>agi
        12 >>int
        0 >>gold ;

Some logic to create random enemy names:

CONSTANT: first-names {
    "Destro"
    "Victo"
    "Mozri"
    "Fang"
    "Ovi"
    "Hell"
    "Syth"
    "End"
}

CONSTANT: last-names {
    "math"
    "rin"
    "sith"
    "icous"
    "ravage"
    "wrath"
    "ryn"
    "less"
}

: random-name ( -- str )
    first-names last-names [ random ] bi@ append ;

And finally, create a random enemy to fight:

: <enemy> ( -- character )
    character new
        random-name >>name
        classes random >>class
        12 200 [a,b] random >>age
        5 12 [a,b] random [ >>hp ] [ >>max-hp ] bi
        21 [1,b) random >>str
        21 [1,b) random >>agi
        21 [1,b) random >>int
        50 random >>gold ;

Using these words, and the battle logic created earlier, we can run the entire game:

: run-battle ( -- )
    <hero> [ dup alive? ] [ dup <enemy> battle ] while drop ;

The code for this is on my Github.

Thursday, February 10, 2011

Ping? Pong!

A few months ago, I implemented the "ping" utility in Factor. Doug Coleman helped get it working on Windows. Below I'm going to describe how it works.

First, we needed to add ICMP support in Factor (in the io.sockets.icmp vocabulary). Some things this required:

Previously, Factor had inet4 and inet6 types representing a "host/port" tuple in IPv4 and IPv6, respectively. ICMP is similar in that it requires a "host", but the "port" is unnecessary. With Slava's help, we factored out ipv4 and ipv6 types from inet4 and inet6 and used them to create the icmp4 and icmp6 address types.
We needed a generic word protocol that can be used to specify the correct protocol value to be used when creating a socket.
We also needed an implementation of the internet checksum, which I had previously provided (in the checksums.internet vocabulary).

The ping implementation starts with imports and a namespace:

USING: accessors byte-arrays calendar checksums
checksums.internet combinators combinators.smart continuations
destructors io.sockets io.sockets.icmp io.timeouts kernel
locals pack random sequences system ;

IN: ping

In RFC 792, the ICMP "Echo" and "Echo Reply" packets are described. Both have the same form (except that "Echo" is type 8 and "Echo Reply" is type 0). We define an echo tuple to represent both types. The <echo> constructor is then used to create an "Echo" (type 8) with a random identifier (some ping implementations use the process id instead).

TUPLE: echo type identifier sequence data ;

: <echo> ( sequence data -- echo )
    [ 8 16 random-bits ] 2dip echo boa ;

Since both echo types have the same form, so we can use the packet description in the RFC to create words to convert between echo's and byte-array's. We use the internet checksum when encoding and before decoding to verify the response.

: echo>byte-array ( echo -- byte-array )
    [
        [
            [ type>> 0 0 ] ! code checksum
            [ identifier>> ]
            [ sequence>> ] tri
        ] output>array "CCSSS" pack-be
    ] [ data>> ] bi append [
        internet checksum-bytes 2 4
    ] keep replace-slice ;

: byte-array>echo ( byte-array -- echo )
    dup internet checksum-bytes B{ 0 0 } assert=
    8 cut [
        "CCSSS" unpack-be { 0 3 4 } swap nths first3
    ] dip echo boa ;

Sending a ping is just creating an echo request, encoding it into a byte-array and sending it to the specified address.

: send-ping ( addr raw -- )
    [ 0 { } <echo> echo>byte-array ] 2dip send ;

Note: We should be incrementing the sequence properly, instead of always sending zero here -- and then using it to verify the reply packets.

Receiving a ping is just reading packets until we have one from the specified address.

:: recv-ping ( addr raw -- echo )
    raw receive addr = [
        20 tail byte-array>echo
    ] [
        drop addr raw recv-ping
    ] if ;

Note: There's a subtle bug where we set a "read timeout" on the socket, but if we keep getting packets from the wrong IP, then we will loop without timing out properly.

Normally ICMP can only be used with "raw" sockets which require root (administrative) privileges to create. This is often implemented by setting the setuid flag on the ping executable. However, on BSD systems (like Mac OS X), it is a little different. Running "man 4 icmp" shows you something called "Non-privileged ICMP" which allows you to create an ICMP socket using the "datagram" socket (e.g., SOCK_DGRAM). These sockets only support a limited subset of ICMP, but it is sufficient for sending echo requests and receiving echo replies.

HOOK: <ping-port> os ( inet -- port )

M: object <ping-port> <raw> ;

M: macosx <ping-port> <datagram> ;

Putting this all together, we can implement a ping word that looks up an IPv4 address for the specified hostname, and then sends a ping, using io.timeouts to wait up to one second for the response.

: ping ( host -- reply )
    <icmp> resolve-host [ icmp4? ] filter random
    f <icmp4> <ping-port>
        1 seconds over set-timeout
    [ [ send-ping ] [ recv-ping ] 2bi ] with-disposal ;

For convenience, we make a word to ping the IPv4 localhost address.

: local-ping ( -- reply )
    "127.0.0.1" ping ;

And a word that just checks to see if a host is alive, returning t (true) or f (false).

: alive? ( host -- ? )
    [ ping drop t ] [ 2drop f ] recover ;

This code is in extra/ping in the Factor repository. Currently, it supports only IPv4 addresses, but could be modified to support IPv6 (using RFC 2463 which requires some modifications to use a "pseudo-header" in the checksum calculation).

Friday, February 4, 2011

Maximum/Minimum

For some Factor code I was writing recently, I needed to find the "maximum" and "minimum" of some sequences of values. The "obvious" way to do this is to use the supremum and infimum words from the sequences vocabulary. Those words were not quite what I wanted, so I thought I would write some thoughts about it:

First, an example to show how they work:

( scratchpad ) { 1 2 3 4 } supremum .
4

( scratchpad ) { 1 2 3 4 } infimum .
1

Let's say we create a person tuple with a name and age.

TUPLE: person name age ;

Now, let's say we have a list of people (e.g., a sequence of person objects):

CONSTANT: PEOPLE {
    T{ person f "Jim" 30 }
    T{ person f "Sally" 27 }
    T{ person f "Rebecca" 32 }
    T{ person f "James" 28 }
    T{ person f "Benjamin" 22 }
}

What if we'd like to find out who the oldest person is? We could try using the supremum word:

( scratchpad ) PEOPLE supremum
Generic word <=> does not define a method for the person class.
Dispatching on object: T{ person f "Sally" 27 }

Whoops! The supremum word uses the max generic word, which in turn defaults to using the after? generic word, which in turn defaults to using the <=> generic word (from the math.order vocabulary) to compare two values for +lt+, +eq+, or +gt+.

So, let's define that word for the person types:

M: person <=> [ age>> ] bi@ <=> ;

Okay, let's try again:

( scratchpad ) PEOPLE supremum .
T{ person f "Rebecca" 32 }

Alright, now we also want to find the person with the shortest name. Well, we could use infimum, but it, like supremum, delegates to <=> which (from our previous definition) currently orders by a person's age. Let's redefine <=> to compare based on length of a person's name:

M: person <=> [ name>> length ] bi@ <=> ;

Okay, let's try it now:

( scratchpad ) PEOPLE infimum .
T{ person f "Jim" 30 }

It worked! But, this is quite awkward to change the definition of <=> for each purpose. The problem here is that supremum and infimum (and the <=> word they use) expect to be used for the natural ordering of objects, not for arbitrary orderings.

Instead, we can define two new words, max-by and min-by, that use a provided quotation to obtain a value that is then used to compare two objects (e.g., in our first example, age, and in our second, the length of name).

: max-by ( obj1 obj2 quot: ( obj -- n ) -- obj1/obj2 )
    [ bi@ [ max ] keep eq? not ] curry most ; inline

: min-by ( obj1 obj2 quot: ( obj -- n ) -- obj1/obj2 )
    [ bi@ [ min ] keep eq? not ] curry most ; inline

With these, we can define the words, maximum and minimum, which use these to retrieve the object with "maximum" or "minimum" characteristics (using the specified quotation).

: maximum ( seq quot: ( ... elt -- ... n ) -- elt )
    [ keep 2array ] curry
    [ [ first ] max-by ] map-reduce second ; inline

: minimum ( seq quot: ( ... elt -- ... n ) -- elt )
    [ keep 2array ] curry
    [ [ first ] min-by ] map-reduce second ; inline

Using these, we can easily find our answers:

( scratchpad ) PEOPLE [ age>> ] maximum .
T{ person f "Rebecca" 32 }

( scratchpad ) PEOPLE [ name>> length ] minimum .
T{ person f "Jim" 30 }

It's worth noting that we could define both supremum and infimum in terms of our new words:

: supremum ( seq -- elt ) [ ] maximum ;

: infimum ( seq -- elt ) [ ] minimum ;

But, because of the wrapping and unwrapping that we do inside of maximum and minimum, it's a little less efficient than the current implementation.