Thursday, October 23, 2014

cURL

The cURL project provides a command-line tool and a library (libcurl) for transferring data using URL syntax, supporting many (many!) protocols. I recently contributed a simple wrapper for libcurl to Factor and wanted to show a little bit about how it was made.

We have a download-to word that uses our HTTP client to download resources from the web. I wanted to show how to build a similar word to download resources using libcurl.

FFI

We will use the alien vocabulary to interface with the libcurl C library, defining words to initialize a handle, perform a request, and clean up:

TYPEDEF: void CURL

FUNCTION: CURL* curl_easy_init ( ) ;

FUNCTION: int curl_easy_perform ( CURL* curl ) ;

FUNCTION: void curl_easy_cleanup ( CURL* curl ) ;
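For these FUNCTION: declarations to resolve, the vocabulary also needs to name the shared library before them. A minimal sketch of that step, assuming a Linux system where the library file is called libcurl.so (the logical name "curl" and the file name are assumptions, and other platforms would need a different file name):

```factor
USING: alien alien.libraries alien.syntax ;

! Assumption: the shared library is installed as libcurl.so
<< "curl" "libcurl.so" cdecl add-library >>

! Subsequent FUNCTION: declarations are looked up in this library
LIBRARY: curl
```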

Before we perform the request, we will want to set various options to control what request is made, using function aliases to allow passing different types of values based on the numeric key:

FUNCTION-ALIAS: curl_easy_setopt_long
int curl_easy_setopt ( CURL* curl, int option, long value ) ;

FUNCTION-ALIAS: curl_easy_setopt_string
int curl_easy_setopt ( CURL* curl, int option, c-string value ) ;

FUNCTION-ALIAS: curl_easy_setopt_pointer
int curl_easy_setopt ( CURL* curl, int option, void* value ) ;

TYPEDEF: int64_t curl_off_t

FUNCTION-ALIAS: curl_easy_setopt_curl_off_t
int curl_easy_setopt ( CURL* curl, int option, curl_off_t value ) ;

: curl_easy_setopt ( curl option value -- code )
    over enum>number {
        { [ dup 30000 > ] [ drop curl_easy_setopt_curl_off_t ] }
        { [ dup 20000 > ] [ drop curl_easy_setopt_pointer ] }
        { [ dup 10000 > ] [ drop curl_easy_setopt_string ] }
        [ drop curl_easy_setopt_long ]
    } cond ;
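To see which alias a given option number selects, the same range test can be tried on its own. This is only an illustrative sketch: setopt-kind is a hypothetical helper that is not part of the wrapper, and it returns the suffix of the alias that the dispatch above would call.

```factor
USING: combinators kernel math prettyprint ;

! Hypothetical helper mirroring the range checks in curl_easy_setopt
: setopt-kind ( n -- kind )
    {
        { [ dup 30000 > ] [ drop "curl_off_t" ] }
        { [ dup 20000 > ] [ drop "pointer" ] }
        { [ dup 10000 > ] [ drop "string" ] }
        [ drop "long" ]
    } cond ;

10002 setopt-kind .   ! prints "string"
52 setopt-kind .      ! prints "long"
```

The ranges follow libcurl's own numbering, where each option is a type base (0, 10000, 20000, or 30000) plus a small index.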

Factor

We can then use libcurl from a few simple Factor words that present a nice interface to the user, starting with initializing the library and registering a destructor to clean up after we are done:

DESTRUCTOR: curl_easy_cleanup

: curl-init ( -- CURL )
    curl_easy_init &curl_easy_cleanup ;

Some of the functions produce an error code that we should check:

CONSTANT: CURLE_OK 0

: check-code ( code -- )
    CURLE_OK assert= ;
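With assert=, a failure only reports the two numbers that differed. If a readable message is preferred, libcurl's curl_easy_strerror function can translate the code. A possible variant, assuming it lives in the same vocabulary as the declarations above (check-code* is a hypothetical name, not part of the wrapper):

```factor
FUNCTION: c-string curl_easy_strerror ( int errornum ) ;

! Hypothetical variant of check-code that throws libcurl's message
: check-code* ( code -- )
    dup CURLE_OK = [ drop ] [ curl_easy_strerror throw ] if ;
```

Either way, a code of 0 returns normally and any other code, such as 1 (CURLE_UNSUPPORTED_PROTOCOL), raises an error.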

We can set options using the curl_easy_setopt words we defined earlier:

: curl-set-opt ( CURL key value -- )
    curl_easy_setopt check-code ;

Using these, we can set the output file (opening it and registering a destructor to close it) and the URL:

CONSTANT: CURLOPT_FILE 10001
CONSTANT: CURLOPT_URL 10002

DESTRUCTOR: fclose

: curl-set-file ( CURL path -- )
    CURLOPT_FILE swap "wb" fopen &fclose curl-set-opt ;

: curl-set-url ( CURL url -- )
    CURLOPT_URL swap present curl-set-opt ;
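Options below 10000 take a long value and follow the same pattern. As a sketch, here is a word that asks libcurl to follow HTTP redirects; curl-follow-redirects is a hypothetical addition rather than part of the wrapper, while CURLOPT_FOLLOWLOCATION and its value 52 come from libcurl's headers:

```factor
CONSTANT: CURLOPT_FOLLOWLOCATION 52

! Hypothetical word: 52 is below 10000, so the long alias is used
: curl-follow-redirects ( CURL -- )
    CURLOPT_FOLLOWLOCATION 1 curl-set-opt ;
```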

And a word to perform the "curl":

: curl-perform ( CURL -- )
    curl_easy_perform check-code ;

Putting all of that together, we can finally download a URL to a specified local file path:

: curl-download-to ( url path -- )
    [
        curl-init
        [ swap curl-set-file ]
        [ swap curl-set-url ]
        [ curl-perform ] tri
    ] with-destructors ;

Using it is pretty simple:

IN: scratchpad "http://factorcode.org" "/tmp/factor.html"
               curl-download-to