Monday, September 23, 2013

YouTube Downloader

YouTube needs no introduction - it hosts many videos and has a virtual firehose of new content uploaded every second of every day. The primary interface is through a web browser with no easy way to download content for offline viewing. Naturally, "YouTube downloader" plugins have become a popular addition to most web browsers.

I thought it would be fun to implement a basic "YouTube downloader" in Factor.

First, we start by obtaining information about a video using the YouTube API, parsing the response as a query string:

USING: assocs http.client kernel math.order sequences splitting
urls urls.encoding ;

CONSTANT: video-info-url URL""

: get-video-info ( video-id -- video-info )
    video-info-url clone
        3 "asv" set-query-param
        "detailpage" "el" set-query-param
        "en_US" "hl" set-query-param
        swap "video_id" set-query-param
    http-get nip query>assoc ;
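To see what query>assoc does with a response body, here is a minimal sketch that runs without any network access. The input string is a made-up example, not real output from the API, and the keys in it are only assumptions for illustration:

```factor
USING: assocs prettyprint urls.encoding ;

! Hypothetical response body (not real API output).
! query>assoc splits on "&", then decodes each key and value,
! turning "+" into a space along the way.
"status=ok&title=My+Video&length_seconds=42" query>assoc
"title" of .
! prints "My Video"
```

The result is an ordinary assoc, which is why the later words can pull fields out of it with of.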

Next, we can get a list of the available video formats:

: video-formats ( video-info -- video-formats )
    "url_encoded_fmt_stream_map" of
    "," split [ query>assoc ] map ;
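As a rough illustration of the same split-and-parse step, the sketch below runs it on a hand-written stream map with two entries. The entries, their keys, and the example.com URLs are hypothetical; a real stream map is assumed to carry more fields per entry:

```factor
USING: assocs kernel prettyprint sequences splitting
urls.encoding ;

! Hypothetical stream map: entries are separated by ",",
! and each entry is itself a URL-encoded query string.
"itag=18&type=video%2Fmp4&url=http%3A%2F%2Fexample.com%2Fa,itag=5&type=video%2Fx-flv&url=http%3A%2F%2Fexample.com%2Fb"
"," split [ query>assoc ] map

! Show the decoded MIME type of each format.
[ "type" of ] map .
! prints { "video/mp4" "video/x-flv" }
```

Splitting on "," before decoding is safe here because any comma inside a value is still percent-encoded at that point.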

A particular video format includes a download URL and a signature that needs to be attached to it to successfully download the video:

: video-download-url ( video-format -- url )
    [ "url" of ] [ "sig" of ] bi "&signature=" glue ;
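A quick sketch of what that produces, using a hypothetical format entry (the URL and signature values are invented for illustration). It assumes the "url" value already contains a query string, so appending with "&" is valid:

```factor
USING: assocs kernel prettyprint sequences ;

! Hypothetical video format, as one element of video-formats.
H{
    { "url" "http://example.com/video?id=1" }
    { "sig" "ABC123" }
}
! glue joins the first two strings with the third in between.
[ "url" of ] [ "sig" of ] bi "&signature=" glue .
! prints "http://example.com/video?id=1&signature=ABC123"
```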

We will use the video title as the filename, but first we sanitize it: remove control characters (code points 0 through 31), remove a few characters that might conflict with your filesystem, and truncate the result to at most 200 characters:

: sanitize ( title -- title' )
    [ 0 31 between? not ] filter
    [ "\"#$%'*,./:;<>?^|~\\" member? not ] filter
    200 short head ;
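For example, here is a small sketch of sanitize applied to a made-up title. The word is repeated so the snippet runs on its own, and the sample title is purely illustrative:

```factor
USING: kernel math.order prettyprint sequences ;

: sanitize ( title -- title' )
    [ 0 31 between? not ] filter
    [ "\"#$%'*,./:;<>?^|~\\" member? not ] filter
    200 short head ;

! The "?", quotes, ":" and "/" are dropped; spaces are kept.
"What? A \"Test\": Part 1/2" sanitize .
! prints "What A Test Part 12"
```

Note that short clamps the length to the size of the string, so titles shorter than 200 characters pass through head unchanged.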

Finally, to download a video, we look up its video info, find the first mp4-formatted video, convert it to a download URL, and then use download-to to save it to a file:

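: download-video ( video-id -- )
    get-video-info [
        ! Pick the first mp4 format and build its signed URL.
        video-formats [ "type" of "video/mp4" head? ] find nip
        video-download-url
    ] [
        ! Derive the output filename from the video title.
        "title" of sanitize ".mp4" append download-to
    ] bi ;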

You can choose a directory to download it to using the with-directory word. For example, downloading a video to your home directory:

IN: scratchpad "~" [ "G8LC8ES6ogw" download-video ] with-directory

The code for this is on my GitHub.
