I thought it would be fun to track the development of the Factor programming language using Reddit. Of course, to do this, I wanted to use Factor.
A few months ago, I implemented a Reddit "Top" program. We can use the code for that as a basis for showing, using Reddit scores, how the activity (or popularity?) of a couple Factor blogs have changed over the last few years:
Stories
We start by creating a story
tuple that will hold all of the properties that Reddit returns for each posted link (including the score -- up votes minus down votes -- which we will be using).
TUPLE: story author clicked created created_utc domain downs hidden id is_self levenshtein likes media media_embed name num_comments over_18 permalink saved score selftext selftext_html subreddit subreddit_id thumbnail title ups url ;
And a word to parse a "data" result into a story object:
: parse-story ( assoc -- obj ) "data" swap at \ story from-slots ;
Paging
The Reddit API provides results as a series of "pages" (until all results are exhausted). This is pretty similar to how the website works, so it should be fairly easy to understand the mechanics. We will define a page
object that holds the current URL, the results from the current page, and links to the pages before and after the current page:
TUPLE: page url results before after ;
We can then build a word to fetch the page (as a JSON response), parse it, and construct a page
object:
: json-page ( url -- page ) >url dup http-get nip json> "data" swap at { [ "children" swap at [ parse-story ] map ] [ "before" swap at [ f ] when-json-null ] [ "after" swap at [ f ] when-json-null ] } cleave \ page boa ;
An easy "paging" mechanism that, given a page of results, can return the next page:
: next-page ( page -- page' ) [ url>> ] [ after>> "after" set-query-param ] bi json-page ;
And, using the make vocabulary, construct all the results into a sequence:
: all-pages ( page -- results ) [ [ [ results>> , ] [ dup after>> ] bi ] [ next-page ] while drop ] { } make concat ;
Domain Stats
We can retrieve all the results and produces a map of years that links were posted to a sum of the scores for each year's links:
: domain-stats ( domain -- stats ) "http://api.reddit.com/domain/%s" sprintf json-page all-pages [ created>> 1000 * millis>timestamp year>> ] group-by [ [ score>> ] map-sum ] assoc-map ;
Finally, a word to display a chart of the scores across a given list of domains:
: domains. ( domains -- ) H{ } clone [ '[ domain-stats [ swap _ at+ ] assoc-each ] each ] keep >alist sort-keys <bar> 160 >>height chart. ;
The code for this is available on my Github.
No comments:
Post a Comment