Wednesday, February 15, 2023

DuckDuckGo

The conversation around the current quality of web search engines, the doomsday predictions about various incumbents, and the equal parts inspiring and challenging rollout of large language models to improve search has been fascinating to watch. There are many challengers in the search engine space, including startups like Kagi and Neeva. One privacy-focused startup that has been fun to follow for a while is DuckDuckGo.

You can see an example of the DuckDuckGo API at api.duckduckgo.com. It does not provide access to their full search results, only to their instant answers. Regardless, I thought it would be neat if we could use this from Factor.
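To load the definitions in this post from a source file rather than typing them into the listener, they need a vocabulary search path. A plausible header might look like the sketch below. It assumes a recent development build where json> lives in the json vocabulary (older releases keep it in json.reader), and the vocabulary name after IN: is only a placeholder:

```factor
! Assumed vocabulary list for the words defined in this post.
! io.encodings.string and io.encodings.utf8 provide decode and utf8,
! io.styles provides write-object.
USING: assocs combinators html.parser html.parser.printer
http.client io io.encodings.string io.encodings.utf8 io.styles
json kernel sequences splitting urls ;
IN: duckduckgo
```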

We can take a search query and turn it into a URL object:

: duckduckgo-url ( query -- url )
    URL" http://api.duckduckgo.com"
        swap "q" set-query-param
        "json" "format" set-query-param
        "1" "pretty" set-query-param
        "1" "no_redirect" set-query-param
        "1" "no_html" set-query-param
        "1" "skip_disambig" set-query-param ;

Using the http.client and json vocabularies, we can fetch the response and parse it into a result set:

: duckduckgo ( query -- results )
    duckduckgo-url http-get nip utf8 decode json> ;
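Before writing any printing words, it can help to poke at these two words in the listener. The lines below are only a quick inspection, assuming the definitions above are loaded and the machine has network access. No output is shown, since the exact parameter encoding and the set of keys depend on the urls vocabulary and on what the API returns:

```factor
! Show the request URL without fetching anything.
"factorcode" duckduckgo-url .

! Fetch the response and list its top-level keys.
"factorcode" duckduckgo keys .
```

The key names that come back ("Heading", "AbstractText", "Results", "RelatedTopics", and so on) are the ones the printing words look up with of.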

We can make a word that prints out the abstract response with clickable links:

: abstract. ( results -- )
    dup "Heading" of [ drop ] [
        swap {
            [ "AbstractURL" of >url write-object nl ]
            [ "AbstractText" of print ]
            [ "AbstractSource" of "- " write print ]
        } cleave nl
    ] if-empty ;

And then a word that prints out a result response, parsing the HTML using the html.parser vocabulary and printing it as text using the html.parser.printer vocabulary:

: result. ( result -- )
    "Result" of [
        "<a href=\"" ?head drop "\">" split1 "</a>" split1
        [ swap >url write-object ]
        [ parse-html html-text. nl ] bi*
    ] when* ;

There are more aspects to the response from the API, but we can initially print out the abstract, the results, and the related topics:

: duckduckgo. ( query -- )
    duckduckgo {
        [ abstract. ]
        [ "Results" of [ result. ] each ]
        [ "RelatedTopics" of [ result. ] each ]
    } cleave ;
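One of those other aspects is that the API appears to sometimes group entries in "RelatedTopics", in which case an entry carries "Name" and "Topics" keys instead of a "Result" key, and result. quietly skips it because of the when*. A minimal sketch of a word that descends one level into such groups, assuming that response shape (topic. is a made-up name for illustration, not part of the code above):

```factor
! Print a related topic, or every entry of a grouped topic.
! Assumes grouped entries keep their items under "Topics".
: topic. ( topic -- )
    dup "Topics" of [ nip [ result. ] each ] [ result. ] if* ;
```

Swapping [ result. ] each for [ topic. ] each in the "RelatedTopics" branch of duckduckgo. would make use of it.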

We can try it out on a topic that this particular blog likes to discuss:

IN: scratchpad "factorcode" duckduckgo.
Factor (programming language)
Factor is a stack-oriented programming language created by Slava
Pestov. Factor is dynamically typed and has automatic memory
management, as well as powerful metaprogramming features. The
language has a single implementation featuring a self-hosted
optimizing compiler and an interactive development environment.
The Factor distribution includes a large standard library.
- Wikipedia

Official site - Factor (programming language)
Concatenative programming languages
Stack-oriented programming languages
Extensible syntax programming languages
Function-level languages
High-level programming languages
Programming languages
Software using the BSD license

This is available on my GitHub.
