Thursday, April 19, 2012

Twin Primes

The most recent programming challenge from Programming Praxis is to:

Pairs of prime numbers that differ by two are known as twin primes: (3,5), (5,7), (11,13), (17,19), (29,31), (41,43), (59,61), (71,73),... Your task is to write a function that finds the twin primes less than a given input n.

Samuel Tardieu has contributed many improvements to Factor's math.primes vocabulary, which we will be using to solve this puzzle.

We can solve this puzzle in the naive method by computing all prime numbers up to a specified input, and then filtering them for pairs that differ by two:

: twin-primes-upto ( n -- seq )
    primes-upto 2 <clumps> [ first2 - abs 2 = ] filter ;

Another nice word would check to see whether any two numbers are twin primes (using short-circuit combinators to exit early if any of the conditions are not satisfied):

: twin-primes? ( x y -- ? )
    { [ - abs 2 = ] [ nip prime? ] [ drop prime? ] } 2&& ;

The puzzle page suggests a more efficient method for computing twin primes, which might be worth experimenting with...

Thursday, April 5, 2012

Faster Big Ratios!

While working on a post about approximating pi, I noticed that the performance of Factor's large rational numbers was less than stellar.

Specifically, we defined this estimation function to find pi as a rational number:

:: find-pi-to ( accuracy -- n approx )
    1 4 [
        over [ 2 * 1 + ] [ odd? [ neg ] when ] bi
        4 swap / [ + ] keep
        abs accuracy >= [ 1 + ] 2dip
    ] loop ;

Using this for an accuracy of 0.001 was incredibly slow:

IN: scratchpad [ 0.001 find-pi-to ] time
Running time: 1.692090043 seconds


In Factor, large integers are stored as arbitrary-precision "bignums". Similar to other languages such as Python and Scheme, Factor stores these numbers as a sequence of large (either 30-bit or 62-bit) digits, and then performs math on this sequence of digits.

It turns out that most of the ratios has both a bignum numerator and bignum denominator. For most basic math operation on ratios, Factor would compute the greatest common divisor to produce a normalized fraction (e.g., 6/4 would become 3/2).

For an accuracy of 0.001 in this example, the denominator had more than 1,700 digits in base 10. As these numbers grew, the performance of Euclid's GCD algorithm began degrading particularly due to the (relatively) expensive calculation of "mod" for bignum's.

Lehmer GCD

I noticed a Python bug report that suggested addressing a similar performance problem for their rational number implementation (in by implementing Lehmer's GCD algorithm.

After implementing a bignum-gcd primitive that used Lehmer's GCD, I created a fast-gcd word that used this for bignum's and the current gcd word for other real numbers.

Performance improvement was impressive! After recompiling with these patches, our original test case takes less than 10% of the time!

IN: scratchpad [ 0.001 find-pi-to ] time
Running time: 0.156713844 seconds

You can find this in the latest development version of Factor.