A great little article was posted last year about optimizing 2^x for doubles (by approximation). The author gets 40+% performance improvements in C#. I wondered if we could get similar improvements in Factor.

The basic idea is to use the fact that a number, `a`

can be factored into ^{b+c}`a`

. If we separate a floating point number into two components: an integer and a fractional part, we can show that:^{b} * a^{c}

`2`^{n} = 2^{integer(n)} * 2^{fractional(n)}

.

We are going to approximate this value by using a lookup table to compute the fractional part (within a specified precision). For example, to compute within a 0.001 precision, we need 1000 lookup values, essentially performing this calculation:

2^{n}= ( 1 << Int(n) ) * Table[ (int) ( Frac(n) * 1000 ) ];

## Implementation

So, we need a word that can split a floating point number into those two values:

: float>parts ( x -- float int ) dup >integer [ - ] keep ; inline

Instead of one table with 1000 values, we will copy the original authors decision to use three lookup tables for additional precision. The following code calculates these lookup tables:

CONSTANT: BITS1 10 CONSTANT: BITS2 $[ BITS1 2 * ] CONSTANT: BITS3 $[ BITS1 3 * ] CONSTANT: PRECISION1 $[ 1 BITS1 shift ] CONSTANT: PRECISION2 $[ 1 BITS2 shift ] CONSTANT: PRECISION3 $[ 1 BITS3 shift ] CONSTANT: MASK $[ PRECISION1 1 - ] CONSTANT: FRAC1 $[ 2 PRECISION1 iota [ PRECISION1 / ^ ] with map ] CONSTANT: FRAC2 $[ 2 PRECISION1 iota [ PRECISION2 / ^ ] with map ] CONSTANT: FRAC3 $[ 2 PRECISION1 iota [ PRECISION3 / ^ ] with map ]

The function `pow2`

looks pretty similar to our original mathematical definition:

: pow2 ( n -- 2^n ) >float 2^int 2^frac * >float ;

The guts of the implementation is in the `2^int`

and `2^frac`

words:

: 2^int ( n -- 2^int frac ) [ float>parts ] keep 0 >= [ 1 swap shift ] [ over 0 < [ [ 1 + ] [ 1 - ] bi* ] when 1 swap neg shift 1.0 swap / ] if swap ; inline : 2^frac ( frac -- 2^frac ) PRECISION3 * >fixnum [ BITS2 neg shift FRAC1 nth-unsafe ] [ BITS1 neg shift MASK bitand FRAC2 nth-unsafe ] [ MASK bitand FRAC3 nth-unsafe ] tri * * ; inline

## Testing

Let's try it and see how well it works for small values:

( scratchpad ) 2 1.5 ^ . 2.82842712474619 ( scratchpad ) 1.5 pow2 . 2.82842712474619

It seem's to work, how about larger values:

( scratchpad ) 2 16.3 ^ . 80684.28027297248 ( scratchpad ) 16.3 pow2 . 80684.28026255539

The error is clearly detectable, but to test that it really works the way we expect it too, we will need to calculate relative error:

: relative-error ( approx value -- relative-error ) [ - abs ] keep / ;

Our test case will generate random values, compute 2^{x} using our approximation and verify the relative error is less than 0.000000001 when compared with the correct result:

[ t ] [ 10000 [ -20 20 uniform-random-float ] replicate [ [ pow2 ] [ 2 swap ^ ] bi relative-error ] map supremum 1e-9 < ] unit-test

And to verify performance, we will benchmark it against the built-in (and more accurate `^`

word):

: pow2-test ( seq -- new old ) [ [ pow2 drop ] [ each ] benchmark ] [ 2 swap [ ^ drop ] with [ each ] benchmark ] bi ;

We got almost a 40% performance improvement at the cost of a small loss of precision - not bad!

( scratchpad ) 10000 [ -20 20 uniform-random-float ] replicate pow2-test / >float . 0.6293153754782392

The code for this is on my Github.

## 2 comments:

I think your first displayed equation probably wants to be a product rather than a sum?

@Rupert: Good catch! I've updated it. Thanks!

Post a Comment