Monday, October 14, 2013

Kahan Summation

The Kahan summation algorithm is a "compensated summation" that "significantly reduces the numerical error in the total obtained by adding a sequence of finite precision floating point numbers, compared to the obvious approach".

Matt Adereth wrote a blog post a few days ago demonstrated some advantages of using Kahan, which I show below using Factor.

The Problem

To demonstrate the problem, let's consider the harmonic series (e.g., `1 + 1/2 + 1/3 + 1/4 + ...`) as a series of rational numbers:

```: harmonic-ratios ( n -- seq )
[1,b] [ recip ] map ;```

It gives the first `n` of the harmonic series:

```IN: scratchpad 6 harmonic-ratios .
{ 1 1/2 1/3 1/4 1/5 1/6 }```

We define a "simple sum" as just adding the numbers in order:

```: simple-sum ( seq -- n )
0 [ + ] reduce ;```

The simple sum of the first 10,000 harmonic ratios as a floating point number is:

```IN: scratchpad 10,000 harmonic-ratios simple-sum >float .
9.787606036044382```

But, if we use floating points instead of ratios to represent the harmonic numbers (e.g., `1.0 + 0.5 + 0.33333333 + 0.25 + ...`):

```: harmonic-floats ( n -- seq )
harmonic-ratios [ >float ] map! ;```

You can see that an error has been introduced (ending in "`48`" instead of "`82`"):

```IN: scratchpad 10,000 harmonic-floats simple-sum .
9.787606036044348```

If we reverse the sequence and add them from smallest to largest, there is a slightly different error (ending in "`86`"):

```IN: scratchpad 10,000 harmonic-floats reverse simple-sum .
9.787606036044386```

The Solution

The pseudocode for the Kahan algorithm can be seen on the Wikipedia page, using a running compensation for lost low-order bits:

```function KahanSum(input)
var sum = 0.0
var c = 0.0
for i = 1 to input.length do
var y = input[i] - c
var t = sum + y
c = (t - sum) - y
sum = t
return sum```

We could translate this directly using local variables to hold state, but instead I tried to make it a bit more concatenative (and possibly harder to read in this case):

```: kahan+ ( c sum elt -- c' sum' )
rot - 2dup + [ -rot [ - ] bi@ ] keep ;

: kahan-sum ( seq -- n )
[ 0.0 0.0 ] dip [ kahan+ ] each nip ;```

You can see both forward and backward errors no longer exist:

```IN: scratchpad 10,000 harmonic-floats kahan-sum .
9.787606036044382

IN: scratchpad 10,000 harmonic-floats reverse kahan-sum .
9.787606036044382```

Nifty! Thanks, Matt!