Long-running averages, without the sum of preceding values

Here’s a little lunch-time diversionary math.  Suppose you want a function that takes a number, and returns the average of all the numbers it’s been called with so far.  Handy for continuously updated displays, that kind of thing. Here’s a method that will return this averaging function.

private static Func MakeAverager()
{
    float sum = 0;
    int count = 0;
    return delegate(float x)
    {
        sum += x;
        count += 1;
        return sum/count;
    };
}

It creates sum and count variables for the function to close over.  The function takes a number, x, and adds it to sum.  It increments count, and divides.  Pretty standard.

Now, let’s get crazy, and pretend this code is going on Voyager, and it’ll be running for ever.  sum will get pretty high, right?  We’ll blow through 231, the upper-bound for .Net 32-bit ints.  Sure, we could make it a long, and go up to 263, but that’s not the point.  The point is, it’ll eventually run out, because sum is too high.

I’ve been chewing on this brain-teaser for a while.  I knew there must be a way to calculate a long-running average without storing the sum and the count; it seems the average so far, and the count, should be enough, but I don’t want to resort to ((average * count) + x) / count++, because that’s the exact same problem. (Of course, count could still get so high it overflows, but that’s somewhat less likely. Hush, you.)

I finally sat down and figured it out.  The trick is, each successive x tugs your average up or down — first by a lot, but by less over time.  With each x, the average gets harder to move:  the effect each new x has on the average is inversely proportionate to the count.  We can put it like this:

average += (x - average) / count

We tug average by x - average, the distance between them, scaled down by count.  Then, add that on to average (of course, if x < average, then x – average is negative, so it’ll tug the average down).

Let’s make a new averager.

private static Func MakeNewAverager()
{
    float average = 0;
    int count = 0;
    return delegate(float x)
    {
        average += (x - average) / ++count;
        return average;
    };
}

It works the same, but it’ll take a lot longer for count to overflow than sum.

For the record, here’s the ruby I used to sketch this idea out.  Of course, in ruby, this problem is even more bogus, because ruby’s Bignum can handle any number that your machine has enough free RAM to store.  But still.

def make_averager
    sum, count = 0, 0
    lambda { |x|
        sum += x
        count += 1
        sum.to_f / count
    }
end

def make_sum_free_averager
    avg = 0.0
    count = 0
    lambda { |x|
        count += 1
        avg += (x - avg) / count
    }
end