DEV Community

Cover image for Calculating a Moving Average on Streaming Data

Calculating a Moving Average on Streaming Data

Nested Software on March 20, 2018

Recently I needed to calculate some statistics (the average and the standard deviation) on a stream of incoming data. I did some research about it,...
Collapse
 
mortoray profile image
edA‑qa mort‑ora‑y

I'm not sure if this corrects the floating point precision problem or swaps it out for a new one.

When N gets large, the original equation suffers from having a large accumulated value, thus running the risk of added samples not preserving their full precision -- in the worst case they become 0.

But, in the online approach, you're dividing each sample by N and adding to the current average. This has the same problem that as N grows the value of individual samples decreases, running the risk of not being significant compared to the current running average.

Collapse
 
nestedsoftware profile image
Nested Software • Edited

That's an excellent point! It does seem as though the average will basically stop changing once n gets high enough. Off the top of my head, it seems that one could calculate the average over a window of the most recent m values rather than the total n of all the values received so far. Maybe there's a better solution than that one though. What do you think?

Collapse
 
mortoray profile image
edA‑qa mort‑ora‑y

Yes, a windowed solution will work. It depends on the needs of the software. I've use the solution you've provided many times without problem, where accuracy wasn't vitally important.

Creating higher precision solutions can be challenging. An alternative, if speed isn't a big issue (which it usually isn't), is to use high precision number library, like libGmp, and use something like 1024bit floating point numbers. It's what I use for constants in the Leaf compiler.

Collapse
 
sam_ferree profile image
Sam Ferree

I might be missing something, but why not save the sum, and run the update function like so?

update(newValue) {
    this.count++;
    this.sum += newValue;
    this._mean = this.sum/this.count;
}

If you have a problem where a large accumulated value, and a desire for full precision are both in scope, then you simply use a datatype better suited to handle this than float.

Collapse
 
nestedsoftware profile image
Nested Software • Edited

Yes, if you are okay accumulating the sum like this, it’s fine. With floating point values, this approach can have issues, but if it meets your needs (e.g. using a different data type as you mentioned) then it’s okay too. Everything depends on one’s particular use case. As @edA-qa mentioned, the approach I present here can have some problems too when n gets big, so it’s not perfect either. I believe it can in turn be corrected with, for example, something like an exponential moving average, but that’s a whole other story. There are many ways to skin a cat!

Collapse
 
slidenerd profile image
slidenerd • Edited

A much faster way to do that JS version is to get rid of reduce and use the plain old for loop, I could setup a JSPERF test to prove that the plain old for loop kicks ass when it comes to performance compared to map filter reduce and the other crappy ES6 methods, mentioning this since this post says "real time"

Also a better approach is
sum = sum - oldestValue
sum = sum + newestValue

where oldestValue is just beyond the moving average s period

Collapse
 
melroy89 profile image
Melroy van den Berg

Wow impressive, thanks!! I never learned so much from a single article.

Collapse
 
nestedsoftware profile image
Nested Software

Thank you so much!

Collapse
 
mikerg profile image
mikerg • Edited

this is awesome - do you know if this technique could be applied to computing a percentile such as 90% in web log analysis?

Collapse
 
nestedsoftware profile image
Nested Software • Edited

In the next article, I will show how to calculate variance and standard deviation incrementally. Assuming your distribution is applicable, that could work I think.

Collapse
 
noblebe4st profile image
Jeff Hall

Very cool, thanks for sharing. Mind you, I’ll have to read this a couple more times, and then I’ll be waiting anxiously for anything resembling a nail to use this hammer on.

Collapse
 
nestedsoftware profile image
Nested Software

Thank you!