DEV Community

Discussion on: Calculating a Moving Average on Streaming Data

edA‑qa mort‑ora‑y

I'm not sure if this corrects the floating point precision problem or swaps it out for a new one.

When N gets large, the original equation accumulates a large sum, so each newly added sample risks losing some of its precision. In the worst case, a sample is too small relative to the sum to change it at all, and is effectively added as 0.

But in the online approach, you're dividing each sample by N and adding the result to the current average. This has the same problem: as N grows, the contribution of each individual sample shrinks, running the risk of it not being significant compared to the current running average.
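
To make the concern concrete, here is a minimal Python sketch. It assumes Python floats are IEEE 754 doubles, and the function names and the specific magnitudes (`1e16`, `1e17`) are made up for illustration; they are not from the article. It shows a sample vanishing in both formulations:

```python
def add_to_running_sum(total, sample):
    # Original approach: keep a running sum, divide by N when needed.
    return total + sample


def update_mean(mean, sample, n):
    # Online approach: nudge the mean by the sample's share, (sample - mean) / n.
    return mean + (sample - mean) / n


# A large accumulated sum absorbs a small sample completely.
sum_unchanged = add_to_running_sum(1e16, 1.0) == 1e16

# With a very large n, the per-sample correction falls below the
# precision of the current mean, so the mean stops moving.
mean_unchanged = update_mean(1.0, 2.0, 1e17) == 1.0

print(sum_unchanged, mean_unchanged)  # True True
```

With doubles, it takes a very long stream to get here; with 32-bit floats, the same effect would show up much sooner.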

Nested Software • Edited

That's an excellent point! It does seem as though the average will basically stop changing once n gets high enough. Off the top of my head, it seems that one could calculate the average over a window of the most recent m values rather than the total n of all the values received so far. Maybe there's a better solution than that one though. What do you think?
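
A rough sketch of that windowed idea in Python might look like the following. The class name `WindowedMean` and its interface are hypothetical, and it re-sums the window on every update (O(m) work) rather than maintaining a running total, which is simpler and avoids accumulating error but may not suit large windows:

```python
from collections import deque


class WindowedMean:
    """Mean of the most recent m samples (illustrative sketch only)."""

    def __init__(self, m):
        if m <= 0:
            raise ValueError("window size m must be positive")
        # A deque with maxlen drops the oldest sample automatically.
        self.window = deque(maxlen=m)

    def update(self, sample):
        self.window.append(sample)
        # Until m samples have arrived, average over what we have so far.
        return sum(self.window) / len(self.window)


w = WindowedMean(3)
means = [w.update(x) for x in [1.0, 2.0, 3.0, 4.0]]
print(means)  # [1.0, 1.5, 2.0, 3.0]
```

Because the divisor never exceeds m, a new sample always carries a weight of at least 1/m, no matter how many values have been received in total.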

edA‑qa mort‑ora‑y

Yes, a windowed solution will work. It depends on the needs of the software. I've used the solution you've provided many times without a problem, in cases where accuracy wasn't vitally important.

Creating higher-precision solutions can be challenging. An alternative, if speed isn't a big issue (which it usually isn't), is to use a high-precision number library, like GMP (libgmp), and use something like 1024-bit floating point numbers. It's what I use for constants in the Leaf compiler.
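
As a rough illustration of the high-precision route, here is a sketch that uses Python's standard `decimal` module as a stand-in. This is not GMP and not what the Leaf compiler does. The 308-digit precision is an assumption chosen to roughly match 1024 bits, and `update_mean_hp` is a hypothetical name:

```python
from decimal import Context, Decimal

# Roughly 1024 bits of precision: 1024 * log10(2) is about 308 decimal digits.
HIGH_PRECISION = Context(prec=308)


def update_mean_hp(mean, sample, n, ctx=HIGH_PRECISION):
    # Same online update as before, mean + (sample - mean) / n,
    # but every operation is rounded to ctx.prec significant digits.
    delta = ctx.subtract(sample, mean)
    return ctx.add(mean, ctx.divide(delta, n))


# A correction of 1e-17 would be lost when added to a mean of 1.0 in a
# double, but it survives here.
new_mean = update_mean_hp(Decimal(1), Decimal(2), Decimal(10) ** 17)
print(new_mean == Decimal(1))  # False
```

The trade-off is speed: each update is considerably slower than hardware floating point, which is fine when the average isn't on a hot path.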