Calculating mean iteratively

#algorithms #python

Recently I ran into a situation where I wanted to calculate the mean of a set of unknown size. My first naive idea was to average the current mean with each new value:

xs = [3, 7, 6] # Assuming we don't know the length

mean = xs[0]
n = 1
while n < len(xs):
  mean = (mean + xs[n])/2
  n += 1

Which quickly turns out to be simply wrong:

(3 + 7 + 6)/3 = 5.3333
((3/2 + 7/2)/2 + 6/2) = 5.5

To find the actual relation between the current mean and the i-th value, I started by comparing the mean of 2 values with the mean of 3 values:

(3 + 7)/2 = 3/2 + 7/2
(3 + 7 + 6)/3 = (3 + 7)/3 + 6/3

From here, it's possible to rewrite the mean of 3 values in terms of the mean of 2 values:

(3 + 7)/3 + 6/3 = 
= (3 + 7)/2 * 2/3 + 6/3 =
= (3/2 + 7/2)*2/3 + 6/3

Where I noticed the pattern (mean(i) is the mean of the first i values; x is zero-indexed, so x(i-1) is the i-th value):

mean(i) = mean(i-1) * (i-1)/i + x(i-1)/i
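
The pattern is not specific to this example. A short derivation of the general case, writing the values one-indexed as x_1, ..., x_i (so x_i here corresponds to x(i-1) in the zero-indexed notation above) and \bar{x}_i for mean(i):

```latex
\begin{aligned}
\bar{x}_i &= \frac{1}{i}\sum_{k=1}^{i} x_k
           = \frac{1}{i}\left(\sum_{k=1}^{i-1} x_k + x_i\right) \\
          &= \frac{1}{i}\left((i-1)\,\bar{x}_{i-1} + x_i\right)
           = \bar{x}_{i-1}\,\frac{i-1}{i} + \frac{x_i}{i}
\end{aligned}
```

The only step that does any work is replacing the sum of the first i-1 values with (i-1) times their mean.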

Which gives us the correct algorithm for iterative calculation of the mean:

xs = [3, 7, 6] # Assuming we don't know length

mean = xs[0]
n = 1
while n < len(xs):
  n += 1
  mean = mean*(n-1)/n + xs[n-1]/n
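
The same update can be wrapped in a generator, so that it works on any iterable whose length really is unknown (a stream, a file, values arriving one at a time). This is a minimal sketch; the name `running_mean` is made up for illustration and does not come from any library:

```python
def running_mean(values):
    """Yield the mean of all values seen so far, one result per input value."""
    mean = 0.0
    for n, x in enumerate(values, start=1):
        # Same recurrence as the pattern above: mean(n) = mean(n-1)*(n-1)/n + x/n
        mean = mean * (n - 1) / n + x / n
        yield mean


# iter() hides the length, to mimic a source of unknown size
means = list(running_mean(iter([3, 7, 6])))
```

For the example list, `means` goes through 3.0 and 5.0 and ends at roughly 5.3333, matching (3 + 7 + 6)/3.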

While I know that iterative calculation is unnecessary in this example, I found it really handy for implementing a segment growing algorithm, where I decide which pixels to add to the segment based on the current segment's mean value.
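
The post does not show the segment growing code, so the following is only a guess at the idea, reduced to one dimension: a segment grows to the right for as long as the next pixel stays within a tolerance of the current segment mean. The function `grow_segment`, its parameters, and the sample row are all assumptions made for illustration:

```python
def grow_segment(pixels, start, tolerance):
    """Grow a segment rightwards from `start` while pixels stay near the mean.

    Hypothetical 1-D simplification. Returns the exclusive end index of the
    segment and the final segment mean.
    """
    mean = float(pixels[start])
    n = 1
    end = start + 1
    while end < len(pixels) and abs(pixels[end] - mean) <= tolerance:
        n += 1
        # Update the segment mean without re-summing the whole segment
        mean = mean * (n - 1) / n + pixels[end] / n
        end += 1
    return end, mean


row = [10, 12, 11, 13, 40, 42]  # made-up intensities
end, mean = grow_segment(row, start=0, tolerance=5)
```

Here the segment covers the first four pixels (mean about 11.5) and stops at the jump to 40. A real image version would have to visit neighbours in two dimensions, but the mean update stays the same.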

Top comments (1)

Paddy3118 • Jan 19 '20 • Edited

The normal way is to keep a count and a running sum of the numbers so far. The sum divided by the count is the mean at any point.
Other statistics, such as the standard deviation, can be calculated in a similar way.
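
The count-and-sum approach from the comment above can be sketched as follows. The class `RunningStats` is a hypothetical helper, and tracking a sum of squares for the standard deviation is one assumed reading of "in a similar way"; it is simple, but it can lose precision when the values are large relative to their spread:

```python
import math


class RunningStats:
    """Hypothetical helper: tracks count, sum and sum of squares."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.total_sq = 0.0

    def add(self, x):
        self.count += 1
        self.total += x
        self.total_sq += x * x

    def mean(self):
        # Raises ZeroDivisionError if nothing has been added yet
        return self.total / self.count

    def std(self):
        # Population standard deviation: sqrt(E[x^2] - E[x]^2).
        # max() guards against a tiny negative value caused by rounding.
        m = self.mean()
        return math.sqrt(max(self.total_sq / self.count - m * m, 0.0))


stats = RunningStats()
for x in [3, 7, 6]:
    stats.add(x)
```

After the loop, `stats.mean()` is again roughly 5.3333, and `stats.std()` is available at any point without storing the values.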

DEV Community
