Valts Liepiņš

Calculating mean iteratively

Recently I ran into a situation where I wanted to calculate the mean of a set of unknown size. My first, naive idea was to average the current mean with each new value:

xs = [3, 7, 6] # Assuming we don't know the length

mean = xs[0]  # start from the first value
n = 1
while n < len(xs):
  mean = (mean + xs[n])/2  # naive: average the current mean with the next value
  n += 1

This quickly turns out to be wrong:

(3 + 7 + 6)/3 = 5.3333
((3/2 + 7/2)/2 + 6/2) = 5.5
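To see why the naive update fails, it helps to look at the weight each value ends up with. Every averaging step halves the contribution of everything seen before, so for three values the weights are 1/4, 1/4 and 1/2 rather than 1/3 each. A minimal sketch of that check (the function name `naive_mean` is just an illustrative choice):

```python
def naive_mean(xs):
    """Repeatedly average the current result with the next value (the flawed approach)."""
    mean = xs[0]
    for x in xs[1:]:
        mean = (mean + x) / 2
    return mean


xs = [3, 7, 6]
print(naive_mean(xs))      # 5.5
print(sum(xs) / len(xs))   # the true mean, roughly 5.3333
print(3/4 + 7/4 + 6/2)     # 5.5, the same result written as a weighted sum
```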

To find the actual relation between the current mean and the i-th value, I started by comparing the mean of 2 values with the mean of 3 values:

(3 + 7)/2 = 3/2 + 7/2
(3 + 7 + 6)/3 = (3 + 7)/3 + 6/3

From here, it's possible to rewrite the mean of 3 values in terms of the mean of 2 values:

(3 + 7)/3 + 6/3 = 
= (3 + 7)/2 * 2/3 + 6/3 =
= (3/2 + 7/2)*2/3 + 6/3

This is where I noticed the pattern:

mean(i) = mean(i-1) * (i-1)/i + x(i-1)/i
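The pattern is not specific to these three numbers. As a short derivation, assuming 1-based indexing (so that $x_i$ here corresponds to `x(i-1)` in the 0-based notation above) and writing $\bar{x}_i$ for the mean of the first $i$ values:

```latex
\[
\bar{x}_i
  = \frac{1}{i}\sum_{k=1}^{i} x_k
  = \frac{1}{i}\left(\sum_{k=1}^{i-1} x_k + x_i\right)
  = \frac{i-1}{i}\,\bar{x}_{i-1} + \frac{x_i}{i},
\qquad\text{since}\quad \sum_{k=1}^{i-1} x_k = (i-1)\,\bar{x}_{i-1}.
\]
```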

This pattern gives us the correct algorithm for calculating the mean iteratively:

xs = [3, 7, 6] # Assuming we don't know the length

mean = xs[0]
n = 1  # number of values included in the mean so far
while n < len(xs):
  n += 1
  mean = mean*(n-1)/n + xs[n-1]/n  # n plays the role of i in the formula
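When the length really is unknown, for example when values arrive from an iterator, the same update can be wrapped in a generator. The sketch below is one possible way to write it, not the only one; the name `running_means` is an assumption for illustration. It uses the algebraically equivalent form `mean + (x - mean)/n`, which follows from expanding `mean*(n-1)/n + x/n`.

```python
def running_means(values):
    """Yield the mean of all values seen so far, one result per incoming value."""
    mean = 0.0
    for n, x in enumerate(values, start=1):
        # Same update as mean*(n-1)/n + x/n, rearranged into a single correction step
        mean += (x - mean) / n
        yield mean


stream = iter([3, 7, 6])  # an iterator has no len(), so the size is unknown up front
for current_mean in running_means(stream):
    print(current_mean)
```

The last value printed is the mean of the whole stream, and every earlier value is the mean of the prefix seen up to that point.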

While I know that iterative calculation is unnecessary in this example, I found it really handy for implementing a segment growing algorithm, where I decide which pixels to add to the segment based on the current segment's mean value.
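As a rough illustration of that use case, here is a simplified one-dimensional sketch. It is not the actual segmentation code: the function `grow_segment`, the `tolerance` parameter and the rule "accept a pixel if it is within `tolerance` of the current mean" are all assumptions made for the example.

```python
def grow_segment(pixels, start, tolerance):
    """Grow a 1-D segment rightwards from `start` while pixels stay close to the segment mean.

    Hypothetical illustration: `tolerance` and the 1-D layout are assumptions.
    Returns (start, end, mean), where `end` is exclusive.
    """
    mean = float(pixels[start])
    n = 1
    end = start + 1
    while end < len(pixels) and abs(pixels[end] - mean) <= tolerance:
        n += 1
        mean = mean * (n - 1) / n + pixels[end] / n  # iterative mean update
        end += 1
    return start, end, mean


row = [10, 12, 11, 13, 40, 42]
print(grow_segment(row, 0, tolerance=5))  # the segment stops before the jump to 40
```

Because the mean is updated as each pixel is accepted, there is no need to re-sum the whole segment on every step.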

Oldest comments (1)

Paddy3118 • Edited

The normal way is to keep a count of, and the sum of, the numbers so far. The sum divided by the count is the mean at any point.
Other statistics, such as the standard deviation, can be calculated in a similar way.
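A minimal sketch of the count-and-sum approach described in this comment, extended with a running sum of squares for the standard deviation. The class name `RunningStats` and the choice of the population (not sample) standard deviation are assumptions for illustration. Note that the sum-of-squares formula can lose precision for large values with small spread; Welford's algorithm is a commonly used, more stable alternative.

```python
import math


class RunningStats:
    """Track count, sum and sum of squares to report the mean and standard deviation."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.total_sq = 0.0

    def add(self, x):
        self.count += 1
        self.total += x
        self.total_sq += x * x

    def mean(self):
        return self.total / self.count

    def std(self):
        # Population standard deviation: sqrt(E[x^2] - E[x]^2)
        variance = self.total_sq / self.count - self.mean() ** 2
        return math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding errors


stats = RunningStats()
for x in [3, 7, 6]:
    stats.add(x)
print(stats.mean(), stats.std())
```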