# Calculating mean iteratively

### Valts Liepiņš ・1 min read

Recently I ran into a situation where I wanted to calculate the mean of a set of unknown size. My first, naive idea was to take the average of the current mean and the new value:

```
xs = [3, 7, 6]  # Assuming we don't know the length
mean = xs[0]
n = 1
while n < len(xs):
    # Naive (incorrect) update: average the current mean with the new value
    mean = (mean + xs[n])/2
    n += 1
```

This quickly turns out to be wrong:

```
(3 + 7 + 6)/3 = 5.3333
((3/2 + 7/2)/2 + 6/2) = 5.5
```

To find the actual relation between the current mean and the i-th value, I compared the mean of two values with the mean of three values:

```
(3 + 7)/2 = 3/2 + 7/2
(3 + 7 + 6)/3 = (3 + 7)/3 + 6/3
```

From here, it's possible to rewrite the mean of three values in terms of the mean of two values:

```
(3 + 7)/3 + 6/3 =
= (3 + 7)/2 * 2/3 + 6/3 =
= (3/2 + 7/2)*2/3 + 6/3
```

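This is where I noticed the pattern, with `mean(i)` being the mean of the first i values and `x` zero-indexed, so that `x(i-1)` is the i-th value: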

```
mean(i) = mean(i-1) * (i-1)/i + x(i-1)/i
```
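The pattern is not specific to the three-value example. Here is a short derivation for any i ≥ 2, keeping the zero-indexed convention of the pattern above (`x(i-1)` is the i-th value). The symbol \mu_i for the mean of the first i values is my own notation, not something from the original working:

```latex
\begin{aligned}
\mu_i &= \frac{1}{i}\sum_{k=0}^{i-1} x_k
       = \frac{1}{i}\left(\sum_{k=0}^{i-2} x_k + x_{i-1}\right) \\
      &= \frac{i-1}{i}\cdot\frac{1}{i-1}\sum_{k=0}^{i-2} x_k + \frac{x_{i-1}}{i}
       = \mu_{i-1}\,\frac{i-1}{i} + \frac{x_{i-1}}{i}
\end{aligned}
```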

This gives us the correct algorithm for calculating the mean iteratively:

```
xs = [3, 7, 6]  # Assuming we don't know the length
mean = xs[0]
n = 1
while n < len(xs):
    n += 1
    # n is now the count including the new value xs[n-1]
    mean = mean*(n-1)/n + xs[n-1]/n
```
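Since the whole point is not knowing the length, the same recurrence can also be applied to a stream that never exposes `len()`. What follows is only a minimal sketch. The generator name `running_means` is made up for illustration, and I start the mean at `0.0` so that the first value needs no special case:

```python
def running_means(values):
    """Yield the mean of all values seen so far, one result per input value."""
    mean = 0.0
    for n, x in enumerate(values, start=1):
        # Same recurrence as above: old mean weighted by (n-1)/n, new value by 1/n
        mean = mean * (n - 1) / n + x / n
        yield mean


# iter(...) stands in for a source whose length we cannot ask for
means = list(running_means(iter([3, 7, 6])))
print(means)
```

The last element of `means` is the mean of all three values, and every earlier element is the mean at that point in the stream.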

While I know that in this example iterative calculation is unnecessary, I found it really handy for implementing a segment growing algorithm, where I decide which pixels to add to the segment based on the current segment's mean value.
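To make the segment growing use case a bit more concrete, here is a rough sketch of how the iterative mean could fit in. This is not my actual implementation: the function name `grow_segment`, the `tolerance` parameter, the 4-connected neighbourhood and the tiny test image are all assumptions chosen for illustration.

```python
from collections import deque


def grow_segment(image, seed, tolerance):
    """Grow a segment from `seed`, adding 4-connected pixels whose value
    is within `tolerance` of the segment's current mean."""
    rows, cols = len(image), len(image[0])
    segment = {seed}
    mean, n = float(image[seed[0]][seed[1]]), 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # outside the image
            if (nr, nc) in segment:
                continue  # already part of the segment
            value = image[nr][nc]
            if abs(value - mean) <= tolerance:
                # Update the segment mean iteratively, no re-summing needed
                n += 1
                mean = mean * (n - 1) / n + value / n
                segment.add((nr, nc))
                queue.append((nr, nc))
    return segment, mean


image = [
    [10, 11, 50],
    [12, 10, 52],
    [49, 51, 50],
]
segment, mean = grow_segment(image, seed=(0, 0), tolerance=5)
print(sorted(segment), mean)
```

Because the mean is updated each time a pixel is accepted, the acceptance test always compares against the segment as it currently stands, without looping over all of its pixels again.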

The normal way is to keep a count and a running sum of the numbers seen so far. The sum divided by the count is the mean at any point.
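A minimal sketch of that sum-and-count approach might look like this. The class name `RunningMean` and its method names are made up for the example:

```python
class RunningMean:
    """Track a running sum and count; the mean is derived on demand."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, x):
        self.total += x
        self.count += 1

    @property
    def mean(self):
        if self.count == 0:
            raise ValueError("no values added yet")
        return self.total / self.count


rm = RunningMean()
for x in [3, 7, 6]:
    rm.add(x)
print(rm.mean)
```

This gives the same result as the recurrence in the article. The trade-off is that it stores a sum that keeps growing with the input, while the recurrence only stores the current mean and the count.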

Other statistics, such as the standard deviation, can be calculated incrementally in a similar way.
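One well-known way to do this for the standard deviation is Welford's online algorithm, which is a swapped-in technique and not something derived in the article. The sketch below assumes the population standard deviation is wanted (dividing by `n`). The class name `RunningStats` is hypothetical.

```python
import math


class RunningStats:
    """Running mean and standard deviation using Welford's algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared differences from the current mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        # Equivalent to mean*(n-1)/n + x/n, written as a correction term
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        """Population standard deviation of the values added so far."""
        if self.n == 0:
            raise ValueError("no values added yet")
        return math.sqrt(self.m2 / self.n)


stats = RunningStats()
for x in [3, 7, 6]:
    stats.add(x)
print(stats.mean, stats.std())
```

For the sample standard deviation, the divisor in `std` would be `self.n - 1` instead, which requires at least two values.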