DEV Community

Thales Bruno
Thales Bruno

Posted on • Updated on • Originally published at thalesbr.uno

Mean, Median, Mode

In this article, we are talking about the common methods to measure the central tendency of the data, that is a way to explain our data in some manner. Although these measurements are quite simple, they are a foundation for a lot of other more complex measurements.

We will use this fake list of grades of a class below to demonstrate the concepts:

Grades
9.2
7.5
8
9
8.5
8
2.5

We create a Python variable to store the grades:

grades = [9.2, 7.5, 8, 9, 8.5, 8, 2.5]
Enter fullscreen mode Exit fullscreen mode

Mean

The first one is the mean, which is the arithmetic average of the data. To calculate the mean of a set of numeric data, we take the sum of all observations divided by the number of observations.

In Python:

from typing import List

def mean(xs: List[float]) -> float:
  return sum(xs)/len(xs)
Enter fullscreen mode Exit fullscreen mode

So, applying in our grades data: mean(grades) gives us ~7.52.

Median

The second one is the median, which gives us something more positional. The median is the exact central value of a data, so to calculate it we need to:

  • Sort data from smaller to largest, and
  • If ODD length: pick up the middle value
  • If EVEN length: calculate the mean of the two middle values

So, in Python could be like this:

def median(xs: List[float]) -> float:
  s = sorted(xs)
  # If odd lenght, just pick up the middle value
  if len(xs) % 2:
    p = len(xs)//2
    return xs[p]
  # If even lenght, take the mean of the left and right middle values
  else:
    l_p = (len(xs) // 2) - 1
    r_p = len(xs) // 2
    mid_values = [l_p, r_p]
    return mean(mid_values)
Enter fullscreen mode Exit fullscreen mode

This way, the grades sorted looks like this: [2.5, 7.5, 8, 8, 8.5, 9, 9.2] and their median, median(grades), is 8.

Mode

Finally, we have the mode, i.e., the observation that occurs the most in a dataset. Thus, a dataset may have one, multiple, or even none mode.

Python mode function:

from collections import Counter
def mode(xs: List[float]) -> List[float]:
  counter = Counter(xs)
  return [x_i for x_i, count in counter.items() if count == max(counter.values())]
Enter fullscreen mode Exit fullscreen mode

The grade that occurs the most in our dataset is [8] as we can see calling mode(grades)

Python mean, median, and mode

In this article, we made some Python code to illustrate our measurements, just as didactic purposes. By the way, we already have functions in Python libraries that make the job. Bellow, some of them:

Top comments (2)

Collapse
 
mohsenalyafei profile image
Mohsen Alyafei

Thanks. This is good.

Collapse
 
thalesbruno profile image
Thales Bruno

I am glad you liked it!