## Introduction

Navigating the world of data often means working in scenarios where **not all data points carry the same importance**.

This is where the **weighted average**, a statistical tool that assigns importance to each value, helps us incorporate the **context of a situation** into our average calculations!

```
import numpy as np
```

With Python's **versatile ecosystem** we're able to leverage tools such as `numpy` to quickly and efficiently calculate the weighted average in our analyses and data projects.

## Table of contents

- Prerequisites and installation
- What is the weighted average?
- Examining a simple example
- Using np.average to calculate weighted average
- Conclusion
- Additional resources

## Prerequisites and installation

The following package is a **prerequisite installation** for following along with this blog post!

To install it open your preferred **terminal/console** and run:

```
pip3 install numpy
```

## What is the weighted average?

The weighted average is an extension of the typical arithmetic mean that includes the importance (or **weight**) of each data point when calculating the average.

In scenarios where all data points have the same importance, the weighted average **simplifies** to the standard arithmetic mean. However, when the significance of each data point varies, the **weighted average** becomes a vital tool.
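As a rough sketch of the definition, the weighted average is the sum of each value multiplied by its weight, divided by the sum of the weights. The helper name `weighted_mean` and the sample numbers below are made up purely for illustration, and the code is plain Python rather than anything from `numpy`:

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


values = [2, 4, 6]  # illustrative numbers, not from a real dataset

# Equal weights: the result matches the ordinary arithmetic mean
equal = weighted_mean(values, [1, 1, 1])
plain = sum(values) / len(values)

# Unequal weights: the result is pulled toward the heavily weighted value
skewed = weighted_mean(values, [1, 1, 8])

print(equal, plain, skewed)
```

With equal weights both calculations agree, while giving the last value a weight of 8 pulls the result toward 6.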

## Examining a simple example

Let's consider an example where we are a **data scientist** employed by a university to calculate the average student grade across all classes in the school.

To preserve the privacy of individual students, we are only provided data **aggregated** at the class level and are thus given each individual class's:

- average grade
- number of students

Our initial instinct might be to just take the usual average across all classes, but what happens when comparing small classes to very large classes?

If a class has an average test score of 20/100 but only has 4 students, is it fair to compare it to a class that has an average test score of 93/100 and 500 students? No!

If we did that, the small class would be given an **outsized** level of importance, as the test grades of just 4 students should not impact the overall mean as much as those of 500 students.

So how do we incorporate the number of students into our university grade average?

With the **weighted average**!
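To make the difference concrete, here is a small sketch in plain Python using only the two classes described above (4 students averaging 20, and 500 students averaging 93). The variable names are just illustrative:

```python
# The two classes described above: (average grade, number of students)
small_class = (20, 4)
large_class = (93, 500)

# Simple average treats both classes as equally important
simple_average = (small_class[0] + large_class[0]) / 2

# Weighted average scales each class average by its student count
total_points = small_class[0] * small_class[1] + large_class[0] * large_class[1]
total_students = small_class[1] + large_class[1]
weighted_average = total_points / total_students

print(simple_average)              # 56.5
print(round(weighted_average, 2))  # 92.42
```

The simple average lands at 56.5, far below what almost every student actually scored, while the weighted version stays close to the large class's 93.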

## Using np.average to calculate weighted average

Continuing with the previous example, let's say these are the `grades` and their respective `number_of_students` per class:

```
grades = [20, 93, 56, 79, 100, 86]
number_of_students = [4, 500, 93, 274, 12, 30]
```

To get the weighted average across the entire university using `numpy`, all we have to do is pass the weights to `np.average`:

```
import numpy as np

# Weight each class average by its number of students
university_average = np.average(grades, weights=number_of_students)
print(university_average)
# Output: 84.57174151150055
```
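As a sanity check, the same result can be reproduced by hand. The sketch below (my own addition, reusing the example data) compares a manual sum-of-products calculation against `np.average`, and also shows what the unweighted `np.mean` would give if class sizes were ignored:

```python
import numpy as np

grades = np.array([20, 93, 56, 79, 100, 86])
number_of_students = np.array([4, 500, 93, 274, 12, 30])

# Manual weighted average: sum(grade * students) / sum(students)
manual_average = np.sum(grades * number_of_students) / np.sum(number_of_students)

# np.average with the same weights should agree with the manual calculation
numpy_average = np.average(grades, weights=number_of_students)

# The unweighted mean ignores class sizes entirely
unweighted_mean = np.mean(grades)

print(np.isclose(manual_average, numpy_average))  # True
print(round(float(unweighted_mean), 2))           # 72.33
```

The gap between roughly 72.33 and 84.57 comes from the unweighted mean giving the 4-student class the same say as the 500-student class.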

## Conclusion

And just like that, we're able to quickly incorporate the weighted average into our projects by leveraging the `weights` argument of `np.average`.

**Thanks so much for reading** and if you liked my content, be sure to check out some of my other work or **connect** with me on social media or my **personal website** 😄

Cheers!
