DEV Community

Sumeet Bansal
Sumeet Bansal

Posted on

Finding mean, median, and mode using SQL.

It is very often that we need to calculate averages in our data for various analysis tasks. It quite easy when it come to calculating Mean, Median and Mode in tools such as Excel with the help of inbuilt functions.

In this this blog we will discuss ways to calculate these values using just SQL.

Let us consider table below to find mean, median, and mode of salaries department wise for all the departments.

╔═══════╦════════╦════════╗
║ empid ║ deptid ║ salary ║
╠═══════╬════════╬════════╣
║ 1     ║ 1      ║ 12000  ║
╠═══════╬════════╬════════╣
║ 2     ║ 2      ║ 3400   ║
╠═══════╬════════╬════════╣
║ 3     ║ 3      ║ 45000  ║
╠═══════╬════════╬════════╣
║ 4     ║ 4      ║ 1320   ║
╠═══════╬════════╬════════╣
║ 5     ║ 4      ║ 5600   ║
╠═══════╬════════╬════════╣
║ 6     ║ 2      ║ 2700   ║
╠═══════╬════════╬════════╣
║ 7     ║ 1      ║ 12000  ║
╠═══════╬════════╬════════╣
║ 8     ║ 3      ║ 34000  ║
╠═══════╬════════╬════════╣
║ 9     ║ 4      ║ 4600   ║
╠═══════╬════════╬════════╣
║ 10    ║ 1      ║ 9000   ║
╠═══════╬════════╬════════╣
║ 11    ║ 1      ║ 13000  ║
╠═══════╬════════╬════════╣
║ 12    ║ 2      ║ 12000  ║
╠═══════╬════════╬════════╣
║ 13    ║ 4      ║ 1320   ║
╠═══════╬════════╬════════╣
║ 14    ║ 3      ║ 24000  ║
╠═══════╬════════╬════════╣
║ 15    ║ 3      ║ 45000  ║
╠═══════╬════════╬════════╣
║ 16    ║ 2      ║ 3400   ║
╠═══════╬════════╬════════╣
║ 17    ║ 2      ║ 1230   ║
╠═══════╬════════╬════════╣
║ 18    ║ 1      ║ 15000  ║
╚═══════╩════════╩════════╝
Enter fullscreen mode Exit fullscreen mode

Finding mean is quite easy as it is just the average of all the salary for a department and we can use the AVG function for that.

SELECT deptid
    ,AVG(SALARY) AS mean_salary
FROM employee
GROUP BY deptid
Enter fullscreen mode Exit fullscreen mode

Calculating mode of salary for each department will be a bit tricky as we don't have inbuilt SQL functions for that.

Steps for it are :

  1. For this we start by calculating salary count for each unique salary in each department.

  2. Then we rank the salary by its frequency (salary count) in each department.

salary_counts AS (
    SELECT deptid,
        salary,
        COUNT(salary) AS salary_counts
    FROM employee 
    GROUP BY deptid, salary 
)
-- ranking salaries by frequency
, mode_salaries AS (
    SELECT deptid,
        salary,
        RANK() OVER( PARTITION BY deptid ORDER BY salary_counts DESC ) AS salary_counts_rank
    FROM salary_counts
)
SELECT deptid,
    salary as mode_salary
FROM mode_salaries
WHERE salary_counts_rank = 1
Enter fullscreen mode Exit fullscreen mode

For calculating median of salary for each department PERCENTILE_DISC SQL function along with GROUP can help us calculate that.

median_salaries AS (
    SELECT deptid,
        PERCENTILE_DISC(0.5) WITHIN GROUP(ORDER BY salary) AS median_salary
    FROM employee
    GROUP BY deptid
)
SELECT deptid,
    median_salary 
FROM median_salaries
Enter fullscreen mode Exit fullscreen mode

And that is how we can calculate mean, median, mode in SQL.

Top comments (0)