DEV Community

Judy
Judy

Posted on

How to Calculate Median on Values That Have Corresponding Quantities

We have a database table TBLCARS as follows:

Image description
We are trying to expand each PRICE value according to the corresponding QUANTITY and calculate the PRICE median.

SQL written in SQL Server:

WITH median_dt AS (
                        SELECT CASE
                                                WHEN sum(quantity) % 2 = 0 THEN sum(quantity) / 2
                                                ELSE sum(quantity) / 2 + 1
                                    END AS lower_limit

                                    , CASE

                                                WHEN sum(quantity) % 2 = 0 THEN sum(quantity) / 2 + 1
                                                ELSE sum(quantity) / 2 + 1
                                    END AS upper_limit
                        FROM t
            ),
            data AS (
                        SELECT *, sum(quantity) OVER (ORDER BY price ASC) AS cum_sum
                        FROM t
            ),
            rnk_val AS (
                        SELECT *
                        FROM (
                                    SELECT price, row_number() OVER (ORDER BY d.cum_sum ASC) AS rnk
                                    FROM data d
                                                JOIN median_dt b ON b.lower_limit <= d.cum_sum
                        ) x
                        WHERE x.rnk = 1
                        UNION ALL
                        SELECT *
                        FROM (
                                    SELECT price, row_number() OVER (ORDER BY d.cum_sum ASC) AS rnk
                                    FROM data d
                                                JOIN median_dt b ON b.upper_limit <= d.cum_sum
                        ) x
                        WHERE x.rnk = 1
            )
SELECT avg(price) AS median
FROM rnk_val
Enter fullscreen mode Exit fullscreen mode

Not difficult at all. We just need to extend the PRICE value by copying it QUANTITY times, concatenate the result records, and calculate the PRICE median. As SQL sets are unordered and the language does not have an median function, we can only invent indexes using a complex method, concatenate result records, and then calculate the median.

It is easy to write the algorithm in the open-source esProc SPL:

Image description
Based on the ordered set, SPL is a professional data computing engine that offers all-round functions for performing set-oriented calculations and a wealth of mathematical functions. It combines merits of Java and SQL, making it convenient to handle the dynamic expansion cases.

Top comments (0)