SQL 301 (2 Part Series)
Welcome to the world of Window Functions.
Window Functions are closely related to Aggregate Functions. But rather than collapsing all the rows we want into one row, we keep them and add a new column with a running total, rank or moving average. This becomes our 'window frame.'
It becomes clearer with examples and pictures, so if you're interested in using this powerful tool, keep reading.
SQL Server Window Functions come in three main types. These are:
Aggregate Window Functions
These use aggregate functions like SUM, COUNT, MAX, MIN over a set of rows and return a single result from the query.
Ranking Window Functions
These assign a 'rank' to a set of rows and using RANK, DENSE_RANK, ROW_NUMBER, NTILE
Value Window Functions
These use LAG, LEAD, FIRST_VALUE, LAST_VALUE to access a previous row without having to do a self join.
To get started with Window Functions lets begin with the syntax for an Aggregate Window Function and how each part works.
You will need:
The function you want to perform: AVG, SUM, COUNT
An indication you want to use this Function over multiple rows: OVER
How you want to group your rows - PARTITION
How you want to order your rows - ORDER BY
To make sense of the syntax and show the use cases I'm going to tackle some real-life problems faced by a Sales Analyst.
In this example, the Sales Manager has come to us with a request.
He's setting targets for the team for next year and needs 'all the data' from this year's orders. We have the data available in the Sales database and he'd like it in a format that's easier to use than the raw table.
First, he'd like to see daily sales totals but also wants to be able to see each orderid, so he can 'drill down' if need be.
We could tackle this using SUM to total up all the rows on the Orders table, but this collapses down the order details. A Window Function will allow us to see each order with a running total for the Sales Manager.
select sale_date, salesorderid, subtotal, sum(subtotal) over(partition by sale_date order by salesorderid) as total_sales from sales.salesorderheader where orderdate between '2018-01-01 00:00:00:000' and '2018-12-31 00:00:00:000' order by sale_date
By using a Window Function we can see each order for each day and its total, with a running total along the side. Other aggregate functions work the same way, so you can use COUNT, AVG, MIN or MAX, or in combination.
The Sales Manager is back.
He was happy with the table created for setting targets, but now he wants some strategies to increase sales. This time he wants to see his teams sales for 2018 by the customer and dollar value. If he can spot when those big sales were in the past, maybe he could go knock on their door again this year?
To give him the data he needs we need to use a Window Function with a RANK.
select sale_date, salesorderid, subtotal, rank() over(order by subtotal desc) as sales_rank from sales.salesorderheader where orderdate between '2018-01-01 00:00:00:000' and '2018-12-31 00:00:00:000' order by sale_date
Using RANK() in a Window Function ranks each row, in this case, by subtotal. The Sales Manager can now decide if he wants to target those customers again in the coming year.
If there are identical rows, like in the example above, they all receive the same rank and places 2 and 3 are skipped. To get around this we can use DENSE_RANK which would have kept rank 2 and 3 before moving on to fourth place.
There's much more to learn and useful ways to use Window Functions, check out Part Two!
This post originally appeared on helenanderson.co.nz
Photo by Yulianto Poitier from Pexels