In PostgreSQL, window functions, also known as windowing or analytic functions, are a powerful feature for performing calculations across a set of rows related to the current row within the result set. Window functions are commonly used for tasks that involve ranking, aggregation, and comparing values within a specific window of rows. These functions provide more flexibility and control over querying and analyzing data compared to traditional aggregate functions.
Here are some key characteristics and concepts related to window functions in PostgreSQL:
1.Partitioning: Window functions often involve partitioning the result set into subsets or groups of rows based on one or more columns. These partitions define the scope within which the window function operates. Rows with the same values in the specified partitioning columns belong to the same window.
2.Ordering: Within each partition, you can define an order based on one or more columns. The ordering determines how the rows within the partition are arranged. It is essential for functions like ranking, cumulative sums, and calculating percentiles.
3.Window Frame: The window frame defines the range of rows relative to the current row that the window function operates on. You can specify the frame using keywords like ROWS BETWEEN
, RANGE BETWEEN
, or UNBOUNDED PRECEDING
to set the boundaries.
4.Window Function Syntax: Window functions are used in the SELECT
clause and have a distinct syntax. They are followed by an OVER
clause that specifies the partitioning, ordering, and frame clauses. Here's a basic syntax example:
SELECT
column1,
column2,
window_function(column3) OVER (
PARTITION BY partition_column
ORDER BY order_column
frame_clause
) AS result_column
FROM table_name;
Common window functions in PostgreSQL include:
1.ROW_NUMBER() Function:
This function assigns a unique integer to each row within a partition based on the specified order. It does not leave gaps, and consecutive rows receive consecutive integers.
Let's assume we have an "employees" table with the following data:
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
employee_name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
INSERT INTO employees (employee_name, department, salary)
VALUES
('Alice', 'HR', 50000),
('Bob', 'HR', 52000),
('Charlie', 'Finance', 60000),
('David', 'Finance', 55000),
('Eve', 'IT', 65000),
('Frank', 'IT', 62000);
ROW_NUMBER() Function
SELECT
department,
employee_name,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;
Expected Output:
department | employee_name | salary | row_num
-----------+---------------+--------+--------
Finance | Charlie | 60000 | 1
Finance | David | 55000 | 2
HR | Bob | 52000 | 1
HR | Alice | 50000 | 2
IT | Eve | 65000 | 1
IT | Frank | 62000 | 2
Partitioning: The result set is partitioned by the department column, creating separate partitions for each department.
Ordering: Within each department partition, rows are ordered by salary in descending order.
RANK() Function:
RANK()
assigns a rank to rows within a partition, and it leaves gaps in case of ties. If multiple rows have the same values, they receive the same rank, and the next rank is skipped.
Example:
SELECT
department,
employee_name,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;
Expected Output:
-- If there is another employee in the Finance department who receives a salary of 55000, their rank will also be 2 but the next one will be 4 instead of 3.
department | employee_name | salary | rank
-----------+---------------+--------+------
Finance | Charlie | 60000 | 1
Finance | David | 55000 | 2
Finance | Mark | 55000 | 2
Finance | John | 45000 | 4
HR | Bob | 52000 | 1
HR | Alice | 50000 | 2
IT | Eve | 65000 | 1
IT | Frank | 62000 | 2
3.DENSE_RANK() Function:
DENSE_RANK()
is similar to RANK()
, but it does not leave gaps in case of ties. If multiple rows have the same values, they receive the same rank, and the next rank is not skipped.
Example:
SELECT
department,
employee_name,
salary,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank
FROM employees;
Expected Output:
-- If there is another employee in the Finance department who receives a salary of 55000, their rank will also be 2 and the next one will be 3.
department | employee_name | salary | dense_rank
-----------+---------------+--------+------------
Finance | Charlie | 60000 | 1
Finance | David | 55000 | 2
Finance | Mark | 55000 | 2
Finance | John | 45000 | 3
HR | Bob | 52000 | 1
HR | Alice | 50000 | 2
IT | Eve | 65000 | 1
IT | Frank | 62000 | 2
--If there is another employee in the Finance department who receives a salary of 55000, their rank will be 3
4.LEAD() Functions:
These function allow you to access values from the next row within the partition, respecting the specified order.
Example:
SELECT
department,
employee_name,
salary,
LEAD(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS next_highest_salary
FROM employees;
Expected Output:
department | employee_name | salary | next_highest_salary
-----------+---------------+--------+---------------------
Finance | Charlie | 60000 | 55000
Finance | David | 55000 |
HR | Bob | 52000 | 50000
HR | Alice | 50000 |
IT | Eve | 65000 | 62000
IT | Frank | 62000 |
Top comments (0)