DEV Community

Cover image for How to Transform SQL Queries to Crosstabs in PostgreSQL
Kagunda JM
Kagunda JM

Posted on • Originally published at dev.to

How to Transform SQL Queries to Crosstabs in PostgreSQL

A SELECT SQL query retrieves data from a database in a tabular form. The first row, or header row, has the column names, and all the other rows have the data that was retrieved.

Listing of salesmen monthly sales data

In a crosstab, the data is condensed and the names of one or more columns are rotated. You can add row and column totals to a crosstab. For example, a list of salesmen's monthly sales may include the months in the column headers. By rotating the data, it is easier to read and understand how the facts relate to one another.

Sample crosstab

Other names you might come across for crosstabs are matrix reports, pivot for SQL server databases, and pivot tables on spreadsheets like Microsoft Excel, Google Sheets, and LibreOffice.

The following topics are covered in this post:

This post uses queries from a database populated with data from the Northwind database for Postgres. The queries are run using DBeaver SQL client.

Crosstabs Using CASE Conditional Expression

The SQL CASE expression enables you to choose a value depending on a condition, much like an if-then-else conditional statement. The syntax for the CASE expression is as follows:

CASE WHEN condition THEN result
     [WHEN ...]
     [ELSE result]
END
Enter fullscreen mode Exit fullscreen mode

If a condition evaluates to true, the result value will be chosen, otherwise; the result value contained in the optional ELSE expression will be chosen.

In the example below, the SQL CASE statement is used to make a crosstab of monthly total sales by employees for the first four months of 1997.

If you run the query above, you will get a crosstab like the one below.

employees monthly total sales crosstab during the first four months of 1997

You can combine the previous query with a UNION ALL operator and the following SQL SELECT query to include the total for each month in the cross-tab.

employees monthly total sales crosstab during the first four months 1997 including month totals using SQL CASE expression

The SQL CASE expression is supported by most database systems.

Using Common Table Expressions (CTE) For A Crosstab

Common Table Expressions (CTE) in SQL let you create a temporary table in a SELECT SQL statement. It is then possible to reference this named temporary table in subsequent SELECT, INSERT, UPDATE, or DELETE SQL statements. The CTEs simplify the SQL queries by breaking them down into smaller, more manageable parts that are easier to read and understand. In PostgreSQL, CTEs are called WITH queries.

You can rewrite the previous CASE crosstab query using a CTE as follows:

Crosstabs Using Aggregate FILTER Clause

From PostgreSQL 9.4, you can use the FILTER clause to perform aggregate functions on specific records. The FILTER clause is less wordy and has a cleaner syntax than the CASE statement.

The following SQL query uses the FILTER clause to generate a crosstab.

The SQL COALESCE function replaces any null values in the crosstab with zero (0) values.

employees monthly total sales crosstab during the first four months of 1997 including month totals using aggregate FILTER clause

Using PostgreSQL crosstab() Function

The crosstab() function is part of the optional tablefunc module.

You can run the SELECT COUNT(*) FROM pg_extension WHERE extname='tablefunc'; query to see if the tablefunc extension is installed on the database you are using. If the result of the query is 0, install and activate the tablefunc⁣ extension using the CREATE EXTENSION IF NOT EXIST tablefunc;⁣ SQL command. The tablefunc⁣ module can be installed by non-superusers who have the CREATE⁣ privilege.

The crosstab function has several options, crosstab(sql text), crosstabN(sql text), and crosstab(source_sql text, category_sql text). Examples in this post use the crosstab(source_sql text, category_sql text) function option.

The first parameter (source_sql text) in a crosstab(source_sql text, category_sql text) function is the source SQL SELECT query statement and must return at least three (3) columns of data to pivot or rotate. The first column (row_name) contains data values to be used as row identifiers in the final result; data in the second column (category) represents category values that will be rotated to column headers in the pivot table, and the third column (value) contains data to be assigned to each cell of the final crosstab. The second parameter text category_sql is a query returning a category list for the columns.

The crosstab function returns a set of records with unknown data types. Therefore, you must alias the returned columns with column names and types using the AS (col1 type, col2 type, ...) clause. Failing to alias the columns will cause a column definition list is required for functions returning "record" error.

The SQL query below shows how to use the crosstab(source_sql text, category_sql text) function. A dollar-quoted string constant ($$) has been used to maintain formatting consistency and remove the need to escape single quotes (') by doubling them.

employees monthly total sales crosstab during the first four months of 1997 using the crosstab() function

Using A PostgreSQL crosstab() Function With More Than Three Columns

If your source_sql text SQL query returns more than three columns, the additional or extra columns must be placed between the row_name and category columns. In the previous crosstab SQL query, any extra or additional columns must be placed between the salesman and order_month columns.

PostgreSQL converts identifiers/column names to lowercase by default. To capitalize the crosstab column headers, enclose them within double quotes.

employees monthly total sales crosstab during the first four months of 1997 using crosstab() function with more than 3 columns

Using ARRAY Data Type To Re-Arrange Extra Crosstab Columns

A crosstab with more than three columns has limitations in the order placement of the additional columns in the final crosstab. However, you can re-arrange the crosstab columns by inserting the extra columns into an arrays data type.

employees monthly total sales crosstab during the first four months 1997 using crosstab() function with re-arranged columns

Conclusion

Crosstabs are a powerful way to summarize and analyze data from a database. By presenting data in a condensed and organized format, crosstabs make it easier to analyze relationships between different variables.

The CASE conditional expression, aggregate FILTER clause, and the PostgreSQL crosstab() function are some of the methods for creating crosstabs. If the crosstab() function uses more than three columns to summarize data, the ARRAY data type may be used to re-arrange the extra columns into the correct positions. Using crosstabs, PostgreSQL users can create reports and dashboards that help them make informed business decisions.

Resources

  1. CrossTab Queries in PostgreSQL using tablefunc contrib
  2. Static and dynamic pivots

Top comments (0)