In this article, we will take a look at the Pivot function in GBase 8s, which transforms row data into column data. Whether for reporting or data analysis, the Pivot function can help us view data from different perspectives.
Overview of the Pivot Function in GBase 8s
Pivot, or row-to-column transformation, is a process of converting row data in a table into column data. By performing a Pivot operation, we can transform one or more columns of data into multiple columns, allowing for better horizontal statistics and analysis.
Unpivot, or column-to-row transformation, is the opposite of Pivot and converts column data into row data, which is better suited for vertical statistics and analysis.
Pivot and Unpivot are two very useful functions in data transformation. They can alter the presentation of data as needed, making it easier to understand and analyze. These functions are widely used in database queries, data analysis, and report generation. We will introduce the Unpivot function in the next article.
Detailed Syntax of GBase 8s Pivot
The process of Pivot transformation: multiple rows are aggregated and then transformed into columns, with each column representing a different range of aggregated data. The syntax is as follows:
SELECT ...
FROM ...
PIVOT
(
pivot_clause --- The value of the column after the row-to-column transformation, an aggregate value that supports multiple columns.
pivot_for_clause --- The field that needs to be transformed into columns, turning one column into multiple columns.
pivot_in_clause --- A filter for the pivot_for_clause field, specifying the values to be transformed into columns.
)
WHERE ...
Pivot follows directly after the FROM
clause and is placed between FROM
and WHERE
.
In simple terms, the syntax is as follows:
PIVOT (aggregate_function1 AS alias1, aggregate_function2 AS alias2 ...)
FOR column_to_transform
IN ((value1_for_column1, value2_for_column2) AS alias1, (value1_for_column2, value2_for_column2) AS alias2...)
Example of GBase 8s Pivot
Note: The following examples use the GBase8sV8.8_TL_3.5.1_x86_64 database version.
Here are some simple demonstrations of Pivot syntax using the emp
table. Before the transformation, let’s take a look at the base data:
create table emp(empno int,ename varchar(10),job varchar(9),sal1 DECIMAL(10,2),sal2 DECIMAL(10,2), deptno int);
-- Insert 10 rows of data:
INSERT INTO emp VALUES (1, 'smith', 'clerk', 800.00,910,10);
INSERT INTO emp VALUES (2, 'allen', 'salesman', 1600.00,5491,30);
INSERT INTO emp VALUES (3, 'ward', 'salesman', 1250.00,3429,30);
INSERT INTO emp VALUES (4, 'jones', 'manager', 2975.00,1298,20);
INSERT INTO emp VALUES (5, 'martin', 'salesman',1250.00,1256.34,20);
INSERT INTO emp VALUES (6, 'blake', 'manager', 2850.00,1234.34,20);
INSERT INTO emp VALUES (7, 'clark', 'manager', 2450.00,1000,10);
INSERT INTO emp VALUES (8, 'scott', 'clerk', 3000.00,2000,20);
INSERT INTO emp VALUES (9, 'king', 'president',5000.00,4500,10);
INSERT INTO emp VALUES (10, 'turner', 'salesman', 1500.00,1000,10);
> select job,deptno,sum(sal1) as sum_sal1 from emp group by job,deptno order by job,deptno;
job deptno sum_sal1
clerk 10 800.00
clerk 20 3000.00
manager 10 2450.00
manager 20 5825.00
president 10 5000.00
salesman 10 1500.00
salesman 20 1250.00
salesman 30 2850.00
8 row(s) retrieved.
ORDER BY job, deptno;
Case 1: Single Column Transformation
Here, we summarize sal1
by deptno
and job
, then convert the department numbers into their respective columns:
Aggregated column: sal1
Column to transform: deptno
Values to filter for transformation: 10, 20, 30, 40
select * from (SELECT deptno,job, sal1 FROM emp)a
PIVOT
(
SUM(sal1) --pivot_clause
FOR deptno --pivot_for_clause
IN (10, 20,30,40) --pivot_in_cluase
) ;
Result:
job 10 20 30 40
president 5000.00
manager 2450.00 5825.00
salesman 1500.00 1250.00 2850.00
clerk 800.00 3000.00
4 row(s) retrieved.
Points to Note About This Example:
The pivot operation actually performs an implicit GROUP BY
using the columns not appearing in the pivot_clause
(in this case, job
and deptno
). Most transformation queries are executed on specific subsets of columns. As with all aggregate queries, the presence of additional columns affects the grouping. In this example, all columns except sal1
become the grouping set, with deptno
being the transformed column. If other columns from the base data, such as ename
, are included, they would affect the grouping. Therefore, you can use subqueries to define the set of base columns, as in the example where SELECT deptno, job, sal1 FROM emp
defines the base set of columns.
Case 2: Effect of Involving All Columns
SELECT * FROM emp
PIVOT (
SUM(sal1)
FOR deptno
IN (10, 20, 30)
);
Result:
empno 2
ename allen
job salesman
sal2 5491.00
10
20
30 1600.00
empno 7
ename clark
job manager
sal2 1000.00
10 2450.00
20
30
empno 3
ename ward
job salesman
sal2 3429.00
10
20
30 1250.00
empno 8
ename scott
job clerk
sal2 2000.00
10
20 3000.00
30
empno 1
ename smith
job clerk
sal2 910.00
10 800.00
20
30
empno 9
ename king
job president
sal2 4500.00
10 5000.00
20
30
empno 5
ename martin
job salesman
sal2 1256.34
10
20 1250.00
30
empno 4
ename jones
job manager
sal2 1298.00
10
20 2975.00
30
empno 10
ename turner
job salesman
sal2 1000.00
10 1500.00
20
30
empno 6
ename blake
job manager
sal2 1234.34
10
20 2850.00
30
10 row(s) retrieved.
In this example, all columns except sal1
become the grouping set, with deptno
being the transformed column. This transformation doesn't make much sense because it includes all columns in the grouping, leading to results that may not be meaningful.
Case 3: Using WITH AS
Below is an example using a WITH AS
subquery, achieving the same result as the above inline view.
WITH pivot_data AS (SELECT deptno, job, sal1 FROM emp)
SELECT * FROM PIVOT_data
PIVOT
(
SUM(sal1)
FOR deptno
IN (10, 20, 30,40)
);
Result:
job 10 20 30 40
president 5000.00
manager 2450.00 5825.00
salesman 1500.00 1250.00 2850.00
clerk 800.00 3000.00
4 row(s) retrieved.
Case 4: Multiple Column Transformation
select * from (SELECT deptno,job, sal1,sal2 FROM emp)a
PIVOT
(
SUM(sal1) AS sum_sal1,
SUM(sal2) AS sum_sal2
FOR deptno
IN (10,20)
);
Result:
job 10_sum_sal1 20_sum_sal1 10_sum_sal2 20_sum_sal2
president 5000.00 4500.00
manager 2450.00 5825.00 1000.00 2532.34
salesman 1500.00 1250.00 1000.00 1256.34
clerk 800.00 3000.00 910.00 2000.00
4 row(s) retrieved.
In the above example, two aggregates are defined in the pivot_clause
, resulting in double the number of transformed columns (since the number of aggregates is doubled).
-
Number of transformed columns = Number of aggregate columns × Number of values in
pivot_in_clause
(2×2=4). - The number of transformed columns cannot exceed 1024.
Case 5: Multiple Columns and Multiple Values Transformation
SELECT * FROM (SELECT deptno,job, sal1,sal2 FROM emp)a
PIVOT
(
SUM(sal1) AS sum_sal1,
SUM(sal2) AS sum_sal2
FOR (deptno, job)
IN ((10, 'clerk') as a,(20, 'manager') as b)
);
Result:
a_sum_sal1 b_sum_sal1 a_sum_sal2 b_sum_sal2
800.00 5825.00 910.00 2532.34
1 row(s) retrieved.
Notes on GBase 8s Pivot Transformation Queries
- The column used in
pivot_for_clause
(e.g.,deptno
in the examples) cannot appear in the SELECT projection list. The columns inpivot_for_clause
are grouped based on the range of values provided inpivot_in_clause
. Attempting to query thedeptno
column will result in an error becausedeptno
is completely removed from the projection.
Example:
> SELECT deptno
FROM emp
PIVOT
(
SUM(sal1)
FOR deptno
IN(10, 20, 30, 40)
);
217: Column (deptno) not found in any table in the query (or SLV is undefined).
Error in line 2
Near character position 0
Similarly, you cannot query any columns that appear in the
pivot_clause
. For example, attempting to query thesal1
column will also result in an error.Columns in the
pivot_clause
must use aggregate functions. Not using an aggregate function will cause an error.
Example:
SELECT * FROM emp PIVOT(sal1 FOR deptno IN (10, 20, 30, 40));
201: A syntax error has occurred.
- In the
pivot_in_clause
, if an alias is specified, use the alias; if not, use the value as the column name. In thepivot_clause
, if an alias is specified, it is appended to the corresponding transformed column name frompivot_in_clause
; if not, it is omitted. When there are multiple aggregate functions in thepivot_clause
, GBase 8s allows you not to set aliases; it will automatically append_1
,_2
, etc., in sequence. Aliases do not support single quotes.
Example:
> SELECT * FROM (SELECT deptno,job, sal1,sal2 FROM emp)a
PIVOT(SUM(sal1),SUM(sal2) FOR (deptno, job) IN ((10, 'clerk'),(20, 'manager')));
Result:
10_clerk 20_manager 10_clerk_1 20_manager_1
800.00 5825.00 910.00 2532.34
1 row(s) retrieved.
Through this introduction, you should now have a comprehensive understanding of the Pivot function in GBase 8s. It's not just a simple data transformation tool but a powerful asset for data analysis. In the next article, we will discuss the Unpivot function. Thank you for reading!
Top comments (0)