How to use SQL Pivot function
Transforming rows into columns using SQL Pivot
SQL Pivot is a powerful operation that allows you to transform rows of data into columns, making it easier to analyze and present data in a more readable format. SQL Pivot helps analysts by improving readability, aggregation, simplifying reporting and data analysis, reducing query complexity and streamlining ETL processes. This transformation is especially useful when you want to summarize and present your data in a more readable and structured way.
Getting started with SQL Pivot
Let us review the general methodology and syntax for defining a SQL pivot statement. Each relational database system implements the syntax slightly differently. To show examples of pivot operation, we will use the
customers table below with the following columns:
Subscription_plan: A varchar field which represents the type of subscription.
Subscribed_customers: an integer field which represents the number of subscribed customers
Date: a date field which represents the subscription date
Pivot in SQL Server
In SQL Server, the pivot query has:
selectstatement can reference the pivoted fields.
Inner querydefines the source data for the pivot statement.
pivotclause applies an aggregation and a filter for statements.
- Both the inner query and the pivoted data are aliased.
SELECT FROM (INNER QUERY) as INNER QUERY PIVOT ( AGGREGATE(FIELD) FOR FIELD IN([FIELD1],[FIELD2],[FIELD3]) ) AS PIVOTED DATA
This section outlines the syntax to pivot the
customers table data with SQL Server:
SELECT Subscription_plan, [2023-06-01], [2023-06-02], [2023-06-03], [2023-06-04], [2023-06-05], [2023-06-06], [2023-06-07] FROM (SELECT Subscription_plan,[Subscribed_customers], DATE FROM customers) AS IQ PIVOT ( SUM([Subscribed_customers]) FOR DATE IN( [2023-06-01],[2023-06-02],[2023-06-03],[2023-06-04], [2023-06-05],[2023-06-06],[2023-06-07] ) ) AS PT
Data after the pivot:
In this example:
- The 21 table rows reduce to 3 rows, one for each
- The date fields previously in the date column are pivoted to the column headers.
- The integer values are aggregated and show the number of subscribed customers per day.
The pivoted data is much simpler to consume and easier to look at.
Pivot in MySQL
In MySQL, the pivot operation is different from the SQL server:
- Since there is no supported pivot operator, native
case whenand aggregation syntax is used.
- The aggregation is conditionally applied causing the fields to pivot.
- An outer
group byexpression must be applied for columns outside the aggregation.
SELECT FIELD1, SUM(CASE WHEN CRITERIA THEN completed ELSE 0 END) AS FIELDNAME, GROUP BY FIELD1
This section outlines the native syntax to pivot the
customers table data with MySQL.
MySQL does not have a pivot operator; however, you can still show the results as in a pivot table with a native SQL query. To do this, you need to use a
select statement in combination with
case when and an aggregation operator. In this example, the aggregation operation is the
sum operator and the dates are parsed with the
STR_TO_DATE function with the appropriate parsing string:
SELECT Subscription_plan, SUM(CASE WHEN Date = STR_TO_DATE('06/01/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END ) AS '06/01/2023', SUM(CASE WHEN Date = STR_TO_DATE('06/02/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END ) AS '06/02/2023', SUM(CASE WHEN Date = STR_TO_DATE('06/03/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END ) AS '06/03/2023', SUM(CASE WHEN Date = STR_TO_DATE('06/04/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END ) AS '06/04/2023', SUM(CASE WHEN Date = STR_TO_DATE('06/05/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END ) AS '06/05/2023', SUM(CASE WHEN Date = STR_TO_DATE('06/06/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END ) AS '06/06/2023', SUM(CASE WHEN Date = STR_TO_DATE('06/07/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END ) AS '06/07/2023' FROM customers GROUP BY Subscription_plan;
The results from this query are equivalent to the Pivot syntax from the SQL Server example.
Pivot in PostgreSQL
In PostgreSQL, here’s how the pivot tables are created:
crosstab operatoris used to pivot data.
- Similar to the SQL Server pivot implementation, you specify an
- You specify the data types and field names in the second set of parentheses.
SELECT * FROM crosstab('INNER QUERY') AS ct ( FIELD1 AS varchar(50) );
This section outlines the syntax to pivot the
customers table data with PostgreSQL.
In PostgreSQL, to get a pivoted output, you use the
Crosstab will produce a pivoted dataset with a slightly different syntax than the previous pivot statements.
To enable the pivot table in PostgreSQL, you must create the
tablefunc module in your database:
CREATE EXTENSION IF NOT EXISTS tablefunc;
SELECT * FROM crosstab('select Subscription_plan, Date, Subscribed_customers from customers ORDER BY 1,2') AS ct ( Subscription_plan varchar(50), "06/01/2023" int, "06/02/2023" int, "06/03/2023" int, "06/04/2023" int, "06/05/2023" int, "06/06/2023" int, "06/07/2023" int );
Note: Column names and data types must be specified for the cross tab alias. In this case, each date is represented as a string value of the date such as “06/01/2023” with an integer data type.
Pivot in Snowflake
The Snowflake syntax follows a similar structure to the SQL Server implementation:
- An inner query is defined.
- Pivoted fields are explicitly defined in the
SELECT … FROM BASIC PIVOT(AGGREGATE(FIELD) FOR FIELD IN ([FIELD1],[FIELD2],[FIELD3])) AS p
Pivot operator is available in Snowflake, and has a syntax similar to SQL Server. Note the tick and double quotes around the top query that are used to select and rename the column headers:
SELECT Subscription_plan, "'06/01/2023'" AS "06/01/2023", "'06/02/2023'" AS "06/02/2023", "'06/03/2023'" AS "06/03/2023", "'06/04/2023'" AS "06/04/2023", "'06/05/2023'" AS "06/05/2023", "'06/06/2023'" AS "06/06/2023", "'06/07/2023'" AS "06/07/2023" FROM customers PIVOT(SUM(Subscribed_customers) FOR DATE IN ('06/01/2023', '06/02/2023', '06/03/2023', '06/04/2023', '06/05/2023', '06/06/2023', '06/07/2023')) AS p
Here’s the result table:
You can also rename your columns easily with the Snowflake
SELECT * FROM customers PIVOT(SUM(Subscribed_customers) FOR DATE IN ('06/01/2023', '06/02/2023', '06/03/2023', '06/04/2023', '06/05/2023', '06/06/2023', '06/07/2023')) AS p (Subscription_plan, Date_1, Date_2, Date_3, Date_4, Date_5, Date_6, Date_7)
Advanced SQL Pivoting techniques: Dynamic pivot statements
This section outlines advanced syntax to pivot the
customers table data with SQL Server stored procedures and dynamic SQL.
Up to this point, we have used static queries to generate our pivot table statements. This is limiting if you are pivoting large amounts of data. In our
customers table, we only have seven separate dates, so the cardinality is low. But imagine if you had hundreds or thousands of dates. That would make the Pivot syntax up to this point unmanageable. In some relational systems, you can create dynamic SQL to help automate the pivot syntax. When using dynamic SQL, you do not need to specify each column in your Pivot statement but instead allow the SQL engine to generate the syntax.
To facilitate the dynamic Pivot operation in SQL Server, we will use a stored procedure. A stored procedure is a database object that can be used to execute procedural SQL code. Stored procedures are a great way to store calculations and complex reporting definitions. In this example, we will use a stored procedure to pass various parameters to make a dynamic pivot statement.
CREATE PROC SPX_SUM_PIVOT_TABLE( @INNERCOLUMNS VARCHAR(MAX), @TABLE VARCHAR(MAX), @SUMCOLUMN VARCHAR(MAX), @FORCOLUMN VARCHAR(MAX), @INCOLUMNS VARCHAR(MAX) ) AS BEGIN DECLARE @SQL VARCHAR(MAX) SET @SQL = 'SELECT * FROM (' + 'SELECT ' + @INNERCOLUMNS + ' FROM ' + @TABLE + ' ) AS T ' + ' PIVOT ( SUM(' + @SUMCOLUMN + ')' + ' FOR [' + @FORCOLUMN + ']' + ' IN (' + @INCOLUMNS + ')) AS PT' PRINT @SQL EXEC(@SQL) END
In this stored procedure:
- All parameters are
@INNERCOLUMNSparameter holds columns which are selected in the
@TABLEparameter holds the table name from the source table, in this case
@SUMCOLUMNholds the column which will be aggregated when the stored procedure executes.
@FORCOLUMNholds the column that will be pivoted and translated to header rows.
@INCOLUMNSholds the inner filter logic and holds a square block comma separated list of the columns which will be filtered for the pivot statement.
All of these parameters can now be passed to the stored procedure. When passed the values, the stored procedure will:
- Print the dynamic SQL statement. You can see this in the console of the workbench application you are using. This is considered a best practice because dynamic SQL is tricky to debug.
- Execute the dynamic SQL statement and return the same results as the previous SQL server pivot example.
Execute the stored procedure with the following syntax:
EXEC SPX_SUM_PIVOT_TABLE '[Subscription_plan],[Subscribed_customers],[DATE]', customers, 'Subscribed_customers', 'DATE', '[2023-06-01],[2023-06-02],[2023-06-03],[2023-06-04],[2023-06-05],[2023-06-06],[2023-06-07]'
Print SQL results:
SELECT* FROM (SELECT [Subscription_plan],[Subscribed_customers],[DATE] FROM customers) AS T PIVOT(SUM(Subscribed_customers) FOR (Date) IN ([2023-06-01],[2023-06-02],[2023-06-03],[2023-06-04],[2023-06-05],[2023-06-06],[2023-06-07]) (3 rows affected) Completion time: 2023-06-30T14:35:52.3489587-04:00
Execute SQL results:
The advantages of using a stored procedure for pivot can be summarized in the following:
- The code is dynamic and, therefore, can be reused on similar pivot use cases.
- You can further abstract the usage to make the parameters of the stored procedures dynamic. For instance, building a string of dates for the
- You can create the stored procedure in the system model database and it will be automatically created in all subsequent databases.
Fixing SQL Pivot errors
Errors in SQL Pivot typically occur due to issues with the query syntax, the data being pivoted, or the aggregation functions used. To debug SQL Pivot errors effectively, break down your query step by step, review the error messages provided by your database system, and use tools like
SELECT statements to inspect intermediate results. Validate your data, ensure consistent column aliases, and double-check your Pivot clause and aggregation functions. If you continue to encounter errors, consult your database system's documentation or seek assistance from a database administrator or developer. Let us look at some common error statements and how to fix them.
These errors occur when there is a problem with the SQL syntax used in your Pivot query. Examples include missing commas, parentheses, or incorrect use of keywords.
Incorrect syntax near 'PIVOT'. Check your SQL syntax.
Carefully review your SQL syntax to ensure it follows the correct format for your database system. Pay attention to commas, parentheses, and aliases.
Invalid column name
Errors can occur if the column names or values specified in the FOR clause of your Pivot query do not match the actual values in your data.
Invalid column name 'Month'. Ensure the column name is correct.
Double-check that the values in the FOR clause match the unique values you expect to pivot. Verify that there are no typos or inconsistencies.
Missing aggregation function
If you forget to include an aggregation function when using aggregation (e.g., SUM, COUNT) in your Pivot query, you may encounter an error.
Missing aggregation function in PIVOT clause. Specify an aggregation function.
Ensure that you include the appropriate aggregation function within the Pivot clause to specify how to aggregate data.
Invalid Join condition
If your Pivot operation involves multiple tables, missing or incorrect join conditions can lead to errors.
Invalid join condition. Check your table joins and relationships.
Verify that your join conditions are correctly specified, and the tables are properly joined to provide the required data for the Pivot operation.
Invalid Group By clause
These are the errors related to issues with grouping or filtering of data within the Pivot clause.
Invalid GROUP BY clause in PIVOT query. Review your grouping conditions.
Confirm that the grouping and filtering conditions in your Pivot clause are correctly applied. Review your
GROUP BY clause if necessary.
Pivoting for simplified data analysis
In this tutorial, we covered how to use Pivot in SQL, various use cases, and syntax behind this operator. We also covered various situations–mainly centered around reporting as the motivation–with code examples for using the Pivot operator. We covered several different databases (SQL Server, MySQL, PostgreSQL, and Snowflake), and the implementations in each that allow you to pivot data.
Recall that SQL Server implements Pivot operators. In MySQL, neither the crosstab operator nor the pivot table operator is offered but you can still achieve a pivot with native SQL syntax. PostgreSQL does not offer a Pivot operator but instead has a crosstab operator. Similar to SQL Server, Snowflake Pivot operator to facilitate pivoting operations.
Finally, we covered an advanced scenario in SQL server. We used dynamic SQL and a stored procedure that accepted various parameters that allowed us to pivot data dynamically.
The pivot operation is a powerful tool; you can always perform a pivot operation with native SQL, like in the MySQL example above. Knowing how to pivot data can help with generating reports and making massive amounts of data more digestible.
Get ready for analysis
Once you pivot your data and represent it is a more understandable way, it is time to try some analysis. As a next step to this tutorial, feel free to explore the tutorial How to Calculate Cumulative Sum-Running Total in SQL Server.
What is the difference between PIVOT and UNPIVOT in SQL?
PIVOT is used to transform rows into columns, while
UNPIVOT is used to transform columns into rows, effectively reversing the pivot operation.
In which database systems can I use SQL Pivot?
SQL Pivot is supported in various database systems, including SQL Server, Oracle, Snowflake, and others, but the syntax and implementation may vary between systems.
Can I pivot data dynamically in SQL?
Some database systems support dynamic pivoting, where you can pivot data based on values in the source table, allowing for flexibility in column creation.
What should I do if I encounter errors while using SQL Pivot?
When encountering errors, carefully review your SQL syntax, ensure column names and values in the FOR clause are correct, and check for data type mismatches. Debugging tools and reviewing error messages can help pinpoint issues.
Can I pivot data in all SQL database systems?
While SQL Pivot is not a standard SQL operation, many database systems support pivoting through custom queries and constructs. However, the specific syntax and features may vary between systems.
How do I handle NULL values when pivoting data?
You can use the COALESCE function or other conditional logic to handle
NULL values and provide default values or calculations when necessary during the pivot operation.
Are there any performance considerations when using SQL Pivot?
Pivoting large datasets or performing complex pivot operations may impact query performance. Ensure your database is properly indexed, and consider optimizing your queries for efficiency.