PopSQL

How to use SQL Pivot function

Transforming rows into columns using SQL Pivot

SQL Pivot is a powerful operation that allows you to transform rows of data into columns, making it easier to analyze and present data in a more readable format. SQL Pivot helps analysts by improving readability, aggregation, simplifying reporting and data analysis, reducing query complexity and streamlining ETL processes. This transformation is especially useful when you want to summarize and present your data in a more readable and structured way.

Getting started with SQL Pivot

Let us review the general methodology and syntax for defining a SQL pivot statement. Each relational database system implements the syntax slightly differently. To show examples of pivot operation, we will use thecustomers table below with the following columns:

  • Subscription_plan: A varchar field which represents the type of subscription.
  • Subscribed_customers: an integer field which represents the number of subscribed customers
  • Date: a date field which represents the subscription date
Subscription_plan Subscribed_customers Date
1 Premium 2 2023-06-01
2 Business 1 2023-06-01
3 Enterprise 0 2023-06-01
4 Premium 1 2023-06-02
5 Business 3 2023-06-02
6 Enterprise 1 2023-06-02
7 Premium 2 2023-06-03
8 Business 0 2023-06-03
9 Enterprise 0 2023-06-03
10 Premium 1 2023-06-04
11 Business 0 2023-06-04
12 Enterprise 1 2023-06-04
13 Premium 2 2023-06-05
14 Business 1 2023-06-05
15 Enterprise 1 2023-06-05
16 Premium 3 2023-06-06
17 Business 2 2023-06-06
18 Enterprise 0 2023-06-06
19 Premium 1 2023-06-07
20 Business 2 2023-06-07
21 Enterprise 0 2023-06-07

Pivot in SQL Server

In SQL Server, the pivot query has:

  • The select statement can reference the pivoted fields.
  • Inner query defines the source data for the pivot statement.
  • The pivot clause applies an aggregation and a filter for statements.
  • Both the inner query and the pivoted data are aliased.
SELECT
FROM (INNER QUERY) as INNER QUERY
PIVOT
(
  AGGREGATE(FIELD)
  FOR FIELD IN([FIELD1],[FIELD2],[FIELD3])
) AS PIVOTED DATA

This section outlines the syntax to pivot the customers table data with SQL Server:

  SELECT Subscription_plan,
  [2023-06-01],
  [2023-06-02],
  [2023-06-03],
  [2023-06-04],
  [2023-06-05],
  [2023-06-06],
  [2023-06-07]

  FROM
  (SELECT Subscription_plan,[Subscribed_customers], DATE FROM customers) AS IQ
  PIVOT
  (
  SUM([Subscribed_customers]) 
  FOR DATE 
    IN(
      [2023-06-01],[2023-06-02],[2023-06-03],[2023-06-04],
      [2023-06-05],[2023-06-06],[2023-06-07]
    )
  ) AS PT

Data after the pivot:

Subscription_plan 2023-06-01 2023-06-02 2023-06-03 2023-06-04 2023-06-05 2023-06-06 2023-06-07
Business 1 3 0 0 1 2 2
Enterprise 0 1 0 1 1 0 0
Premium 2 1 2 1 2 3 1

In this example:

  • The 21 table rows reduce to 3 rows, one for each Subscription_plan .
  • The date fields previously in the date column are pivoted to the column headers.
  • The integer values are aggregated and show the number of subscribed customers per day.

The pivoted data is much simpler to consume and easier to look at.

Pivot in MySQL

In MySQL, the pivot operation is different from the SQL server:

  • Since there is no supported pivot operator, native case when and aggregation syntax is used.
  • The aggregation is conditionally applied causing the fields to pivot.
  • An outer group by expression must be applied for columns outside the aggregation.
SELECT
FIELD1,
SUM(CASE WHEN CRITERIA THEN completed ELSE 0 END) AS FIELDNAME,
GROUP BY FIELD1

This section outlines the native syntax to pivot the customers table data with MySQL.

MySQL does not have a pivot operator; however, you can still show the results as in a pivot table with a native SQL query. To do this, you need to use aselect statement in combination with case when and an aggregation operator. In this example, the aggregation operation is the sum operator and the dates are parsed with theSTR_TO_DATE function with the appropriate parsing string:

SELECT
Subscription_plan,
SUM(CASE
  WHEN Date = STR_TO_DATE('06/01/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END
) AS '06/01/2023',
SUM(CASE
  WHEN Date = STR_TO_DATE('06/02/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END
) AS '06/02/2023',
SUM(CASE
  WHEN Date = STR_TO_DATE('06/03/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END
) AS '06/03/2023',
SUM(CASE
  WHEN Date = STR_TO_DATE('06/04/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END
) AS '06/04/2023',
SUM(CASE
  WHEN Date = STR_TO_DATE('06/05/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END
) AS '06/05/2023',
SUM(CASE
  WHEN Date = STR_TO_DATE('06/06/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END
) AS '06/06/2023',
SUM(CASE
  WHEN Date = STR_TO_DATE('06/07/2023','%m/%d/%Y') THEN Subscribed_customers ELSE 0 END
) AS '06/07/2023'
FROM customers
GROUP BY Subscription_plan;

The results from this query are equivalent to the Pivot syntax from the SQL Server example.

Pivot in PostgreSQL

In PostgreSQL, here’s how the pivot tables are created:

  • The crosstab operator is used to pivot data.
  • Similar to the SQL Server pivot implementation, you specify an inner query.
  • You specify the data types and field names in the second set of parentheses.
SELECT * 
FROM crosstab('INNER QUERY') AS ct (
  FIELD1 AS varchar(50)
);

This section outlines the syntax to pivot the customers table data with PostgreSQL.

In PostgreSQL, to get a pivoted output, you use thecrosstab operator.Crosstab will produce a pivoted dataset with a slightly different syntax than the previous pivot statements.

To enable the pivot table in PostgreSQL, you must create the tablefunc module in your database:

CREATE EXTENSION IF NOT EXISTS tablefunc;
SELECT * 
FROM crosstab('select Subscription_plan, Date, Subscribed_customers from customers ORDER BY 1,2') AS ct (
Subscription_plan varchar(50),
"06/01/2023" int,
"06/02/2023" int,
"06/03/2023" int,
"06/04/2023" int,
"06/05/2023" int,
"06/06/2023" int,
"06/07/2023" int
);

Note: Column names and data types must be specified for the cross tab alias. In this case, each date is represented as a string value of the date such as “06/01/2023” with an integer data type.

Pivot in Snowflake

The Snowflake syntax follows a similar structure to the SQL Server implementation:

  • An inner query is defined.
  • Pivoted fields are explicitly defined in theIN statement.
SELECTFROM BASIC
PIVOT(AGGREGATE(FIELD)
FOR FIELD
IN ([FIELD1],[FIELD2],[FIELD3]))
AS p

The Pivot operator is available in Snowflake, and has a syntax similar to SQL Server. Note the tick and double quotes around the top query that are used to select and rename the column headers:

SELECT Subscription_plan,
"'06/01/2023'" AS "06/01/2023",
"'06/02/2023'" AS "06/02/2023",
"'06/03/2023'" AS "06/03/2023",
"'06/04/2023'" AS "06/04/2023",
"'06/05/2023'" AS "06/05/2023",
"'06/06/2023'" AS "06/06/2023",
"'06/07/2023'" AS "06/07/2023"

  FROM customers
	PIVOT(SUM(Subscribed_customers)
	FOR DATE
	IN ('06/01/2023', '06/02/2023', '06/03/2023',
	'06/04/2023', '06/05/2023', '06/06/2023', '06/07/2023'))
  	AS p

Here’s the result table:

Subscription_plan 2023-06-01 2023-06-02 2023-06-03 2023-06-04 2023-06-05 2023-06-06 2023-06-07
Business 1 3 0 0 1 2 2
Enterprise 0 1 0 1 1 0 0
Premium 2 1 2 1 2 3 1

You can also rename your columns easily with the Snowflake Pivot statement:

SELECT *
  FROM customers
	PIVOT(SUM(Subscribed_customers)
	FOR DATE
	IN ('06/01/2023', '06/02/2023', '06/03/2023',
	'06/04/2023', '06/05/2023', '06/06/2023', '06/07/2023'))
AS p (Subscription_plan, Date_1, Date_2, Date_3, Date_4, Date_5, Date_6, Date_7)
Subscription_plan Date_1 Date_2 Date_3 Date_4 Date_5 Date_6 Date_7
Business 1 3 0 0 1 2 2
Enterprise 0 1 0 1 1 0 0
Premium 2 1 2 1 2 3 1

Advanced SQL Pivoting techniques: Dynamic pivot statements

This section outlines advanced syntax to pivot the customers table data with SQL Server stored procedures and dynamic SQL.

Up to this point, we have used static queries to generate our pivot table statements. This is limiting if you are pivoting large amounts of data. In ourcustomers table, we only have seven separate dates, so the cardinality is low. But imagine if you had hundreds or thousands of dates. That would make the Pivot syntax up to this point unmanageable. In some relational systems, you can create dynamic SQL to help automate the pivot syntax. When using dynamic SQL, you do not need to specify each column in your Pivot statement but instead allow the SQL engine to generate the syntax.

To facilitate the dynamic Pivot operation in SQL Server, we will use a stored procedure. A stored procedure is a database object that can be used to execute procedural SQL code. Stored procedures are a great way to store calculations and complex reporting definitions. In this example, we will use a stored procedure to pass various parameters to make a dynamic pivot statement.

CREATE PROC SPX_SUM_PIVOT_TABLE(
  	@INNERCOLUMNS VARCHAR(MAX),
      @TABLE VARCHAR(MAX),
      @SUMCOLUMN VARCHAR(MAX),
      @FORCOLUMN VARCHAR(MAX),
      @INCOLUMNS VARCHAR(MAX)
  )
  AS
  BEGIN
    DECLARE @SQL VARCHAR(MAX)
    SET @SQL = 'SELECT * FROM ('
    + 'SELECT ' + @INNERCOLUMNS
    + ' FROM ' + @TABLE + ' ) AS T '
    + ' PIVOT ( SUM(' + @SUMCOLUMN + ')'
    + ' FOR [' + @FORCOLUMN + ']'
    + ' IN (' + @INCOLUMNS + ')) AS PT'
    PRINT @SQL
    EXEC(@SQL)
  END

In this stored procedure:

  • All parameters are varchar data types.
  • The@INNERCOLUMNS parameter holds columns which are selected in theINNER SELECT statement.
  • The@TABLE parameter holds the table name from the source table, in this casecustomers.
  • The@SUMCOLUMN holds the column which will be aggregated when the stored procedure executes.
  • The@FORCOLUMN holds the column that will be pivoted and translated to header rows.
  • The@INCOLUMNS holds the inner filter logic and holds a square block comma separated list of the columns which will be filtered for the pivot statement.

All of these parameters can now be passed to the stored procedure. When passed the values, the stored procedure will:

  • Print the dynamic SQL statement. You can see this in the console of the workbench application you are using. This is considered a best practice because dynamic SQL is tricky to debug.
  • Execute the dynamic SQL statement and return the same results as the previous SQL server pivot example.

Execute the stored procedure with the following syntax:

EXEC SPX_SUM_PIVOT_TABLE
  '[Subscription_plan],[Subscribed_customers],[DATE]',
  customers,
  'Subscribed_customers',
  'DATE',
'[2023-06-01],[2023-06-02],[2023-06-03],[2023-06-04],[2023-06-05],[2023-06-06],[2023-06-07]'

Print SQL results:

SELECT* FROM (SELECT [Subscription_plan],[Subscribed_customers],[DATE] FROM customers)
AS T PIVOT(SUM(Subscribed_customers) FOR (Date)
IN ([2023-06-01],[2023-06-02],[2023-06-03],[2023-06-04],[2023-06-05],[2023-06-06],[2023-06-07])

(3 rows affected)
Completion time: 2023-06-30T14:35:52.3489587-04:00

Execute SQL results:

Subscription_plan 2023-06-01 2023-06-02 2023-06-03 2023-06-04 2023-06-05 2023-06-06 2023-06-07
Business 1 3 0 0 1 2 2
Enterprise 0 1 0 1 1 0 0
Premium 2 1 2 1 2 3 1

The advantages of using a stored procedure for pivot can be summarized in the following:

  • The code is dynamic and, therefore, can be reused on similar pivot use cases.
  • You can further abstract the usage to make the parameters of the stored procedures dynamic. For instance, building a string of dates for the @INCOLUMNS parameters.
  • You can create the stored procedure in the system model database and it will be automatically created in all subsequent databases.

stored procedure

Fixing SQL Pivot errors

Errors in SQL Pivot typically occur due to issues with the query syntax, the data being pivoted, or the aggregation functions used. To debug SQL Pivot errors effectively, break down your query step by step, review the error messages provided by your database system, and use tools like PRINT orSELECT statements to inspect intermediate results. Validate your data, ensure consistent column aliases, and double-check your Pivot clause and aggregation functions. If you continue to encounter errors, consult your database system's documentation or seek assistance from a database administrator or developer. Let us look at some common error statements and how to fix them.

Incorrect syntax

These errors occur when there is a problem with the SQL syntax used in your Pivot query. Examples include missing commas, parentheses, or incorrect use of keywords.

Incorrect syntax near 'PIVOT'. Check your SQL syntax.

Carefully review your SQL syntax to ensure it follows the correct format for your database system. Pay attention to commas, parentheses, and aliases.

Invalid column name

Errors can occur if the column names or values specified in the FOR clause of your Pivot query do not match the actual values in your data.

Invalid column name 'Month'. Ensure the column name is correct.

Double-check that the values in the FOR clause match the unique values you expect to pivot. Verify that there are no typos or inconsistencies.

Missing aggregation function

If you forget to include an aggregation function when using aggregation (e.g., SUM, COUNT) in your Pivot query, you may encounter an error.

Missing aggregation function in PIVOT clause. Specify an aggregation function.

Ensure that you include the appropriate aggregation function within the Pivot clause to specify how to aggregate data.

Invalid Join condition

If your Pivot operation involves multiple tables, missing or incorrect join conditions can lead to errors.

Invalid join condition. Check your table joins and relationships.

Verify that your join conditions are correctly specified, and the tables are properly joined to provide the required data for the Pivot operation.

Invalid Group By clause

These are the errors related to issues with grouping or filtering of data within the Pivot clause.

Invalid GROUP BY clause in PIVOT query. Review your grouping conditions.

Confirm that the grouping and filtering conditions in your Pivot clause are correctly applied. Review your GROUP BY clause if necessary.

Pivoting for simplified data analysis

In this tutorial, we covered how to use Pivot in SQL, various use cases, and syntax behind this operator. We also covered various situations–mainly centered around reporting as the motivation–with code examples for using the Pivot operator. We covered several different databases (SQL Server, MySQL, PostgreSQL, and Snowflake), and the implementations in each that allow you to pivot data.

Recall that SQL Server implements Pivot operators. In MySQL, neither the crosstab operator nor the pivot table operator is offered but you can still achieve a pivot with native SQL syntax. PostgreSQL does not offer a Pivot operator but instead has a crosstab operator. Similar to SQL Server, Snowflake Pivot operator to facilitate pivoting operations.

Finally, we covered an advanced scenario in SQL server. We used dynamic SQL and a stored procedure that accepted various parameters that allowed us to pivot data dynamically.

The pivot operation is a powerful tool; you can always perform a pivot operation with native SQL, like in the MySQL example above. Knowing how to pivot data can help with generating reports and making massive amounts of data more digestible.

Get ready for analysis

Once you pivot your data and represent it is a more understandable way, it is time to try some analysis. As a next step to this tutorial, feel free to explore the tutorial How to Calculate Cumulative Sum-Running Total in SQL Server.

FAQs

What is the difference between PIVOT and UNPIVOT in SQL?

PIVOT is used to transform rows into columns, while UNPIVOT is used to transform columns into rows, effectively reversing the pivot operation.

In which database systems can I use SQL Pivot?

SQL Pivot is supported in various database systems, including SQL Server, Oracle, Snowflake, and others, but the syntax and implementation may vary between systems.

Can I pivot data dynamically in SQL?

Some database systems support dynamic pivoting, where you can pivot data based on values in the source table, allowing for flexibility in column creation.

What should I do if I encounter errors while using SQL Pivot?

When encountering errors, carefully review your SQL syntax, ensure column names and values in the FOR clause are correct, and check for data type mismatches. Debugging tools and reviewing error messages can help pinpoint issues.

Can I pivot data in all SQL database systems?

While SQL Pivot is not a standard SQL operation, many database systems support pivoting through custom queries and constructs. However, the specific syntax and features may vary between systems.

How do I handle NULL values when pivoting data?

You can use the COALESCE function or other conditional logic to handle NULL values and provide default values or calculations when necessary during the pivot operation.

Are there any performance considerations when using SQL Pivot?

Pivoting large datasets or performing complex pivot operations may impact query performance. Ensure your database is properly indexed, and consider optimizing your queries for efficiency.

database icon
From SQL Server query to chart to Slack in seconds
Get to answers faster, together, with PopSQL and SQL Server