Complex SQL Queries: Examples & Advanced Techniques
Complex queries in SQL don't need to be hard to write. Learn what they are, how to create them, and some examples in this article.
What is a complex SQL query?
When you start working with SQL, you'll learn about the
SELECT query to get data from tables. You'll then learn a range of techniques such as using the
WHERE keyword, joining to tables, and ordering your results.
Once you've got the basics, you'll start moving into more advanced queries.
A complex query in SQL is one that has more complexity and that you need to give some more thought to when you design and write it.
A complex query could include features such as:
- joining to many tables, and using different join types
- functions within functions
- group by and having
These kinds of queries take a bit more thought to understand how to design them and write them so they show the results you need, perform well, and are easy to understand.
How to build a complex SQL query
The main piece of advice I can offer for writing a complex query is that it's like writing many small queries put together.
You might be confused or overwhelmed by the need to write something pretty complex.
However, I suggest writing a small query first, and then adding to it piece-by-piece, until you get the final result you need.
So, start small. Write a simple query that just selects one or two columns from a table.
Run the query and see the result. Is it what you are expecting?
If so, update the query to add a little more to it. Perhaps add some more columns, or a
WHERE clause to filter data.
Run the query again and check the results. If they look good, keep going. If they don't, then review your query and see what may be causing the issue.
When you start to join to tables in your query, you may want to run a Select query on those tables first. This is so you can see the data in the table and to help you understand what your results might look like.
Once you can see the results from a simple Select in another table, then you may want to join it to your main query. Run the query, then check the results.
Keep going until you have the query and results that you need.
So, to write a complex query in SQL, it's easier to write it in smaller pieces and test it as you go by running the query and checking the results.
Advanced SQL Techniques for Complex Queries
In the previous section, we mentioned that a complex SQL query may include features such as joining multiple tables and subqueries.
There are a range of other features in SQL to make it easier to write SQL complex queries.
Let's look at a few of them now.
Common Table Expressions
A Common Table Expression (CTE) is a name you can give to a subquery within your main query.
The main reason you would do this is to simplify your query, making it easier to read and debug. It can sometimes improve performance, which is another benefit, but it's mostly about readability and simplification.
Here's a query with a subquery that does not have a Common Table Expression:
SELECT columns FROM table1 INNER JOIN ( SELECT columns FROM table2 WHERE conditions ) t2 WHERE conditions;
This same query could be updated to use a Common Table Expression. This would involve giving a name to the subquery. This named subquery could then be used as though it was a table in the main query:
WITH t2 AS ( SELECT columns FROM table2 WHERE conditions ) SELECT columns FROM table1 INNER JOIN t2 WHERE conditions;
This might not seem like much of an improvement.
However, there are a few ways it can help.
Moving the subquery to the top of the query and giving it a name will mean it's easier to understand what the query does. If your subquery selects all employees that have a higher-than-average salary, you could name your subquery something like
emp_above_avg_sal. When you refer to this in the main query, you'll see this name and will know what it refers to.
You can also read and update the logic in one place. The logic to calculate the results for this subquery is at the top of the query. You can often copy and paste this query to another tab and run it separately, making it easier to understand and even debug the query.
It's also helpful if you have a long query and need this logic in several places. You can define it once, at the top of the query, and refer to it many times throughout your main query.
So, consider using Common Table Expressions whenever you have a subquery as a way to improve the readability of your query.
Another helpful SQL feature is called "window functions".
In SQL, you have functions that calculate a value based on many other records, such as
COUNT, and return a single row. These are called aggregate functions.
You can use the
Group By clause with these functions to calculate numbers for each group that is shown.
Let's say you had a table of orders that looked like this:
Let's say you wanted to see the total amount of orders per day. You can use a
SUM and a
GROUP BY for this:
SELECT order_date, SUM(order_amount) FROM orders GROUP BY order_date;
Your results would be:
This works, but using the
GROUP BY in this way means that the records you display must match the way that the data is calculated.
You show one row for each date, and you calculate the sum for that date.
What if you wanted to see all orders and the sum for each date? Or, you wanted to see a running total: the sum of orders so far for that date?
You can do this with window functions.
A window function is a way to write an SQL function that lets you perform a calculation on a range of rows that's different to how you display the rows.
So, you might want a running total like this:
Your query could look like this:
SELECT id, order_date, order_amount, SUM(order_amount) OVER ( PARTITION BY order_date ORDER BY order_id ) FROM orders;
OVER clause means that the function is calculated over a range of rows, and not the entire set of results.
PARTITION BY clause refers to the column that's used to define the range of rows. In this example, all records with the same order_date are used for calculating the sum.
ORDER BY determines which records come first in the running total calculation. In this example, it's by
This query will give you the results including the running total.
So, if you need to calculate data using a different range to what is being displayed, you can use a window function. If it's something you were going to do in application code, consider using a window function to take advantage of the database.
The final technique that's common with complex SQL queries is a hierarchical query.
A hierarchical query, or recursive query, is where you write a query that links one record with other records in the same table.
This allows you to work with a table that stores different records where a record is a parent of another record.
Some common examples are:
- employees and managers, where the manager is an employee
- product categories and subcategories, which can have many levels
Rather than having many tables for each level, you can store all of your data in a single table and have a column that refers to the primary key of another record in the same table.
You can then write a query to show the data you need.
For example, you may have an employee table that has a column called
manager_id that refers to that employee's manager:
This allows many levels of employees and managers. Sarah has a
NULL indicating she is at the head of the organization.
You can use SQL to write a recursive query or hierarchical query to show the entire organization chart.
Here's a query to do that:
WITH empdata AS ( (SELECT employee_id, first_name, manager_id, 1 AS level FROM employee WHERE employee_id = 1) UNION ALL (SELECT this.employee_id, this.first_name, this.manager_id, prior.level + 1 FROM empdata prior INNER JOIN employee this ON this.manager_id = prior.employee_id) ) SELECT e.employee_id, e.first_name, e.manager_id, e.level FROM empdata e ORDER BY e.level;
This query uses a Common Table Expression to get a list of employees and their managers.
Within the CTE, called
empdata, we select the employee with an id of 1, which is the top of the organization.
We then combine this with all of the other employees, by selecting from the same query defined in the
WITH clause, joining on the manager id and employee id.
This ability to refer to a CTE from within the same CTE is what makes the query a recursive query.
If you're using Oracle or SQL Server, this will work. If you're using MySQL or Postgres, you'll need to add the word
RECURSIVE after the
WITH keyword to get it to work:
WITH RECURSIVE empdata AS ( …
You will then display a list of employees and their level in the organization.
You can also do some interesting things with this query, such as:
- adding the name of a person's manager
- indenting their name a number of spaces depending on a person's level, to make it look like a tree
So, a recursive or hierarchical query uses a range of advanced techniques such as Common Table Expressions, Unions, and self-joins to get the result you want in a specific type of database design.
Let's take a look at some more examples of complex queries in SQL, using some of the techniques above, as well as other techniques.
You can calculate the rank of a record based on one of its values by using the RANK function as a window function.
For example, this query calculates the rank of an order based on its order amount.
SELECT order_id, order_date, customer_id order_amount, RANK() OVER (ORDER BY order_amount DESC) FROM orders;
This will show all orders, and the
RANK column will show 1 for the highest order amount, 2 for the second highest, and so on. It shows the same values regardless of how you want to order the overall result.
Difference in Rows
Another example is using a window function called
LAG to find the difference of a value between two rows.
Without window functions, this would be difficult, but fortunately, we can use
LAG as a window function to find this out.
Let's say we want to display the total of orders for each month, and then show the difference of this total compared to the previous month.
Our query may look like this:
SELECT order_month, order_total, order_total - LAG(order_total) OVER (ORDER BY order_month) AS diff_last_month FROM orders;
LAG function lets you refer to the previous row, which is done using the
ORDER BY clause within the
You can use the
SUM function to add up number values. It's often used to add everything in a column.
However, it can also be used to add numbers based on certain criteria.
Let's say we wanted to show the total amount of orders in a month but separated based on paid status.
Here's a query that could be used:
SELECT o.order_month, SUM( CASE WHEN p.status = 'Paid' THEN o.order_amount ELSE 0 END ) AS total_amount_paid, SUM( CASE WHEN p.status = 'Pending' THEN o.order_amount ELSE 0 END ) AS total_amount_pending, SUM( CASE WHEN p.status = 'Default' THEN o.order_amount ELSE 0 END ) AS total_amount_default FROM orders o INNER JOIN payment_status p ON o.payment_status_id = p.id;
The result could be something like this:
You can use
SUM along with
CASE to create this kind of result where you only
SUM values that meet certain criteria.
Complex queries in PopSQL
There are several features within PopSQL that help you work with complex SQL queries.
A complex query in SQL can often include specifying parameter values, such as a value to check in the
WHERE clause or conditions for a
If you have this written in an SQL script file, and need to change the values, you'll need to update the query manually for all occurrences of this value, then run the query - and hopefully you don't break it along the way.
In PopSQL, there's a handy feature called SQL Variables. You can add a placeholder for a variable within your query and specify a value. If you need to change the value to be used in the query, you can simply change the value of the variable and run the query.
Here's an example of a query with a variable.
We can see the query has the variable within the SQL (called
country_code) and the variable at the top of the query outside the query itself. It has a value of "USA" but it can be changed.
You can change that value, and run the query again. This is much easier than manually editing the SQL script.
These variables can also be set to use a dropdown of specific values, so you can choose one from the list, reducing the chance of errors.
Share SQL Queries
Sometimes when you're working on a complex query, you'll need to involve another person. This could be a person within your team to get a second opinion, or a person in another team who's an expert in a certain area of the business, for example.
In PopSQL, you can easily share your query with other people. You don't need to email them a link or a link to a GitHub file.
When you have a query opened, you can click on the “Share” button to specify how you want to share the query.
You can select “Can View” if you want the other person to only be able to view the query. This can be helpful if you're presenting the query but don't need it to be edited, or if you only want someone to be able to run the query and see the results. If your query includes variables, the viewer can change those too.
You can select “Can Edit” if you want the other person to be able to view and edit the query. The other person can edit the query, run it, and essentially work on it with you to complete the query.
The SQL Editor within PopSQL has some handy features for working with complex queries.
Run Multiple Statements: You can write multiple statements in one query. This is common when working with complex queries, as you may be writing smaller queries as you build up to the final query, or take parts of a large query and run them independently.
If you have multiple statements, you can run them independently. You could also run them all at once by clicking the arrow next to the “Run” button and selecting “Run all statements”.
Now, whenever you click “Run”, all of the statements will be executed. If this is something you do often, this can save you time.
Query Descriptions: You can add descriptions to your queries to make it easier to identify them in your editor. This is helpful whether you only have a few queries or a lot of queries. I think this is better than adding an SQL comment to the start of the query because you can see the description as you scroll down the query.
Version History: PopSQL includes a version history of your queries. This is helpful as you write complex SQL queries and if you ever need to go back in time to a previous version.
PopSQL's editor includes many features you'll find in other SQL editors such as tabbed windows and syntax highlighting, but these features mentioned above are additional features that are helpful for working with complex queries.
Once you get comfortable with the basics of SQL, you'll start working with more complex queries that are longer and include more features such as subqueries and grouping. Getting better at writing and reading these queries comes with practice and understanding how they work. PopSQL can make it easier for you, whether you're writing them by yourself or working with a team of people.