CHAPTER 16
Intermediate
Subqueries and Common Table Expressions (CTEs)
Updated: May 16, 2026
6 min read
# CHAPTER 16
Subqueries and Common Table Expressions (CTEs)
1. Introduction
Imagine the CEO asks: *"Give me a list of all employees who make more money than the company average."* You cannot typeWHERE salary > AVG(salary) because aggregate functions are not allowed in the WHERE clause. You must first run a query to calculate the average, and then run a second query to find the employees.
What if you could combine them? A Subquery is a query nested entirely inside another query. In this chapter, we will master nesting queries, and learn how to modernize them using Common Table Expressions (CTEs).
2. Learning Objectives
By the end of this chapter, you will be able to:-
Write basic Subqueries in the
WHEREclause.
- Understand the performance risks of Correlated Subqueries.
-
Use the
EXISTSoperator for lightning-fast checks.
-
Refactor messy Subqueries into clean
WITHclauses (CTEs).
- Understand how CTEs improve query readability.
3. The Basic Subquery
A subquery is wrapped in parentheses() and is executed *first* by PostgreSQL. The result is then fed into the outer query.
Let's solve the CEO's problem:
sql
4. Subqueries with the IN Operator
Subqueries can return entire lists of data to be used with the IN operator.
sql
5. Correlated Subqueries (The Performance Killer)
In a standard subquery, the inner query runs exactly ONE time. A Correlated Subquery contains a reference to the outer query. This forces PostgreSQL to re-run the inner query *for every single row* in the outer table!
sql
*(If you have 100,000 employees, this query executes the inner average calculation 100,000 times! Avoid these when possible).*
6. The EXISTS Operator
When you just need to know IF data exists (not what the data is), EXISTS is the fastest operator in PostgreSQL. It stops searching the millisecond it finds a single match.
sql
7. Common Table Expressions (CTEs)
Subqueries get incredibly ugly and hard to read when deeply nested. Modern PostgreSQL developers use CTEs (theWITH clause).
A CTE acts like a temporary, virtual table that exists only for the millisecond your query is running. It allows you to write the subquery at the very top, give it a name, and cleanly select from it later.
sql
8. Mini Project: Complex Reporting with CTEs
A manager wants to see a list of Departments, their Total Revenue, and ONLY include departments that made over $100,000. CTEs make complex grouping highly readable.
sql
9. Common Mistakes
-
Returning Multiple Columns: A subquery used with an
=or>operator MUST return exactly ONE column and ONE row. If your subquery returns(firstname, lastname), the outer query will crash because it cannot do math against two columns simultaneously.
-
Overusing Subqueries instead of JOINs: Many beginners use subqueries to fetch related data. In almost all cases, an
INNER JOINis significantly faster and better optimized by the PostgreSQL engine.
10. Best Practices
-
Embrace CTEs: In modern data engineering, deeply nested subqueries are considered bad practice due to poor readability. If your query requires nesting, extract it into a
WITHCTE block at the top of the file.
11. Exercises
-
1.
Write a query using a subquery to find all
productsthat have a price greater than the absoluteMAX(price)found in the 'Toys' category.
-
2.
Explain why the
EXISTSoperator is faster thanINwhen dealing with massive datasets.
12. SQL Challenges
Convert the following nested subquery into a clean CTE using theWITH clause:
sql
Solution:
sql
13. MCQ Quiz with Answers
Question 1
What is the defining characteristic of a "Correlated Subquery"?
Question 2
What is the primary benefit of using a Common Table Expression (CTE) with the WITH clause instead of a traditional nested subquery?
14. Interview Questions
-
Q: Explain the performance differences between using
INversusEXISTSwhen filtering against a subquery that returns millions of rows.
- Q: What is a Recursive CTE in PostgreSQL, and in what specific data structure (e.g., an organizational chart) is it absolutely necessary?
15. FAQs
Q: Can I have multiple CTEs in one query? A: Yes! You can chain them using commas:WITH CTE1 AS (...), CTE2 AS (...) SELECT * FROM CTE1 JOIN CTE2...
16. Summary
Query nesting unlocks ultimate flexibility. By utilizing Subqueries, you can compute dynamic thresholds on the fly. By understanding the performance impacts of Correlated queries andEXISTS, you write faster software. Most importantly, by adopting CTEs, you elevate your SQL from messy nested scripts into clean, professional data pipelines.