Subqueries and Nested Queries
# CHAPTER 16
Subqueries and Nested Queries
1. Introduction
Sometimes, to answer a question, you have to ask another question first. *Boss:* "Show me a list of all employees who make more money than the company average." You cannot writeWHERE salary > AVG(salary) because aggregate functions are not allowed in the WHERE clause.
You must first ask: "What is the average?" (Answer: $60k). Then ask: "Who makes more than $60k?"
To do this automatically in a single command, we use a Subquery (a query nested inside another query). In this chapter, we master the art of SQL Inception.
2. Learning Objectives
By the end of this chapter, you will be able to:- Define a Subquery.
-
Use a Subquery in the
WHEREclause.
-
Use a Subquery with the
INoperator.
- Understand the performance mechanics of Correlated Subqueries.
-
Utilize the powerful
EXISTSoperator.
3. The Basic Subquery (Nested SELECT)
A Subquery is simply a SELECT statement wrapped in parentheses () and placed inside another query. The inner query executes *first*, and hands its answer to the outer query.
*This query is entirely dynamic! If salaries change tomorrow, the Subquery calculates the new average instantly.*
4. Subqueries with the IN Operator
If your inner query returns a list of multiple values (instead of a single number), you cannot use the = or > operators. You must use the IN operator.
*Goal: Find all users who have placed an order in the last 24 hours.*
*(Note: You could achieve this with an INNER JOIN, but a Subquery is often easier to read!)*
5. Correlated Subqueries (Advanced)
A standard Subquery runs exactly once, passes the answer to the outer query, and stops. A Correlated Subquery is a loop. It runs its inner query once for every single row in the outer query. It does this because the inner query specifically references data from the outer row!*Goal: Find employees whose salary is higher than the average salary of THEIR SPECIFIC DEPARTMENT.*
6. The EXISTS Operator
The EXISTS operator is used with Correlated Subqueries to check if a row simply exists in another table. It is incredibly fast because it stops searching the millisecond it finds a single match (it doesn't calculate totals).
*Goal: Give me a list of suppliers, but ONLY if we currently offer a product from them in our catalog.*
*(We use SELECT 1 because EXISTS only cares IF a row is returned, it doesn't care what the actual data is).*
7. Common Mistakes
-
Returning Multiple Columns: An inner query in a
WHEREclause can only return a single column. If you writeWHERE id IN (SELECT id, name FROM...), MySQL will crash because it cannot compareidagainst a list containing both IDs and Names.
-
Overusing Correlated Subqueries: Because a correlated subquery executes its inner logic for *every single row* in the outer table, it can be extremely slow. If the outer table has 1 million rows, the inner query runs 1 million times. If performance drops, rewrite the logic using a
JOIN.
8. Best Practices
-
Subqueries vs. JOINs: Most Subqueries with
INcan be rewritten asINNER JOINs. Historically, JOINs were much faster. In modern MySQL (Version 8+), the query optimizer is so smart that it automatically converts your Subqueries into JOINs behind the scenes, so you should write whichever version is easier for your human team to read!
9. Exercises
-
1.
Write a Subquery to find the
firstnameof the employee who has the absolute highest salary (MAX) in the company.
-
2.
Why must you use the
INoperator instead of=if your Subquery returns 5 rows?
10. MCQ Quiz with Answers
In a standard Subquery (like finding employees above the company average), in what order does MySQL execute the code?
What is the defining characteristic of a Correlated Subquery?
11. Interview Questions
- Q: Explain the functional and performance differences between a standard Subquery and a Correlated Subquery.
-
Q: In many scenarios, a Subquery using
INcan be rewritten as anINNER JOIN. From a database optimization perspective, which is generally preferred and why?
12. FAQs
Q: Can I put a Subquery inside theSELECT clause itself, instead of the WHERE clause?
A: Yes! You can use it to generate a calculated column on the fly:
SELECT name, (SELECT COUNT(*) FROM orders WHERE userid = u.id) AS total_orders FROM users u;