Skip to main content
PostgreSQL
CHAPTER 15 Intermediate

Advanced JOIN Queries and Set Operations

Updated: May 16, 2026
6 min read

# CHAPTER 15

Advanced JOIN Queries and Set Operations

1. Introduction

In the previous chapter, we learned how to link a Parent table to a Child table. But database architecture can get strange. What if the Parent and the Child are in the exact same table? What if you need to compare data from a customers table with data from a totally unrelated suppliers table? In this chapter, we will break out of standard relational modeling and explore Advanced JOINs and Set Operations to solve complex edge-case reporting requirements.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Use a SELF JOIN to link a table to itself.
  • Understand the algorithmic explosion of a CROSS JOIN.
  • Vertically stack query results using the UNION operator.
  • Find overlapping datasets using INTERSECT.
  • Subtract datasets using EXCEPT.

3. The SELF JOIN

A Self Join is used when a table has a Foreign Key that points to its own Primary Key. The Classic Example: An employees table. Every employee has a manager_id. But the manager is ALSO an employee in the exact same table!

To join a table to itself, you MUST use Table Aliases to trick PostgreSQL into thinking it is looking at two different tables.

sql
12345678
-- We open the employees table twice! 
-- Once as 'e' (the worker), and once as 'm' (the manager).
SELECT 
    e.first_name AS "Employee", 
    m.first_name AS "Manager"
FROM employees e
LEFT JOIN employees m 
    ON e.manager_id = m.emp_id;

*(We use LEFT JOIN because the CEO does not have a manager. An INNER JOIN would exclude the CEO from the report!)*

4. The CROSS JOIN (Cartesian Product)

A CROSS JOIN matches EVERY row in Table A with EVERY row in Table B. It does not use an ON clause. If Table A has 10 colors, and Table B has 10 shirt sizes, a CROSS JOIN generates all 100 possible combinations.
sql
123
SELECT c.color_name, s.size_name
FROM colors c
CROSS JOIN sizes s;

DANGER: Do not run a CROSS JOIN on large tables. If you cross-join a 10,000-row table with another 10,000-row table, PostgreSQL will attempt to generate 100,000,000 rows and your server will crash.

5. Set Operations: UNION

While JOINs combine data *horizontally* (adding columns side-by-side), Set Operations combine data *vertically* (stacking rows on top of each other). The UNION operator takes the results of two completely separate SELECT queries and stacks them into one list.

The Rules of UNION:

  1. 1. Both queries must have the exact same number of columns.
  1. 2. The columns must have compatible data types.

sql
1234
-- The company wants a master mailing list of ALL humans they interact with.
SELECT first_name, email, 'Customer' AS role FROM customers
UNION
SELECT first_name, email, 'Supplier' AS role FROM suppliers;

*(By default, UNION automatically removes exact duplicates. If you want to keep duplicates, use UNION ALL, which is also much faster).*

6. Set Operations: INTERSECT

INTERSECT stacks two queries, but ONLY returns the rows that appear in BOTH queries.
sql
1234
-- Find users who have BOTH a registered account AND are signed up for the newsletter
SELECT email FROM registered_users
INTERSECT
SELECT email FROM newsletter_subscribers;

7. Set Operations: EXCEPT

EXCEPT (known as MINUS in Oracle) subtracts the results of the second query from the first query.
sql
1234
-- Find registered users who have NEVER signed up for the newsletter
SELECT email FROM registered_users
EXCEPT
SELECT email FROM newsletter_subscribers;

8. Mini Project: The Corporate Hierarchy Report

Using a Self Join, generate a report that shows the full chain of command for the IT department.
sql
12345678910
SELECT 
    worker.first_name AS "Staff Member",
    manager.first_name AS "Direct Supervisor",
    director.first_name AS "Department Head"
FROM employees worker
LEFT JOIN employees manager 
    ON worker.manager_id = manager.emp_id
LEFT JOIN employees director 
    ON manager.manager_id = director.emp_id
WHERE worker.department = 'IT';

9. Common Mistakes

  • UNION Column Mismatches: If Query A selects (name, email) and Query B selects (email, name), the UNION will execute, but the data will be hopelessly scrambled in the final output. The columns must align perfectly in the exact same order!
  • Accidental Cross Joins: If you write SELECT * FROM users, orders; (using a comma instead of a JOIN clause), PostgreSQL executes a massive CROSS JOIN. Never use comma-separated FROM clauses.

10. Best Practices

  • Use UNION ALL when possible: Because standard UNION scans the final massive list to remove duplicates, it is heavily CPU-intensive. If you know mathematically that the two queries cannot have duplicates (or if you don't care), always use UNION ALL for a massive performance boost.

11. Exercises

  1. 1. Write a UNION ALL query that combines productname from the activeproducts table and the archived_products table.
  1. 2. What is the fundamental difference between how a JOIN combines data and how a UNION combines data?

12. SQL Challenges

Using a Self Join on an employees table, write a query to find all employees who earn a higher salary than their own direct manager.
sql
1234
SELECT e.first_name, e.salary, m.first_name AS manager, m.salary AS manager_salary
FROM employees e
INNER JOIN employees m ON e.manager_id = m.emp_id
WHERE e.salary > m.salary;

13. MCQ Quiz with Answers

Question 1

What is the mandatory requirement for two SELECT queries to be successfully stacked using the UNION operator?

Question 2

Which Set Operator would you use to find the exact overlap between two different datasets (i.e., returning only the emails that exist in BOTH tables)?

14. Interview Questions

  • Q: Explain how a SELF JOIN works under the hood. Why are Table Aliases strictly mandatory when performing one?
  • Q: Describe a business use-case where executing a CROSS JOIN is actually the correct architectural choice, rather than a catastrophic mistake.

15. FAQs

Q: Can I use ORDER BY with a UNION? A: Yes, but you only write the ORDER BY clause ONE time, at the very end of the final query. It will sort the entire stacked result set.

16. Summary

You have transcended standard relational limits. By mastering SELF JOINS, you can traverse recursive organizational charts. By utilizing UNION, INTERSECT, and EXCEPT, you can manipulate entirely separate datasets as if they were mathematical sets, stacking, overlapping, and subtracting data at will.

17. Next Chapter Recommendation

We have mastered combining tables. But what if the answer to your WHERE clause condition requires another entirely separate query to figure out? In Chapter 16: Subqueries and Common Table Expressions (CTEs), we will learn how to nest SQL queries inside of other SQL queries.

Finish this Chapter

Save your progress on your learning path and prepare for coding interview challenges.

Discussion

Join the discussion

Log in or create a free account to participate.

Sort: ·