Skip to main content
PostgreSQL
CHAPTER 18 Intermediate

Indexing and Query Optimization

Updated: May 16, 2026
7 min read

# CHAPTER 18

Indexing and Query Optimization

1. Introduction

If you are looking for the word "PostgreSQL" in a 1,000-page textbook, you don't start at page 1 and read every single word. You go to the Index at the back of the book, find the "P", look up the page number, and instantly flip to it. Databases work exactly the same way. If an application is running slowly, 99% of the time it is because the database is missing an Index. In this chapter, we will master Query Optimization to transform queries that take 5 seconds into queries that take 5 milliseconds.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Understand the catastrophic performance of a "Full Table Scan".
  • Create standard B-Tree Indexes using CREATE INDEX.
  • Identify the correct columns to index (Foreign Keys, search fields).
  • Create Composite Indexes for multi-column queries.
  • Use EXPLAIN ANALYZE to diagnose slow queries.
  • Understand GIN indexes for JSONB data.

3. The Problem: Full Table Scans

Imagine a users table with 5 million rows. A user tries to log in. The backend executes: SELECT * FROM users WHERE email = 'john@example.com';

Because the database doesn't know where John is, it must start at row 1 and read every single row until it finds him. If he is row 4,999,999, this is called a Full Table Scan. It is incredibly slow, burns massive CPU power, and will cause your application to crash under heavy traffic.

4. Creating a B-Tree Index

To fix this, we instruct PostgreSQL to create an Index on the email column.
sql
12
-- This builds a hidden, alphabetically sorted tree structure in the background!
CREATE INDEX idx_users_email ON users(email);

How it works: PostgreSQL creates a highly optimized data structure called a B-Tree (Balanced Tree). Now, when you search for "john", PostgreSQL jumps straight to the "J" section of the tree, instantly grabs the exact hard-drive location of the row, and fetches it in 0.001 seconds.

5. What Columns Should You Index?

You do not index every column. Indexes consume hard drive space, and they slightly slow down INSERT commands (because the tree must be re-sorted every time new data is added). Only Index:
  1. 1. Columns frequently used in WHERE clauses (like emails, usernames).
  1. 2. Foreign Keys! (If orders has a user_id, index it! This makes JOINs lightning fast).
  1. 3. Columns frequently used in ORDER BY.

*(Note: PostgreSQL automatically creates an index on every PRIMARY KEY and UNIQUE column).*

6. Composite Indexes

If a query frequently searches using two columns together, you can create a Composite Index.
sql
1234
-- Query: SELECT * FROM employees WHERE last_name = 'Smith' AND first_name = 'John';

-- Solution: Create an index combining both!
CREATE INDEX idx_emp_name ON employees(last_name, first_name);

7. Diagnosing Queries: EXPLAIN ANALYZE

How do you know if your query is doing a Full Table Scan? You ask PostgreSQL to explain its mechanical thought process using EXPLAIN ANALYZE.
sql
1
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'john@example.com';

The output will show you the exact "Query Plan".

  • If it says Seq Scan (Sequential Scan), it is doing a slow Full Table Scan.
  • If it says Index Scan, congratulations! Your index is working perfectly.

Standard B-Tree indexes are great for exact matches (like email or IDs). But what if you are using PostgreSQL's advanced JSONB data type, and you want to search for a specific key hidden deep inside a massive JSON document? A B-Tree cannot look inside a JSON object. You must use a GIN Index (Generalized Inverted Index).
sql
12
-- Indexing a JSONB column so we can instantly search its internal keys!
CREATE INDEX idx_user_settings ON users USING GIN (settings);

9. Common Mistakes

  • Over-Indexing: Creating an index on a boolean column like isactive (which only has two possible values: True or False) is a massive waste. An index is only useful if it narrows down the results significantly (called "high cardinality").
  • Leading Wildcards destroy Indexes: As discussed in Chapter 8, running WHERE email ILIKE '%john%' completely bypasses the B-Tree index and forces a Sequential Scan. An index can only help if the search is exact, or starts at the beginning of the string ('john%').

10. Best Practices

  • Drop Unused Indexes: Periodically review your indexes. If an index is no longer used by the application, drop it. It is consuming RAM, Disk Space, and slowing down INSERT operations.

11. Exercises

  1. 1. Write the SQL command to create an index on the productname column in the products table.
  1. 2. Why is it unnecessary to write CREATE INDEX for a column defined as a PRIMARY KEY?

12. SQL Challenges

You notice that SELECT * FROM orders WHERE user_id = 45 AND status = 'Pending' is running very slowly. Write the DDL command to create the optimal Composite Index to fix this query.
sql
1
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

13. MCQ Quiz with Answers

Question 1

What is the primary negative side-effect of creating too many Indexes on a table?

Question 2

What SQL command do you prepend to a SELECT query to view the underlying execution plan (e.g., to see if it is performing a Seq Scan or an Index Scan)?

14. Interview Questions

  • Q: Explain the mechanical difference between a Sequential Scan (Full Table Scan) and an Index Scan in PostgreSQL.
  • Q: If you have an index on (lastname, firstname), will that index be utilized by the database if a query only searches WHERE firstname = 'John'? Why or why not?

15. FAQs

Q: How does a Composite Index work? A: Think of it like a phone book. A phone book is indexed by (last
name, first_name). It is incredibly fast to find all the "Smiths", and then find the "Johns" within the Smiths. However, if you just try to look up "John", the phone book is useless because it is sorted by last name first!

16. Summary

Indexing is the black magic of database engineering. By strategically applying B-Tree indexes to your Foreign Keys and heavily queried search columns, you prevent catastrophic Sequential Scans. Using EXPLAIN ANALYZE, you can confidently verify that your application will remain blisteringly fast even as it scales to billions of rows.

17. Next Chapter Recommendation

Our queries are optimized and blazing fast. But what happens if a bug in the PHP application tries to insert a negative number for a product's price? An index won't stop that. In Chapter 19: Constraints and Data Integrity, we will learn how to build defensive walls that mathematically block bad data from entering the system.

Finish this Chapter

Save your progress on your learning path and prepare for coding interview challenges.

Discussion

Join the discussion

Log in or create a free account to participate.

Sort: ·