Skip to main content
SQL Fundamentals
CHAPTER 20 Beginner

Indexing and Query Optimization

Updated: May 16, 2026
15 min read

# CHAPTER 20

Indexing and Query Optimization

1. Introduction

When a database has 100 rows, every query is instantaneous. When a database hits 10,000,000 rows, a simple query like SELECT * FROM users WHERE email = 'john@gmail.com'; might take 5 full seconds. In the web development world, a 5-second delay is an eternity; users will close the app. To solve this, database architects utilize Indexes. In this chapter, we will learn how to transform sluggish, crashing queries into lightning-fast operations capable of handling enterprise scale.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Understand the mechanical problem of a "Full Table Scan".
  • Create an Index to massively accelerate WHERE searches.
  • Create Composite Indexes for multi-column queries.
  • Understand the performance trade-offs (The Cost of Indexing).
  • Use the EXPLAIN keyword to diagnose slow queries.

3. The Problem: The Full Table Scan

Imagine a massive library with 10 million books, completely unorganized. If I ask you to find the book "Dune", you must start at book #1 and check every single book until you find it. This is a Full Table Scan. It is exactly how SQL searches a database by default. It is incredibly slow and CPU intensive.

4. The Solution: The B-Tree Index

A Database Index is exactly like the Index at the back of a textbook. It is an alphabetically (or numerically) sorted list that points directly to the exact page the information is on. (Behind the scenes, SQL uses a complex mathematical structure called a B-Tree).

If we frequently search our E-Commerce site by product_category, we must explicitly build an Index on that column!

sql
123
-- DDL Command: Create an Index on the 'category' column
CREATE INDEX idx_product_category 
ON products (category);

*(Now, when a user queries WHERE category = 'Laptops', SQL bypasses the 10-million row table, instantly checks the B-Tree index, and finds the exact rows in 2 milliseconds!)*

5. Composite Indexes (Multi-Column)

If your users frequently search by two things simultaneously (e.g., "Find active users in New York"), a single index is not enough. You can create a Composite Index spanning multiple columns.
sql
1234567
-- Create an Index combining state and status!
CREATE INDEX idx_state_status 
ON users (state, status);

-- This query is now lightning fast!
SELECT * FROM users 
WHERE state = 'NY' AND status = 'Active';

6. The Cost of Indexing (Why not index everything?)

If Indexes make reading data instantly fast, why don't we just index every single column in the table? Because Indexes destroy WRITE speed.

Every time you execute an INSERT, UPDATE, or DELETE, the database not only has to write the row into the table, but it must mathematically recalculate and resort every single Index attached to that table. Furthermore, Indexes take up massive amounts of physical Hard Drive space. *Rule of Thumb:* Only index columns that are frequently used in WHERE, JOIN, or ORDER BY clauses.

7. Diagnosing Slow Queries (EXPLAIN)

How do you know if your query is slow because of a missing index, or because of a bad JOIN? You use the EXPLAIN keyword. If you put EXPLAIN in front of any query, SQL does *not* run the query. Instead, it gives you the Execution Plan—it tells you exactly how it *plans* to run the query.
sql
1
EXPLAIN SELECT * FROM users WHERE email = 'test@test.com';

*(If the EXPLAIN output says "Type: ALL" or "Seq Scan", it means it is doing a Full Table Scan! You need an index. If it says "Index Scan", your database is optimized!)*

8. Primary Keys and Foreign Keys are Auto-Indexed

You do not need to manually run CREATE INDEX on your id column. When you define a PRIMARY KEY or a UNIQUE constraint (Chapter 14), the database engine automatically generates a blazing-fast index for it! Modern databases often auto-index FOREIGN KEYS as well to speed up JOINs.

9. Common Mistakes

  • Indexing Low Cardinality Columns: "Cardinality" means uniqueness. An email column has High Cardinality (every row is different). A gender column has Low Cardinality (mostly just M/F). Creating an index on a Low Cardinality column is functionally useless, because the index still has to return 50% of the entire table.

10. Best Practices

  • Left-to-Right Composite Rule: When using a Composite Index (e.g., (state, city)), the order matters! The index can optimize WHERE state = 'NY', or WHERE state = 'NY' AND city = 'Albany'. But it CANNOT optimize WHERE city = 'Albany' alone. It only works left-to-right!

11. Exercises

  1. 1. What does the database engine perform when it searches a table that lacks an Index?
  1. 2. What DDL command is used to physically create an index on a specific column?

12. SQL Challenges

You are the DBA for a massive E-Commerce platform. The query SELECT * FROM orders WHERE userid = 99 AND orderstatus = 'Shipped'; is crashing the server because the orders table has 50 million rows. Write the SQL command to create a Composite Index named idxuserstatus to fix this query instantly.
sql
12
CREATE INDEX idx_user_status
ON orders (user_id, order_status);

13. MCQ Quiz with Answers

Question 1

Why is it considered a severe architectural anti-pattern to create an Index on every single column in a massive database table?

Question 2

When prefixing a query with the EXPLAIN keyword (e.g., EXPLAIN SELECT * FROM users), what output does the database engine provide?

14. Interview Questions

  • Q: Explain the mechanical structure of a B-Tree Index. Contrast the performance of a query executing an "Index Scan" versus a query executing a "Sequential Full Table Scan".
  • Q: A developer creates a composite index on (lastname, firstname). They then run the query SELECT * FROM users WHERE first_name = 'John';. Explain why the database engine is unable to utilize the index for this query.

15. FAQs

Q: Do I need to update the Index when I insert new data? A: No! This is the magic of Relational Databases. The database engine automatically recalculates and reorganizes the B-Tree Index invisibly in the background every single time a row is inserted, updated, or deleted.

16. Summary

You are now an Optimization Architect. By diagnosing execution plans with EXPLAIN and strategically implementing B-Tree and Composite Indexes, you can transform an application that buckles under the weight of 10 million rows into a hyper-efficient system that processes data in milliseconds.

17. Next Chapter Recommendation

We have mastered Creating (INSERT) and Reading (SELECT). But what if a user moves to a new city and needs to update their address? Or what if the HR department gives everyone a 10% raise? In Chapter 21: UPDATE Queries and Modifying Data, we will return to DML and learn how to safely modify existing rows.

Finish this Chapter

Save your progress on your learning path and prepare for coding interview challenges.

Discussion

Join the discussion

Log in or create a free account to participate.

Sort: ·