Skip to main content
Database Design
CHAPTER 07 Beginner

Primary and Foreign Keys | Database Architecture Constraints

Updated: May 16, 2026
20 min read

# CHAPTER 7

Primary Keys and Foreign Keys

1. Introduction

In Chapter 6, we drew lines on an ER Diagram connecting a User box to an Order box. But the database engine cannot see your diagram. It is a mathematical machine. To physically forge those connections in a way the database engine understands, we must use cryptographic-like anchors. These anchors are the absolute defining features of a Relational Database: the Primary Key and the Foreign Key. In this chapter, we will learn how to lock our tables together securely.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Define the mathematical strictness of a Primary Key.
  • Differentiate between Natural Keys and Surrogate Keys.
  • Architect Composite Primary Keys.
  • Define a Foreign Key constraint.
  • Understand and enforce Referential Integrity.

3. The Primary Key (The Anchor)

If a table has 50 users named "John Smith", how does the database know which one to update? Every single table in a relational database MUST have a Primary Key (PK). It is a column that acts as a mathematically guaranteed, globally unique fingerprint for that specific row.

The Absolute Rules of a Primary Key:

  1. 1. It must be UNIQUE. No two rows can ever share the same key.
  1. 2. It must be NOT NULL. A row cannot exist without a key.
  1. 3. It must never change.

4. Natural Keys vs. Surrogate Keys

Architects fiercely debate which type of Primary Key to use:
  • Natural Key: A unique attribute that already exists in the real world (e.g., a Social Security Number, an Email Address, or a Car VIN number).
  • Surrogate Key: A meaningless, computer-generated integer (e.g., id = 1, 2, 3... generated via AUTOINCREMENT).

The Industry Standard: Always use Surrogate Keys (AUTOINCREMENT). Why? Because real-world data changes. If you use an Email as a Primary Key, and the user changes their email, you have to update the Primary Key, which shatters all relationships across the database. Meaningless integers never change.

5. Composite Primary Keys

Sometimes, a single column isn't enough to uniquely identify a row. In a Many-to-Many Pivot Table (like studentclasses), an id column is useless. The true unique fingerprint is the *combination* of the studentid and the classid. A student can only enroll in a specific class once. When you combine two columns to act as a single Primary Key, it is called a Composite Primary Key.

6. The Foreign Key (The Bridge)

A Foreign Key (FK) is simply a column in Table B that holds the exact Primary Key of a row in Table A. It is the physical bridge that satisfies the line drawn on your ER Diagram.

If User #5 places an order, the orders table has a column named userid. We place the number 5 in that column. That column is the Foreign Key.

sql
1234567
CREATE TABLE orders (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    total_price DECIMAL(10,2),
    user_id INT,
    -- Architecting the physical bridge!
    FOREIGN KEY (user_id) REFERENCES users(id)
);

7. Referential Integrity (The Iron Wall)

Why do we explicitly write FOREIGN KEY ... REFERENCES? Why not just make an integer column and leave it alone? Because declaring a Foreign Key activates Referential Integrity.

Referential Integrity is a strict security system inside the database engine. It guarantees that a Foreign Key value MUST point to a physically existing Primary Key.

  • If a hacker tries to insert an order for user_id = 99, but User 99 does not exist in the users table, the database violently rejects the INSERT and throws a fatal error.
  • It prevents "Orphaned Records"—data that belongs to nothing, which corrupts accounting systems.

8. Mini Project: Cascading Actions

What happens if you delete User #5, but they have 10 orders in the orders table? Because of Referential Integrity, the database will block the deletion of the User to protect the orders! To automate this, architects use Cascading Constraints.
sql
1
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE

*(Now, if User 5 is deleted, the database automatically hunts down and destroys all 10 of their orders instantly, keeping the database perfectly clean!).*

9. Common Mistakes

  • Data Type Mismatches: The single most common error when architecting relationships. If users.id is an INT, but your orders.userid column is defined as a VARCHAR or a BIGINT, the database engine will refuse to create the Foreign Key. The data types must be mathematically identical.

10. Best Practices

  • Index your Foreign Keys: Modern relational databases often index Primary Keys automatically, but they do NOT always index Foreign Keys automatically. Because you will constantly run JOIN queries using Foreign Keys, you should explicitly create Indexes (Chapter 14) on all FK columns to ensure massive performance.

11. Exercises

  1. 1. What are the two strict mathematical rules that define a valid Primary Key?
  1. 2. What database mechanism physically blocks you from inserting a child record that points to a non-existent parent record?

12. Database Design Challenges

You are designing a database for a library. A Book is identified by a unique ISBN string. However, you decide to create a separate book
id INT AUTO_INCREMENT column to serve as the Primary Key. Explain why you made this architectural choice using the terminology of Surrogate vs Natural keys. *(Answer: The ISBN is a Natural Key. While unique, strings are slow to search, and industry standards prefer Surrogate Keys (auto-incrementing integers) for Primary Keys because they are meaningless, never change, and provide blazing-fast JOIN performance).*

13. MCQ Quiz with Answers

Question 1

In the context of database architecture, what is the defining characteristic of a "Composite Primary Key"?

Question 2

When a Database Architect explicitly defines a FOREIGN KEY constraint between an Orders table and a Customers table, what critical security mechanism are they enforcing?

14. Interview Questions

  • Q: Explain the philosophical and architectural differences between Natural Keys and Surrogate Keys. Why has the industry overwhelmingly standardized on Surrogate Keys for Relational Database schema design?
  • Q: A junior developer attempts to execute a DELETE statement to remove a category from the database, but the DBMS throws a "Foreign Key Constraint Fails" error. Explain the concept of Referential Integrity causing this block, and how implementing ON DELETE CASCADE alters this behavior.

15. FAQs

Q: Can a table exist without a Primary Key? A: Technically, some database engines allow it. Architecturally, it is a catastrophic sin. A table without a primary key is just an unorganized pile of data. It cannot be joined securely, updated securely, or replicated. ALWAYS use a Primary Key.

16. Summary

You have locked the architecture in place. By mastering the mathematical anchors of Surrogate Primary Keys, wielding Composite Keys for complex pivots, and enforcing iron-clad Referential Integrity with Foreign Keys, you guarantee that your database is structurally sound and immune to relational corruption.

17. Next Chapter Recommendation

We know *how* to physically connect tables. Now we must master *where* to place the connections. Where does the Foreign Key actually go? In Chapter 8: One-to-One, One-to-Many, and Many-to-Many Relationships, we will translate ER Diagram cardinalities into precise schema designs.

Finish this Chapter

Save your progress on your learning path and prepare for coding interview challenges.

Discussion

Join the discussion

Log in or create a free account to participate.

Sort: ·