CHAPTER 05
Intermediate
Caching Strategies
Updated: May 16, 2026
1. Introduction
In many web applications, the database is the slowest step in serving a request. If a user requests their profile data, the web server must query the database, and the database must locate the matching rows (often reading them from disk), format the data, and send it back. Under load, this can take tens to hundreds of milliseconds. If 100,000 users request the exact same viral news article, forcing the database to execute the exact same slow query 100,000 times is architectural suicide. Caching is the standard fix. A cache is a fast, temporary storage layer sitting between the server and the database. In this chapter, we will work through caching strategies. We will explore in-memory caches like Redis, understand why cache invalidation is so difficult, and use CDNs to cache assets globally.
2. Learning Objectives
By the end of this chapter, you will be able to:
- Define "Caching" and its role in reducing database load and latency.
- Differentiate between Database, Application (Redis), and Edge (CDN) caching.
- Understand standard caching patterns (Cache-Aside, Write-Through).
- Architect a distributed Redis cluster for a high-traffic application.
- Diagnose and solve the complex problem of Cache Invalidation.
3. What is Caching?
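A cache leverages the speed of Random Access Memory (RAM).
- The Concept: Reading data from RAM is orders of magnitude faster than reading it from disk. A RAM access takes on the order of nanoseconds, an SSD read typically takes tens to hundreds of microseconds, and a seek on a spinning hard drive takes milliseconds.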
- The Workflow:
- 1. The Web Server needs data (e.g., "Top 10 Trending Movies").
- 2. The server checks the Cache (RAM) first.
- 3. Cache Hit: If the data is there, it is returned instantly.
- 4. Cache Miss: If the data is missing, the server queries the slow Database. The server then takes that data, saves a copy into the Cache, and returns it to the user. The next user will get a Cache Hit.
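The hit/miss workflow above can be sketched in a few lines of Python. This is a minimal illustration, not a production client: `DictCache` is a hypothetical in-process stand-in for a real cache server, and `slow_db_query` is a made-up placeholder for the database call.

```python
from typing import Callable, Dict, List, Optional, Tuple


class DictCache:
    """Hypothetical in-process stand-in for a cache server such as Redis."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def get_with_cache(
    key: str, cache: DictCache, load_from_db: Callable[[str], str]
) -> Tuple[str, str]:
    """Check the cache first; on a miss, query the database and save a copy."""
    cached = cache.get(key)
    if cached is not None:
        return cached, "hit"
    value = load_from_db(key)   # slow path: the database does the work
    cache.set(key, value)       # the next request for this key will be a hit
    return value, "miss"


db_calls: List[str] = []


def slow_db_query(key: str) -> str:
    """Placeholder for the real database query; records each call."""
    db_calls.append(key)
    return f"result for {key}"


cache = DictCache()
first = get_with_cache("top10_trending_movies", cache, slow_db_query)
second = get_with_cache("top10_trending_movies", cache, slow_db_query)
```

Here the first call is a miss that reaches the database, and the second is served from the cache, so the database is queried only once.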
4. Distributed Caches (Redis and Memcached)
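Storing cache data only inside each Web Server's local RAM does not scale well. If you have 50 Web Servers, each one builds its own separate cache, so the same data is fetched and stored 50 times and different servers can serve different versions of it.
- The Distributed Cache: You deploy a dedicated cluster of servers running an In-Memory data store like Redis or Memcached.
- All 50 Web Servers connect over the network to the central Redis cluster. Redis is a Key-Value store: every entry is looked up by its key (e.g., `userid5 : {"name": "John"}`), and the value can be a plain string or a richer structure such as a hash or a list.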
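To see why per-server caches drift apart, here is a small simulation. Everything in it is illustrative: `WebServer` is a hypothetical class, plain dictionaries stand in for both the database and the cache, and the key `userid5` is borrowed from the example above.

```python
from typing import Dict


class WebServer:
    """Toy web server that reads through whatever cache it is given."""

    def __init__(self, cache: Dict[str, str], database: Dict[str, str]) -> None:
        self.cache = cache
        self.database = database

    def get_profile(self, key: str) -> str:
        if key not in self.cache:
            self.cache[key] = self.database[key]
        return self.cache[key]

    def update_profile(self, key: str, value: str) -> None:
        self.database[key] = value
        self.cache[key] = value  # only refreshes the cache this server can see


# Case 1: every server has its own local cache.
db_local = {"userid5": '{"name": "John"}'}
server_a = WebServer(cache={}, database=db_local)
server_b = WebServer(cache={}, database=db_local)
server_b.get_profile("userid5")                        # B caches the old value
server_a.update_profile("userid5", '{"name": "Johnny"}')
local_a = server_a.get_profile("userid5")
local_b = server_b.get_profile("userid5")              # still the old value

# Case 2: both servers share one cache (standing in for a Redis cluster).
db_shared = {"userid5": '{"name": "John"}'}
shared_cache: Dict[str, str] = {}
server_c = WebServer(cache=shared_cache, database=db_shared)
server_d = WebServer(cache=shared_cache, database=db_shared)
server_d.get_profile("userid5")
server_c.update_profile("userid5", '{"name": "Johnny"}')
shared_c = server_c.get_profile("userid5")
shared_d = server_d.get_profile("userid5")             # sees the update
```

With local caches, the two servers disagree after the update. With the shared cache, both return the new value.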
5. Content Delivery Networks (Edge Caching)
Caching isn't just for database queries; it also matters for large static assets (Images, Videos, CSS).
- The CDN: A Content Delivery Network (like Cloudflare or AWS CloudFront) is a global network of caching servers placed physically close to users around the world (at the "Edge").
- The Benefit: If your main server is in New York, a user in Tokyo downloading a 5MB image will experience high latency. A CDN caches a copy of that image on a server in or near Tokyo, so the image loads much faster for Japanese users and most of that traffic never reaches your New York server.
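The effect on origin traffic can be sketched with a toy model. `Origin` and `EdgeNode` are hypothetical classes invented for this illustration; a real CDN also handles expiry, cache headers, and routing, which are left out here.

```python
from typing import Dict, Tuple


class Origin:
    """Toy origin server (the 'New York' server) that counts its requests."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self.files = files
        self.requests = 0

    def fetch(self, path: str) -> bytes:
        self.requests += 1
        return self.files[path]


class EdgeNode:
    """Toy edge server that keeps a copy of each asset after the first fetch."""

    def __init__(self, city: str, origin: Origin) -> None:
        self.city = city
        self.origin = origin
        self._cache: Dict[str, bytes] = {}

    def get(self, path: str) -> Tuple[bytes, str]:
        if path in self._cache:
            return self._cache[path], "edge hit"
        body = self.origin.fetch(path)   # long trip back to the origin
        self._cache[path] = body
        return body, "edge miss"


origin = Origin({"/img/hero.jpg": b"image-bytes"})
tokyo = EdgeNode("Tokyo", origin)

# 1,000 users in Japan request the same image through the Tokyo edge.
statuses = [tokyo.get("/img/hero.jpg")[1] for _ in range(1000)]
```

Only the first request travels to the origin; the other 999 are answered at the edge.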
6. Cache Eviction and Invalidation (The Hard Part)
Phil Karlton famously said: *"There are only two hard things in Computer Science: cache invalidation and naming things."*
- RAM costs far more per gigabyte than disk. You cannot cache everything.
- Cache Eviction Policies: When Redis reaches its memory limit, it must delete existing data to make room for new data. A widely used policy is LRU (Least Recently Used), which deletes the data that hasn't been requested in the longest time. In Redis, the policy is chosen with the `maxmemory-policy` setting (for example, `allkeys-lru`).
- Cache Invalidation: If you cache a user's profile, and they update their profile picture, the Database is now updated, but the Cache is holding the old, "stale" picture. You must write logic in your application to delete or overwrite the specific cache key whenever the underlying database is updated.
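Both ideas can be sketched together. The `LRUCache` class below is a small illustrative implementation built on Python's `OrderedDict`, not how Redis implements eviction internally, and `update_avatar` with its key name is a hypothetical example of invalidate-on-write.

```python
from collections import OrderedDict
from typing import Dict, List, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

    def delete(self, key: str) -> None:
        self._items.pop(key, None)

    def keys(self) -> List[str]:
        return list(self._items)


# Eviction: capacity 2, "a" is touched, so "b" is the one evicted.
lru = LRUCache(capacity=2)
lru.put("a", "1")
lru.put("b", "2")
lru.get("a")
lru.put("c", "3")
remaining = lru.keys()


def update_avatar(key: str, new_value: str,
                  database: Dict[str, str], cache: LRUCache) -> None:
    """Write to the database, then invalidate the cached copy."""
    database[key] = new_value
    cache.delete(key)


# Invalidation: after the update, the stale entry is gone from the cache.
database = {"user:5:avatar": "old.png"}
profile_cache = LRUCache(capacity=10)
profile_cache.put("user:5:avatar", database["user:5:avatar"])
update_avatar("user:5:avatar", "new.png", database, profile_cache)
after_update = profile_cache.get("user:5:avatar")
```

Deleting the key rather than overwriting it means the next read is a cache miss that reloads the fresh value from the database.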
7. Diagrams/Visual Suggestions
*Architecture Diagram: The Cache-Aside Pattern*
8. Best Practices
- Time-to-Live (TTL): Avoid putting data into a cache with no expiry. Attach a TTL to every key. For example, you cache the "Live Sports Score" with a TTL of 10 seconds. After 10 seconds, Redis automatically expires it. The next user request causes a "Cache Miss," forcing the server to fetch the updated score from the database. This puts an upper bound on how stale your data can get, even if an invalidation is missed.
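The expiry behaviour can be sketched with a small in-process cache. `TTLCache` and `FakeClock` are hypothetical helpers written for this illustration; the fake clock exists only so the example does not have to wait 10 real seconds.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Toy cache in which every entry carries its own expiry time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]   # expired: behave like a cache miss
            return None
        return value


class FakeClock:
    """Manually advanced clock used in place of real time."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
scores = TTLCache(clock=clock)
scores.set("live_sports_score", "2-1", ttl_seconds=10)

clock.now = 9.0
before_expiry = scores.get("live_sports_score")   # still cached

clock.now = 10.0
after_expiry = scores.get("live_sports_score")    # expired, treated as a miss
```

With a real Redis client the expiry is usually passed when the key is written; redis-py's `set`, for instance, accepts an `ex` argument in seconds.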
9. Common Mistakes
- The Thundering Herd Problem (also called a cache stampede): Imagine caching a highly viral celebrity tweet with a TTL of 60 seconds. At second 60, the cache entry expires. At that same moment, 50,000 users request the tweet. Because the entry is gone, all 50,000 requests miss the cache and hit the database simultaneously, which can overwhelm it. *The Fix:* Use techniques like "Mutex Locks" to ensure that when a cache entry expires, only *one* request is allowed to hit the database to fetch and re-cache the data, while the other 49,999 requests wait a few milliseconds and then read the refreshed value from the cache.
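A minimal single-process sketch of the mutex fix follows. It uses a `threading.Lock` and a plain dictionary as the cache; `slow_db_fetch` and the key name are made up for the example. Across many web servers a process-local lock is not enough, and some form of shared lock would be needed, which this sketch does not cover.

```python
import threading
import time
from typing import Dict, List

cache: Dict[str, str] = {}
cache_lock = threading.Lock()
db_calls = 0


def slow_db_fetch(key: str) -> str:
    """Placeholder for the expensive database query."""
    global db_calls
    db_calls += 1       # safe here: only ever called while holding the lock
    time.sleep(0.05)    # simulate a slow query
    return f"tweet body for {key}"


def get_tweet(key: str) -> str:
    value = cache.get(key)
    if value is not None:
        return value                # fast path: cache hit, no lock needed
    with cache_lock:                # only one thread may rebuild the entry
        value = cache.get(key)      # re-check: another thread may have filled it
        if value is None:
            value = slow_db_fetch(key)
            cache[key] = value
    return value


results: List[str] = []


def worker() -> None:
    results.append(get_tweet("viral_tweet"))


# 50 concurrent requests arrive while the cache is empty.
threads = [threading.Thread(target=worker) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The second `cache.get` inside the lock is what keeps the waiting threads from each repeating the query once they acquire the lock.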
10. Mini Project: Architect a News Feed Cache
Let's design the architecture for a viral news site.
- 1. The Assets: We place Cloudflare (CDN) in front of the website to cache all the heavy images and CSS globally.
- 2. The Homepage (Redis): The "Top 10 Headlines" query takes 5 seconds to run on the Database. We write a script that runs every 60 seconds, queries the DB, and overwrites a key in Redis called `homepage_headlines`.
- 3. The Traffic: When 1 million users hit the homepage, the Web Servers fetch the data from Redis in about 2 milliseconds. The SQL database sits at around 1% CPU utilization, because it runs the expensive query only once per minute regardless of traffic.
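Step 2 could look roughly like the following sketch. A dictionary stands in for Redis, `fake_query` is a placeholder for the 5-second SQL query, and the function names are hypothetical; only the key name `homepage_headlines` comes from the design above.

```python
import json
import time
from typing import Callable, Dict, List, Optional

HEADLINES_KEY = "homepage_headlines"


def refresh_headlines(query_top_headlines: Callable[[], List[str]],
                      cache: Dict[str, str]) -> None:
    """Run the slow query once and overwrite the cached homepage data."""
    headlines = query_top_headlines()
    cache[HEADLINES_KEY] = json.dumps(headlines)


def run_refresher(query_top_headlines: Callable[[], List[str]],
                  cache: Dict[str, str],
                  interval_seconds: float = 60,
                  iterations: Optional[int] = None,
                  sleep: Callable[[float], None] = time.sleep) -> None:
    """Refresh on a fixed interval; `iterations=None` means run forever."""
    count = 0
    while iterations is None or count < iterations:
        refresh_headlines(query_top_headlines, cache)
        count += 1
        sleep(interval_seconds)


def render_homepage(cache: Dict[str, str]) -> List[str]:
    """What a web server does per request: read the cache, never the DB."""
    raw = cache.get(HEADLINES_KEY)
    return json.loads(raw) if raw is not None else []


def fake_query() -> List[str]:
    """Placeholder for the slow 'Top 10 Headlines' SQL query."""
    return ["Headline A", "Headline B"]


cache: Dict[str, str] = {}
# One refresh cycle, with sleeping disabled so the example returns at once.
run_refresher(fake_query, cache, iterations=1, sleep=lambda seconds: None)
homepage = render_homepage(cache)
```

Because user requests never trigger the query themselves, database load depends on the refresh interval rather than on the number of visitors.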
11. Practice Exercises
- 1. Define the "Cache-Aside" pattern. Describe the exact step-by-step logic a web server follows when attempting to retrieve a user's profile data.
- 2. Explain the purpose of a Content Delivery Network (CDN). How does it solve the problem of physical geographic latency for massive media files?
12. MCQs with Answers
Question 1
A major news website's SQL database is crashing because 500,000 users are simultaneously loading the homepage and triggering the exact same complex "Latest Articles" SQL query. What is the standard architectural solution to resolve this specific database bottleneck?
Question 2
What does "Cache Invalidation" refer to in system design?
13. Interview Questions
- Q: Walk me through the "Least Recently Used" (LRU) cache eviction policy. Why is it necessary, and how does it ensure the cache operates efficiently when physical RAM is 100% full?
- Q: Explain the concept of a Time-To-Live (TTL) attribute on a cache key. Give a real-world example of data that should have a 5-second TTL versus data that should have a 30-day TTL.
- Q: A user uploads a new 5MB profile picture, but when they refresh the page, they still see their old picture. Diagnose this system: explain how the CDN and Cache Invalidation likely failed in this architecture.