API Security Tutorial
CHAPTER 13 Intermediate

API Rate Limiting and Throttling

Updated: May 13, 2026
15 min read

1. Introduction

If you build a robust, secure API, you want people to use it. But what happens if a malicious bot—or just a poorly written script—attempts to use it 100,000 times a minute? Without protections, your server's CPU will max out, your database will crash, and legitimate users will experience a Denial of Service (DoS). In this chapter, we will learn how to implement Rate Limiting and Throttling to prevent API abuse, stop brute-force attacks, and ensure fair usage of your server's resources.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Explain the purpose of Rate Limiting in an API ecosystem.
  • Differentiate between a Distributed Denial of Service (DDoS) attack and application-layer exhaustion.
  • Understand common rate limiting algorithms (Token Bucket, Fixed Window).
  • Implement a basic IP-based rate limiter in PHP using Redis or a Database.
  • Utilize standard HTTP headers (X-RateLimit-Limit) to communicate limits to clients.

3. Beginner-Friendly Explanation

Imagine an all-you-can-eat buffet. The restaurant wants to feed everyone, but they only have one chef. If one customer runs to the kitchen window and demands 500 steaks immediately, the chef is overwhelmed, the kitchen catches fire, and none of the other customers get to eat.

Rate Limiting is the restaurant manager standing at the window saying: "Every customer is allowed to order a maximum of 3 items per minute. If you try to order a 4th item, you must go sit down and wait for 60 seconds before you can order again." This ensures the chef can cook at a steady pace and every customer gets fair service.

4. Real-World Attack Scenarios

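  • Brute Force / Credential Stuffing: An attacker finds your /api/login endpoint and uses a script to try 10,000 different passwords per minute against the admin email address. With no rate limiting, a weak or common password will fall quickly. With a limit of 5 attempts per minute, the attacker gets at most 7,200 guesses per day, which makes working through a large password list impractical. (Credential stuffing, where leaked username/password pairs are replayed across many accounts, is the related case in which each account sees only a few attempts, so a per-IP limit matters as well.)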
  • Resource Exhaustion (Application DoS): An API has a heavy endpoint: GET /api/generatepdfreport. Generating a PDF takes 2 seconds of heavy CPU time. A competitor writes a script to hit this endpoint 50 times a second. The server's CPU hits 100% and crashes the entire application.
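
The numbers in both scenarios can be checked with a little arithmetic. The sketch below is only an illustration: the function names and the wordlist size of 1,000,000 are assumptions, while the request rates and the 2-second PDF cost come from the scenarios above.

```php
<?php
// Minutes needed to try every candidate password at a given request rate.
function minutes_to_exhaust(int $candidates, int $requestsPerMinute): float
{
    return $candidates / $requestsPerMinute;
}

// CPU-seconds of work demanded per wall-clock second by a heavy endpoint.
function cpu_seconds_per_second(float $requestsPerSecond, float $cpuSecondsPerRequest): float
{
    return $requestsPerSecond * $cpuSecondsPerRequest;
}

$wordlist = 1000000; // hypothetical wordlist size (assumption)

printf("Unlimited (10,000/min): %.0f minutes\n", minutes_to_exhaust($wordlist, 10000));
printf("Limited (5/min): %.1f days\n", minutes_to_exhaust($wordlist, 5) / 60 / 24);
printf("PDF endpoint demand: %.0f CPU-seconds per second\n", cpu_seconds_per_second(50, 2));
```

Under these assumptions the unlimited attack finishes in 100 minutes, the limited one needs roughly 139 days, and the PDF endpoint asks for 100 CPU-seconds of work every second, far more than a typical server can supply.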

5. Standard Rate Limit Headers

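When you rate limit a user, it is polite (and common practice) to tell them their limits via HTTP response headers so their application can slow down automatically. The X-RateLimit-* names below are a widely used convention rather than a formal standard, so exact semantics vary between APIs.
  • X-RateLimit-Limit: The total number of requests allowed in a time window (e.g., 100).
  • X-RateLimit-Remaining: How many requests they have left in the current window (e.g., 99).
  • X-RateLimit-Reset: When their limit will reset and they can make requests again. Many APIs send a Unix timestamp; some send the number of seconds remaining, so document which one you use.

If the user exceeds the limit, the API should return a 429 Too Many Requests status code, ideally with a Retry-After header telling the client how many seconds to wait.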
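
As a minimal sketch of how these headers fit together, the helper below builds them for one request. The function name and its parameters are hypothetical, and it assumes X-RateLimit-Reset carries a Unix timestamp as described above.

```php
<?php
// Builds the rate-limit response headers for one request.
// $limit:   requests allowed per window
// $used:    requests counted so far in this window, including the current one
// $resetAt: Unix timestamp at which the window ends
// $now:     current Unix timestamp
function build_rate_limit_headers(int $limit, int $used, int $resetAt, int $now): array
{
    $headers = [
        'X-RateLimit-Limit'     => (string) $limit,
        'X-RateLimit-Remaining' => (string) max(0, $limit - $used),
        'X-RateLimit-Reset'     => (string) $resetAt,
    ];
    if ($used > $limit) {
        // Over the limit: tell the client how many seconds to wait.
        $headers['Retry-After'] = (string) max(1, $resetAt - $now);
    }
    return $headers;
}

// Usage inside a web request (send headers before any body output):
// foreach (build_rate_limit_headers(100, 1, time() + 60, time()) as $name => $value) {
//     header("$name: $value");
// }
```

Returning an array instead of calling header() directly keeps the logic easy to test; the caller decides when to send the headers and whether to respond with 429.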

6. Vulnerable vs Secure Code Examples

Vulnerable PHP (No Limits):

<?php
// VULNERABLE: The server processes everything immediately
generate_heavy_pdf_report($_GET['user_id']);
echo "Report generated!";
?>

Secure PHP (Basic Fixed Window Rate Limiting using a DB/Cache): *Note: In production, Redis or Memcached is used for this because they are incredibly fast. We use a conceptual structure here.*

<?php
$user_ip = $_SERVER['REMOTE_ADDR'];
$max_requests = 60; // 60 requests per minute
$window_time = 60;  // 1 minute in seconds

header('Content-Type: application/json');
header('X-RateLimit-Limit: ' . $max_requests);

// Conceptual function: checks a cache/database to see how many times this IP
// has requested in the current window
$current_requests = get_request_count_from_cache($user_ip, $window_time);

if ($current_requests >= $max_requests) {
    // Limit exceeded
    http_response_code(429); // Too Many Requests
    header('X-RateLimit-Remaining: 0');
    // Conservative hint: a fixed window resets within $window_time seconds at most
    header('Retry-After: ' . $window_time);
    echo json_encode(["error" => "Rate limit exceeded. Please try again later."]);
    exit;
}

// Conceptual function: increments the counter for this IP in the cache.
// Note: a separate read and write is not atomic. A real store should use a
// single atomic increment (e.g. Redis INCR) so that concurrent requests
// cannot slip past the limit.
increment_request_count_in_cache($user_ip, $window_time);

// Proceed with API logic
header('X-RateLimit-Remaining: ' . ($max_requests - $current_requests - 1));
echo json_encode(["data" => "Success!"]);
?>
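
The two cache helpers in the example are conceptual. One possible shape for them is sketched below as a small class that keeps counters in a PHP array. The class name, method names, and the injected $now parameter are all assumptions made for illustration; the array only lives for one PHP process, so a real deployment would keep the same counters in a shared store such as Redis or Memcached.

```php
<?php
// In-memory stand-in for the cache behind a fixed-window rate limiter.
// Assumption: one process, one array. Production code would use a shared
// store so that every PHP worker sees the same counts.
final class FixedWindowCounter
{
    /** @var array<string, array{start: int, count: int}> one entry per client key */
    private array $windows = [];

    // Start (Unix timestamp) of the fixed window that contains $now.
    private function windowStart(int $windowSeconds, int $now): int
    {
        return $now - ($now % $windowSeconds);
    }

    // How many requests $key has made in the current window.
    public function count(string $key, int $windowSeconds, int $now): int
    {
        $entry = $this->windows[$key] ?? null;
        if ($entry === null || $entry['start'] !== $this->windowStart($windowSeconds, $now)) {
            return 0; // no entry yet, or the stored entry belongs to an older window
        }
        return $entry['count'];
    }

    // Records one request and returns the new count for the current window.
    public function increment(string $key, int $windowSeconds, int $now): int
    {
        $count = $this->count($key, $windowSeconds, $now) + 1;
        $this->windows[$key] = [
            'start' => $this->windowStart($windowSeconds, $now),
            'count' => $count,
        ];
        return $count;
    }

    // Seconds until the current window ends (usable for Retry-After).
    public function secondsUntilReset(int $windowSeconds, int $now): int
    {
        return $this->windowStart($windowSeconds, $now) + $windowSeconds - $now;
    }
}

$counter = new FixedWindowCounter();
$now = 120; // fixed timestamp for a repeatable demo; use time() in real code
for ($i = 1; $i <= 3; $i++) {
    $counter->increment('203.0.113.7', 60, $now);
}
echo $counter->count('203.0.113.7', 60, $now), "\n";      // 3
echo $counter->count('203.0.113.7', 60, $now + 60), "\n"; // 0 (a new window has started)
```

Passing the timestamp in, rather than calling time() inside the class, makes the window boundaries easy to test. The known weakness of a fixed window also shows up here: a client can send a full quota just before a boundary and another full quota just after it.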

7. Rate Limiting Strategies

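  • IP-Based Limiting: Limiting by the user's IP address. *Pros:* Easy to implement, good for anonymous APIs. *Cons:* Many users can sit behind one public IP address. If 100 students are on a university Wi-Fi network that uses NAT, one student can exhaust the limit for the entire campus.
  • User/Token-Based Limiting: Limiting based on the identity behind the logged-in user's JWT or API Key (the user ID or key ID, not the raw token string, which changes on every refresh). *Pros:* Far more accurate than IP addresses, and allows you to offer "Premium" tiers (e.g., Free users get 10 req/min, Paid users get 1000 req/min). *Cons:* Only works once the caller is authenticated, so anonymous endpoints still need an IP-based fallback.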
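
The two strategies are often combined: authenticated callers are counted per user, and everyone else per IP. A minimal sketch follows, assuming a hypothetical resolve_rate_limit() helper that receives an already verified user ID and plan name (JWT or API key validation is not shown). The free and paid limits are the example values above; the anonymous limit is an assumption.

```php
<?php
// Chooses the counter key and the per-minute limit for one request.
// $userId and $plan are assumed to come from a verified JWT or API key lookup.
function resolve_rate_limit(?string $userId, ?string $plan, string $ip): array
{
    $planLimits = ['free' => 10, 'paid' => 1000]; // example tiers from the text
    $anonymousLimit = 5;                          // assumption for illustration

    if ($userId === null) {
        // Not authenticated: fall back to limiting by IP address.
        return ['key' => 'ip:' . $ip, 'limit' => $anonymousLimit];
    }

    // Unknown or missing plans get the most restrictive authenticated limit.
    $limit = $planLimits[$plan ?? 'free'] ?? $planLimits['free'];
    return ['key' => 'user:' . $userId, 'limit' => $limit];
}

print_r(resolve_rate_limit(null, null, '198.51.100.4'));
print_r(resolve_rate_limit('42', 'paid', '198.51.100.4'));
```

Prefixing the key with "ip:" or "user:" keeps the two kinds of counters from colliding in a shared cache.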

8. Throttling vs Rate Limiting

  • Rate Limiting: A hard stop. "You hit 100 requests. You get a 429 error. Stop."
  • Throttling: A soft slowdown. "You hit 100 requests. We will still process your 101st request, but we will intentionally delay it by 2 seconds so you don't overwhelm us."
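
The difference comes down to what happens to the first request over the limit. The sketch below makes that decision explicit; decide_action() and its mode names are hypothetical, and the 2-second delay is the example value from the bullet above.

```php
<?php
// Decides what to do with a request once its position in the window is known.
// $count: requests made in this window, including the current one
// $mode:  'limit' rejects over-limit requests, 'throttle' delays them instead
function decide_action(int $count, int $limit, string $mode, int $delaySeconds = 2): array
{
    if ($count <= $limit) {
        return ['status' => 200, 'delay' => 0]; // within quota: serve immediately
    }
    if ($mode === 'throttle') {
        return ['status' => 200, 'delay' => $delaySeconds]; // serve, but slowly
    }
    return ['status' => 429, 'delay' => 0]; // hard stop
}

// Usage inside a request handler:
// $action = decide_action($count, 100, 'throttle');
// if ($action['status'] === 429) { http_response_code(429); exit; }
// sleep($action['delay']);
```

One design caveat: sleep() keeps a PHP worker occupied for the whole delay, so throttling inside PHP can itself tie up workers under load. That is one reason delays are often applied at a proxy or gateway instead.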

9. Best Practices

  • Use Redis: Do not use MySQL to track rate limits for a high-traffic API. Writing a row to a disk-backed database 1,000 times a second adds latency to every request and can turn the database itself into the bottleneck. Use an in-memory datastore like Redis.
  • Tiered Limits: Apply different limits to different endpoints. /api/status can be 1000/minute. /api/login should be 5/minute. /api/exportheavycsv should be 1/minute.
  • Implement at the Gateway Layer: While you can write rate limits in PHP, it is much more efficient to let a Web Application Firewall (WAF), Cloudflare, Nginx, or an API Gateway (like AWS API Gateway) handle the blocking *before* the request even touches your PHP code.
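
Tiered limits can be expressed as a simple lookup table. The sketch below uses the example paths and values from the list above; the function names and the default of 60 for unlisted endpoints are assumptions.

```php
<?php
// Requests per minute allowed for a given endpoint path.
function limit_for_endpoint(string $path): int
{
    $limits = [
        '/api/status'         => 1000,
        '/api/login'          => 5,
        '/api/exportheavycsv' => 1,
    ];
    $defaultLimit = 60; // assumption: fallback for endpoints not listed above

    // Ignore any query string so "/api/login?next=1" matches "/api/login".
    $cleanPath = explode('?', $path, 2)[0];

    return $limits[$cleanPath] ?? $defaultLimit;
}

// Counter key that keeps each endpoint's quota separate for the same client.
function counter_key(string $clientKey, string $path): string
{
    return $clientKey . '|' . explode('?', $path, 2)[0];
}

echo limit_for_endpoint('/api/login?next=1'), "\n"; // 5
```

Including the path in the counter key matters: without it, cheap calls to /api/status would eat into the same quota as /api/login.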

10. Common Mistakes

  • Leaking Rate Limit Mechanics: Returning error messages that help attackers tune their bots. If you say "Limit of 5 requests per 10 seconds reached", the attacker will just program their bot to send exactly 4 requests every 10 seconds to stay under the radar. This matters most on security-sensitive endpoints such as /api/login, where a generic 429 message is the safer choice.
  • Trusting the X-Forwarded-For header blindly: If you limit by IP, attackers can easily fake the X-Forwarded-For header to make it look like requests are coming from different IPs, bypassing the limit. Only trust this header when the request reached you through your own Load Balancer, and read the entry the Load Balancer added (the rightmost one), not the leftmost one, which the client controls.
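
A minimal sketch of reading X-Forwarded-For safely is shown below. The client_ip() helper and the proxy address 10.0.0.1 are hypothetical; the idea is to ignore the header unless the connection came from a proxy you operate, then walk the list from the right until the first address that is not one of your proxies.

```php
<?php
// Returns the IP address that rate limiting should be keyed on.
// $remoteAddr:     address of the direct connection (e.g. $_SERVER['REMOTE_ADDR'])
// $forwardedFor:   raw X-Forwarded-For header value, or null if absent
// $trustedProxies: IP addresses of your own load balancers
function client_ip(string $remoteAddr, ?string $forwardedFor, array $trustedProxies): string
{
    // No header, or the request did not come through one of our proxies:
    // the connection address is the only value the client cannot forge.
    if ($forwardedFor === null || !in_array($remoteAddr, $trustedProxies, true)) {
        return $remoteAddr;
    }

    // Entries on the right were appended by the proxies closest to us.
    // Anything to the left of the first untrusted hop is client-supplied.
    $hops = array_reverse(array_map('trim', explode(',', $forwardedFor)));
    foreach ($hops as $hop) {
        if (filter_var($hop, FILTER_VALIDATE_IP) === false) {
            break; // malformed entry: stop trusting the header
        }
        if (!in_array($hop, $trustedProxies, true)) {
            return $hop;
        }
    }
    return $remoteAddr;
}

// Usage inside a web request:
// $ip = client_ip($_SERVER['REMOTE_ADDR'], $_SERVER['HTTP_X_FORWARDED_FOR'] ?? null, ['10.0.0.1']);
```

With this approach a spoofed value such as "6.6.6.6" placed at the front of the header is never used, because the address appended by the trusted proxy sits to its right.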

11. Mini Exercises

  1. What HTTP Status Code signifies "Too Many Requests"?
  2. Why is it dangerous to rely entirely on IP addresses for rate limiting logged-in users?

12. Practice Challenges

Challenge: Design a Rate Limiting policy for a new E-Commerce API. Define the limits (requests per minute) for the following endpoints, and justify your reasoning:
  1. GET /products (Browsing the catalog)
  2. POST /login (User authentication)
  3. POST /checkout (Processing a credit card)

13. MCQs with Answers

Question 1

What is the primary purpose of API Rate Limiting?

Question 2

Which HTTP Status code is universally used to inform a client that they have hit a rate limit?

Question 3

If you want to offer different API quotas to Free users vs. Enterprise users, which rate limiting strategy MUST you use?

14. Interview Questions

  • Q: Explain the difference between Rate Limiting and Throttling.
  • Q: You are building an API endpoint that generates heavy PDF reports. An attacker is trying to perform an Application-Layer DoS by hitting it repeatedly. Walk me through your strategy to protect the server.
  • Q: Why is Redis generally preferred over MySQL for tracking rate limit counters in high-traffic APIs?

15. FAQs

Q: How do massive DDoS attacks bypass rate limiting?
A: A volumetric DDoS attack involves a botnet of thousands or even millions of compromised computers sending traffic simultaneously. Even if you limit each IP to 1 request per minute, 1 million requests arriving in the same second can overwhelm your Apache/Nginx web server before the PHP rate limit code even executes. Stopping this requires infrastructure-level protection (like Cloudflare).

16. Summary

In this chapter, we learned that allowing infinite access to an API is a recipe for disaster. We discussed how Rate Limiting and Throttling act as vital defense mechanisms against brute-force attacks and resource exhaustion. We explored different strategies, from simple IP-based limits to granular Token-based quotas, and reviewed the standard HTTP headers and status codes (429) used to communicate these limits to clients efficiently.

17. Next Chapter Recommendation

We have secured our text and JSON data, but what happens when a user wants to upload a profile picture? File uploads are one of the most dangerous features in web development. Proceed to Chapter 14: Secure File Upload APIs to learn how to handle them safely.
