CHAPTER 06
Intermediate
Load Balancing
Updated: May 16, 2026
# CHAPTER 6
Load Balancing
1. Introduction
In Chapter 3, we established that Horizontal Scaling (adding hundreds of identical servers) is the way to scale beyond the limits of a single machine. However, if you have 100 Web Servers, how does a user's browser know which specific server to talk to? If all the users happen to connect to Server 1 while the other 99 servers sit idle, Server 1 will be overwhelmed while the rest of the fleet does nothing. A Load Balancer is the critical architectural component that solves this problem. It acts as the traffic cop in front of your servers. In this chapter, we will explore the routing algorithms that distribute traffic, understand the security benefits of Reverse Proxies, and engineer Failover systems to provide High Availability.
2. Learning Objectives
By the end of this chapter, you will be able to:
- Define the function and necessity of a Load Balancer in distributed systems.
- Explain common load balancing algorithms (Round Robin, Least Connections).
- Differentiate between a Load Balancer and a Reverse Proxy.
- Engineer "Health Checks" and automatic Failover architectures.
- Understand Layer 4 vs. Layer 7 load balancing.
3. What is a Load Balancer?
A Load Balancer is a specialized server (or hardware device) that sits between the internet and your fleet of application servers.
- The Concept: When a user types `www.example.com`, their browser does not connect directly to one of your Web Servers. It connects to the Load Balancer. The Load Balancer looks at the pool of available Web Servers, picks one according to its routing algorithm (for example, the server with the least work in progress), and forwards the user's request to that server.
- The Magic: The user has no idea this is happening. They think they are talking to one giant computer.
4. Routing Algorithms
How does the Load Balancer decide which server gets the next request?
- Round Robin: The simplest method. It just goes down the list sequentially. (Request 1 goes to Server A, Request 2 goes to Server B, Request 3 to Server C, Request 4 back to Server A).
- Least Connections: The load balancer monitors how many active connections each server currently has. It sends the next request to the server with the absolute lowest number of active connections. (Best for heavy, long-running requests).
- IP Hashing: It hashes the user's IP address to pick a server, so the same user is routed to the same server for as long as their IP address and the server pool stay the same. (Crucial if your architecture is incorrectly relying on "stateful" web servers).
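In Nginx, the reverse proxy used in this chapter's mini project, these algorithms map to directives inside an `upstream` block. The sketch below is illustrative only: the group names and the `10.0.0.x:8080` addresses are assumptions, not values from this chapter. Round Robin needs no directive because it is Nginx's default.

```nginx
events {}

http {
    # Least Connections: send each request to the server that
    # currently has the fewest active connections.
    upstream app_least_conn {
        least_conn;
        server 10.0.0.1:8080;   # hypothetical backend addresses
        server 10.0.0.2:8080;
        server 10.0.0.3:8080;
    }

    # IP Hashing: the client's IP address selects the server, so
    # repeat requests from one address reach the same backend.
    upstream app_ip_hash {
        ip_hash;
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
        server 10.0.0.3:8080;
    }

    server {
        listen 80;

        location / {
            # Swap the upstream name to try the other algorithm.
            proxy_pass http://app_least_conn;
        }
    }
}
```

One detail worth knowing: for IPv4 clients, `ip_hash` uses only the first three octets of the address as the hash key, so clients on the same /24 network land on the same server.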
5. Health Checks and Failover
A load balancer is not just a router; it is a paramedic.
- Health Checks (The Pulse): Every 5 seconds, the Load Balancer sends a tiny "ping" request to all 100 Web Servers.
- The Failover: If Server #42 crashes and stops responding to the ping, the Load Balancer marks Server #42 as "Dead" (typically after one or more consecutive failed checks) and stops sending traffic to it. New requests are routed to the surviving 99 servers, so most users never see an error screen; only requests that were in flight on Server #42, or that arrived before the failure was detected, may fail. This is how a Load Balancer delivers High Availability.
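How this looks in practice depends on the product. Open-source Nginx does not send periodic probes like the ones described above; active probing via the `health_check` directive is a feature of the commercial NGINX Plus. Instead, open-source Nginx uses passive checks: it watches real requests and temporarily removes a server that keeps failing. A minimal sketch, assuming hypothetical backend addresses and threshold values chosen for illustration:

```nginx
events {}

http {
    upstream backend {
        # If 3 attempts fail within 30s, treat the server as
        # unavailable and skip it for the next 30s.
        server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.2:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.3:8080 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;

            # Give up quickly on a server that does not answer.
            proxy_connect_timeout 2s;

            # On a connection error or timeout, retry the request
            # on the next server in the group.
            proxy_next_upstream error timeout;
        }
    }
}
```

With passive checks, the first few requests that hit a dead server are the ones that detect the failure; `proxy_next_upstream` is what lets those requests be retried elsewhere rather than returned to the user as errors.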
6. Reverse Proxies (Security and SSL)
A Load Balancer often acts as a Reverse Proxy (e.g., Nginx, HAProxy).
- The Shield: A Reverse Proxy hides the internal IP addresses of your Web Servers. Outside attackers cannot reach the Web Servers directly; they only see the Load Balancer, which reduces the attack surface of your internal network.
- SSL Termination: Encrypting and decrypting HTTPS traffic consumes CPU. Instead of making your 100 Web Servers each spend CPU power decrypting traffic, the Load Balancer handles all the SSL/TLS decryption at the front door ("SSL Termination") and passes plain, unencrypted HTTP traffic to your web servers over the internal network. This assumes the internal network is trusted; if it is not, traffic should be re-encrypted between the Load Balancer and the servers.
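Both ideas can be sketched in a single Nginx configuration. The certificate paths, the upstream name, and the backend addresses below are assumptions for illustration; only `www.example.com` comes from this chapter.

```nginx
events {}

http {
    # Internal servers: never exposed to the internet directly.
    upstream backend {
        server 10.0.0.1:8080;   # hypothetical private addresses
        server 10.0.0.2:8080;
        server 10.0.0.3:8080;
    }

    server {
        # HTTPS is terminated here, at the front door.
        listen 443 ssl;
        server_name www.example.com;

        # Hypothetical certificate and key locations.
        ssl_certificate     /etc/nginx/certs/example.com.crt;
        ssl_certificate_key /etc/nginx/certs/example.com.key;

        location / {
            # Traffic continues to the backends as plain HTTP.
            proxy_pass http://backend;

            # Tell the backend about the original request, since it
            # now only sees the proxy as its client.
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
}
```

The `X-Forwarded-*` headers matter because the backend servers would otherwise see every request as coming from the Load Balancer over HTTP, losing the client's address and the fact that the original connection was encrypted.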
7. Diagrams/Visual Suggestions
*Architecture Diagram: Load Balancer Failover*
8. Best Practices
- Redundant Load Balancers: You added a Load Balancer to prevent any one web server from being a Single Point of Failure (SPOF). But now, the Load Balancer itself is a SPOF! If it crashes, the whole site goes down. *Best Practice:* Deploy Load Balancers in a redundant pair, such as Active-Passive. If the Primary Load Balancer dies, the Secondary one automatically takes over its IP address, typically within a few seconds.
9. Common Mistakes
- Layer 4 vs. Layer 7 Confusion:
  - *Layer 4 (Transport):* Routes traffic based purely on IP addresses and ports. It is incredibly fast but blind to the data.
  - *Layer 7 (Application):* Reads the actual HTTP payload. *The Mistake:* Using Layer 4 when you need intelligent routing. If you want requests for `/video` to go to a specialized Video Server, and `/chat` to go to a Chat Server, you MUST use a Layer 7 Load Balancer because it can read the URL path.
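The `/video` and `/chat` example can be sketched as Layer 7 routing in Nginx, where each `location` block matches on the URL path. The upstream names and addresses are hypothetical.

```nginx
events {}

http {
    upstream video_servers {
        server 10.0.1.1:8080;   # hypothetical video backends
        server 10.0.1.2:8080;
    }

    upstream chat_servers {
        server 10.0.2.1:8080;   # hypothetical chat backends
        server 10.0.2.2:8080;
    }

    upstream web_servers {
        server 10.0.0.1:8080;   # everything else
        server 10.0.0.2:8080;
    }

    server {
        listen 80;

        # Prefix match: any path starting with /video.
        location /video {
            proxy_pass http://video_servers;
        }

        # Prefix match: any path starting with /chat.
        location /chat {
            proxy_pass http://chat_servers;
        }

        # Fallback for all other paths.
        location / {
            proxy_pass http://web_servers;
        }
    }
}
```

A Layer 4 balancer could not make this decision, because the URL path only becomes visible once the HTTP request itself is parsed.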
10. Mini Project: Nginx Reverse Proxy Setup
Let's look at the actual configuration code for an Nginx Load Balancer.
- 1. The Scenario: We have 3 web servers running a backend app.
- 2. The Code (`nginx.conf`):
- 3. The Result: Nginx will automatically intercept all traffic on port 80 and distribute it evenly across the 3 internal servers using a default Round Robin algorithm.
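For step 2, a minimal `nginx.conf` consistent with the scenario and result described here might look like the following sketch. The upstream name `backend_app` and the three `10.0.0.x:8080` addresses are placeholders, not the chapter's original values; substitute the addresses of your own servers.

```nginx
events {}

http {
    # The pool of 3 backend servers (placeholder addresses).
    upstream backend_app {
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
        server 10.0.0.3:8080;
    }

    server {
        # Accept all incoming HTTP traffic on port 80.
        listen 80;

        location / {
            # Forward each request to the pool. With no algorithm
            # directive in the upstream block, Nginx uses Round Robin.
            proxy_pass http://backend_app;
        }
    }
}
```

After editing the file, `nginx -t` checks the configuration for syntax errors, and `nginx -s reload` applies it without stopping the running server.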
11. Practice Exercises
- 1. Define the role of a Load Balancer in a horizontally scaled system. How does it enable "High Availability"?
- 2. Explain the difference between the "Round Robin" and "Least Connections" routing algorithms. In what scenario would Round Robin fail to distribute load evenly?
12. MCQs with Answers
Question 1
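A Load Balancer is constantly sending small "ping" requests to every Web Server in its fleet every 5 seconds. What is the architectural purpose of this action?
Answer: These are Health Checks. They let the Load Balancer detect a failed server and stop routing traffic to it (Failover).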
Question 2
To prevent web servers from wasting massive amounts of CPU power encrypting and decrypting secure HTTPS traffic, system architects often configure the Load Balancer to decrypt the traffic at the perimeter and send raw HTTP to the internal servers. What is this technique called?
Answer: SSL Termination.
13. Interview Questions
- Q: You horizontally scale your architecture to 50 web servers, but you place a single Load Balancer in front of them. What architectural vulnerability have you just created, and how do you resolve it? (Hint: Active-Passive redundancy).
- Q: Explain the difference between a Layer 4 Load Balancer and a Layer 7 Load Balancer. Why might an API Gateway strictly require Layer 7 capabilities?
- Q: Your web architecture relies entirely on "Stateful" sessions stored in the local RAM of individual web servers. If you implement a standard Round Robin load balancer, what catastrophic UX failure will occur when a user clicks to a new page, and how does the "IP Hashing" algorithm patch this bad design?