CHAPTER 06 Intermediate

Load Balancing

Updated: May 16, 2026

25 min read

1. Introduction

In Chapter 3, we established that Horizontal Scaling (adding more identical servers) is the practical way to scale beyond the limits of a single machine. However, if you have 100 Web Servers, how does a user's browser know which specific server to talk to? If all the users happen to connect to Server 1 while the other 99 servers sit idle, Server 1 will crash while the rest of the fleet does nothing. A Load Balancer is the critical architectural component that solves this chaos. It acts as the traffic cop of your architecture. In this chapter, we will master Load Balancing. We will explore the routing algorithms that distribute traffic, understand the security benefits of Reverse Proxies, and engineer Failover systems to achieve High Availability.

2. Learning Objectives

By the end of this chapter, you will be able to:

Define the function and necessity of a Load Balancer in distributed systems.

Explain common load balancing algorithms (Round Robin, Least Connections).

Differentiate between a Load Balancer and a Reverse Proxy.

Engineer "Health Checks" and automatic Failover architectures.

Understand Layer 4 vs. Layer 7 load balancing.

3. What is a Load Balancer?

A Load Balancer is a specialized server (or hardware device) that sits directly between the internet and your fleet of application servers.

The Concept: When a user types www.example.com, their browser does not connect directly to one of your Web Servers. DNS points the domain at the Load Balancer, so the browser connects to that instead. The Load Balancer looks at the pool of available Web Servers, picks one according to its routing algorithm (for example, the one with the fewest active connections), and forwards the user's request to that server behind the scenes.

The Magic: The user has no idea this is happening. They think they are talking to one giant computer.

4. Routing Algorithms

How does the Load Balancer decide which server gets the next request?

Round Robin: The simplest method. It just goes down the list sequentially. (Request 1 goes to Server A, Request 2 goes to Server B, Request 3 to Server C, Request 4 back to Server A).

Least Connections: The load balancer monitors how many active connections each server currently has. It sends the next request to the server with the absolute lowest number of active connections. (Best for heavy, long-running requests).

IP Hashing: It hashes the user's IP address and uses the result to pick a server, so the same user is routed to the same server for as long as the server pool and the user's IP address stay the same. (Crucial if your architecture is incorrectly relying on "stateful" web servers).
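The three algorithms above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements them: the server names, the connection counts, and the choice of SHA-256 as the hash are all assumptions made for the example.

```python
import hashlib
from itertools import cycle

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names


def round_robin(servers):
    """Return an iterator that walks the server list in a repeating order."""
    return cycle(servers)


def least_connections(active):
    """Return the server with the fewest active connections.

    `active` maps server name -> current connection count.
    Ties go to whichever server appears first in the dict.
    """
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a server with a stable hash.

    The mapping only stays stable while `servers` is unchanged;
    adding or removing a server remaps many clients.
    """
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


rr = round_robin(SERVERS)
print([next(rr) for _ in range(4)])
# ['server-a', 'server-b', 'server-c', 'server-a']

print(least_connections({"server-a": 12, "server-b": 3, "server-c": 7}))
# server-b

# The same IP always lands on the same server while the pool is unchanged.
print(ip_hash("203.0.113.7", SERVERS) == ip_hash("203.0.113.7", SERVERS))
# True
```

Note that Round Robin ignores how busy each server is, while Least Connections needs the load balancer to track live connection counts. That extra bookkeeping is the price of better balance under uneven request sizes.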

5. Health Checks and Failover

A load balancer is not just a router; it is a paramedic.

Health Checks (The Pulse): At a fixed interval (for example, every 5 seconds), the Load Balancer sends a tiny "ping" request to all 100 Web Servers.

The Failover: If Server #42 crashes and stops responding to the ping, the Load Balancer marks Server #42 as "Dead" (usually after a few consecutive failed checks) and stops sending traffic to it. Requests that were in flight on Server #42 may fail, but new requests are routed to the surviving 99 servers, so most users never see an error screen. This is the foundation of High Availability.
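The health-check loop can be sketched as a small simulation. This is a hedged illustration under stated assumptions: the `HealthChecker` class, the `probe` callable, and the threshold of 3 consecutive failures are hypothetical choices for the example, and a real load balancer would send network requests on a timer rather than call a function.

```python
class HealthChecker:
    """Track server health from periodic probe results."""

    def __init__(self, servers, probe, fail_threshold=3):
        self.probe = probe                    # callable: server name -> bool
        self.fail_threshold = fail_threshold  # hypothetical default
        self.failures = {s: 0 for s in servers}

    def run_checks(self):
        """Probe every server once and update its failure streak."""
        for server in self.failures:
            if self.probe(server):
                self.failures[server] = 0     # a success resets the streak
            else:
                self.failures[server] += 1

    def healthy(self):
        """Return the servers still eligible to receive traffic."""
        return [s for s, n in self.failures.items() if n < self.fail_threshold]


down = {"server-42"}                          # simulated crash
checker = HealthChecker(
    ["server-41", "server-42", "server-43"],
    probe=lambda s: s not in down,
)

for _ in range(3):                            # three check intervals pass
    checker.run_checks()
print(checker.healthy())                      # ['server-41', 'server-43']

down.clear()                                  # server-42 comes back online
checker.run_checks()
print(checker.healthy())                      # ['server-41', 'server-42', 'server-43']
```

Requiring several failures in a row avoids ejecting a server because of one dropped packet, at the cost of a slightly slower reaction to a real crash.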

6. Reverse Proxies (Security and SSL)

A Load Balancer often acts as a Reverse Proxy (e.g., Nginx, HAProxy).

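The Shield: A Reverse Proxy hides the internal IP addresses of your Web Servers. Only the Load Balancer is exposed to the public internet, so attackers cannot reach the Web Servers directly, which shrinks your attack surface. It does not filter malicious requests by itself, so the applications behind it still need their own defenses.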

SSL Termination: Encrypting and decrypting HTTPS traffic costs CPU. Instead of making your 100 Web Servers each spend CPU power decrypting traffic, the Load Balancer handles all the SSL decryption at the front door ("SSL Termination") and passes plain, unencrypted HTTP traffic to your web servers over the internal network. This assumes the internal network is trusted; if it is not, the Load Balancer can re-encrypt traffic before forwarding it.

7. Diagrams/Visual Suggestions

*Architecture Diagram: Load Balancer Failover*

                  [ Internet ]
                       |
               [ Load Balancer ] (Checks Health)
               /       |        \
             /         |          \
           /           |            \
[ Server A ]      [ Server B ]      [ Server C ]
  (Healthy)        (CRASHED)         (Healthy)
      ^                                  ^
      |                                  |
Traffic routed here                Traffic routed here

8. Best Practices

Redundant Load Balancers: You added a Load Balancer so that no single web server is a Single Point of Failure (SPOF). But now, the Load Balancer itself is a SPOF! If it crashes, the whole site goes down. *Best Practice:* Deploy Load Balancers in a redundant pair, most commonly Active-Passive. If the Primary Load Balancer dies, the Secondary one detects the failure and takes over its IP address, typically within a few seconds.
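The takeover logic of an Active-Passive pair can be sketched as a heartbeat timeout. This is a simplified simulation under stated assumptions: the `FailoverPair` class, the 3-second timeout, and the use of plain numbers as timestamps are hypothetical, and a real deployment would rely on a dedicated failover protocol to move the shared IP address.

```python
class FailoverPair:
    """Active-passive pair: the standby takes over when heartbeats stop."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout        # seconds of silence before takeover (hypothetical)
        self.active = "primary"
        self.last_heartbeat = 0.0

    def heartbeat(self, now):
        """Record a heartbeat received from the primary at time `now`."""
        self.last_heartbeat = now

    def check(self, now):
        """Run by the standby; promotes it if the primary has gone quiet."""
        if self.active == "primary" and now - self.last_heartbeat > self.timeout:
            self.active = "secondary"  # in practice: claim the shared IP address
        return self.active


pair = FailoverPair(timeout=3.0)
pair.heartbeat(now=10.0)
print(pair.check(now=12.0))   # primary   (2 s since the last heartbeat)
print(pair.check(now=14.5))   # secondary (4.5 s of silence exceeds the timeout)
```

The timeout is the trade-off to tune: a short one fails over quickly but risks a false takeover during a brief network hiccup.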

9. Common Mistakes

Layer 4 vs. Layer 7 Confusion:

*Layer 4 (Transport):* Routes traffic based purely on IP addresses and ports. It is incredibly fast but blind to the data.

*Layer 7 (Application):* Reads the actual HTTP payload.

*The Mistake:* Using Layer 4 when you need intelligent routing. If you want requests for /video to go to a specialized Video Server, and /chat to go to a Chat Server, you MUST use a Layer 7 Load Balancer because it can read the URL path.
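The difference can be made concrete with a short sketch. The pool names and the two routing functions below are hypothetical, chosen only to show what information each layer has available when it makes a decision.

```python
POOLS = {
    "/video": ["video-1", "video-2"],   # hypothetical specialized pools
    "/chat": ["chat-1"],
}
DEFAULT_POOL = ["web-1", "web-2", "web-3"]


def route_layer7(path):
    """Pick a pool by URL path prefix, which a Layer 7 balancer can read."""
    for prefix, pool in POOLS.items():
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return DEFAULT_POOL


def route_layer4(dest_ip, dest_port):
    """A Layer 4 balancer sees only addresses and ports, never the URL,
    so every request to the same address and port shares one pool."""
    return DEFAULT_POOL


print(route_layer7("/video/cats.mp4"))      # ['video-1', 'video-2']
print(route_layer7("/chat"))                # ['chat-1']
print(route_layer7("/videos"))              # ['web-1', 'web-2', 'web-3']
print(route_layer4("198.51.100.10", 443))   # ['web-1', 'web-2', 'web-3']
```

The Layer 4 function never receives the path at all, which is exactly why it cannot send /video and /chat to different servers.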

10. Mini Project: Nginx Reverse Proxy Setup

Let's look at the actual configuration code for an Nginx Load Balancer.

1. The Scenario: We have 3 web servers running a backend app.

2. The Code (nginx.conf):

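# A complete nginx.conf needs the "events" and "http" wrapper blocks.
events {}

http {
    # Pool of backend servers; Nginx uses Round Robin by default.
    upstream my_backend {
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
        server 10.0.0.3:8080;
    }

    server {
        listen 80;                         # accept plain HTTP on port 80
        location / {
            proxy_pass http://my_backend;  # forward each request to the pool
        }
    }
}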

3. The Result: Nginx will accept all traffic on port 80 and distribute it evenly across the 3 internal servers using its default Round Robin algorithm (evenly because all three servers have the same default weight).

11. Practice Exercises

1. Define the role of a Load Balancer in a horizontally scaled system. How does it enable "High Availability"?

2. Explain the difference between the "Round Robin" and "Least Connections" routing algorithms. In what scenario would Round Robin fail to distribute load evenly?

12. MCQs with Answers

Question 1

A Load Balancer is constantly sending small "ping" requests to every Web Server in its fleet every 5 seconds. What is the architectural purpose of this action?

Question 2

To prevent web servers from wasting massive amounts of CPU power encrypting and decrypting secure HTTPS traffic, system architects often configure the Load Balancer to decrypt the traffic at the perimeter and send raw HTTP to the internal servers. What is this technique called?

13. Interview Questions

Q: You horizontally scale your architecture to 50 web servers, but you place a single Load Balancer in front of them. What architectural vulnerability have you just created, and how do you resolve it? (Hint: Active-Passive redundancy).

Q: Explain the difference between a Layer 4 Load Balancer and a Layer 7 Load Balancer. Why might an API Gateway strictly require Layer 7 capabilities?

Q: Your web architecture relies entirely on "Stateful" sessions stored in the local RAM of individual web servers. If you implement a standard Round Robin load balancer, what catastrophic UX failure will occur when a user clicks to a new page, and how does the "IP Hashing" algorithm patch this bad design?

14. FAQs

Q: Do I need to build my own Load Balancer? A: Rarely. While you can install Nginx or HAProxy on a Linux box and configure it manually, almost all modern startups use managed Cloud Load Balancers (like AWS Application Load Balancer or Google Cloud Load Balancing). The cloud provider handles the extreme complexity, auto-scaling, and active-passive redundancy for you.

15. Summary

In Chapter 6, we brought order to the chaos of horizontal scale. We deployed Load Balancers as the intelligent traffic cops of our architecture, utilizing routing algorithms like Round Robin and Least Connections to distribute massive workloads evenly. We engineered High Availability by implementing rigorous Health Checks, allowing our systems to automatically failover and route around exploding servers. By leveraging Reverse Proxies for security and SSL Termination, we shielded our fragile application logic from the brutal realities of the public internet.

16. Next Chapter Recommendation

Our infrastructure is scaled and routed. Now we must define the exact language our applications use to talk to each other. Proceed to Chapter 7: APIs and Communication.
