Google Cloud Platform (GCP)
CHAPTER 07

Load Balancing and High Availability

Updated: May 15, 2026

1. Introduction

If you run a successful e-commerce store on a single Virtual Machine, you have a massive problem: a Single Point of Failure (SPOF). If that VM crashes, or if too many customers try to check out at once, your website goes offline and you lose money. To achieve enterprise-grade High Availability, you must run multiple VMs simultaneously and place a Cloud Load Balancer in front of them. In this chapter, we will learn how to distribute traffic globally and configure Health Checks so that traffic is not sent to a dead server.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Define High Availability (HA) and eliminate Single Points of Failure.
  • Understand the function of a Cloud Load Balancer.
  • Differentiate between Global and Regional Load Balancers.
  • Configure Instance Groups (Managed and Unmanaged).
  • Configure Health Checks to autonomously reroute traffic.

3. Beginner-Friendly Explanation

Imagine a busy bank.
  • Single Server: There is exactly one teller working. If 50 people walk in, the line is out the door. If the teller goes to lunch, the bank is closed.
  • Instance Groups: The bank hires 5 identical tellers to work at the same time.
  • The Load Balancer: The security guard at the front door. When a customer walks in, the guard looks at the 5 tellers. "Teller 1 is busy. Teller 2 is free. Go to Teller 2!"
  • Health Checks: The guard constantly asks the tellers, "Are you awake?" If Teller 3 falls asleep, the guard stops sending customers to Teller 3 and routes them to the other 4 awake tellers.
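
The analogy above can be sketched in a few lines of Python. This is a toy model, not how Google's load balancers are implemented: the `Guard` class and teller names are hypothetical, and it uses simple round-robin rotation rather than checking which teller is least busy.

```python
from itertools import cycle


class Guard:
    """Toy load balancer: hands each customer to the next awake teller."""

    def __init__(self, tellers):
        self.tellers = list(tellers)
        self.awake = set(self.tellers)     # tellers currently passing the "are you awake?" check
        self._order = cycle(self.tellers)  # endless round-robin rotation

    def mark_asleep(self, teller):
        self.awake.discard(teller)

    def mark_awake(self, teller):
        if teller in self.tellers:
            self.awake.add(teller)

    def next_teller(self):
        if not self.awake:
            raise RuntimeError("no tellers available")
        # Walk the rotation until an awake teller is found.
        for teller in self._order:
            if teller in self.awake:
                return teller


guard = Guard(["teller-1", "teller-2", "teller-3", "teller-4", "teller-5"])
guard.mark_asleep("teller-3")
served = [guard.next_teller() for _ in range(6)]
print(served)  # teller-3 never appears while it is asleep
```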

4. Instance Groups

Before you can load balance traffic, you need a group of identical servers.
  • Unmanaged Instance Groups: A static list of VMs you manually added.
  • Managed Instance Groups (MIGs): The production standard. You give GCP a "Template" of a VM. You tell the MIG: "I always want exactly 3 VMs running." If a VM crashes, the MIG automatically uses the template to create a brand new VM to replace it. This also enables Autoscaling.
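
To make the "always keep exactly 3 VMs" behavior concrete, here is a minimal simulation of the reconcile loop a MIG conceptually runs. The `ManagedInstanceGroup` class, its methods, and the template keys are hypothetical names for illustration; they are not the GCP API.

```python
import itertools


class ManagedInstanceGroup:
    """Toy model of a MIG: keeps the number of VMs at target_size."""

    def __init__(self, template, target_size):
        self.template = template
        self.target_size = target_size
        self._ids = itertools.count(1)  # source of unique VM name suffixes
        self.instances = {}
        self.reconcile()

    def reconcile(self):
        """Create VMs from the template until the group is back at target size."""
        created = []
        while len(self.instances) < self.target_size:
            name = f"{self.template['name_prefix']}-{next(self._ids)}"
            self.instances[name] = dict(self.template)  # every VM is a copy of the template
            created.append(name)
        return created

    def crash(self, name):
        """Simulate a VM failure by removing it from the group."""
        self.instances.pop(name)


mig = ManagedInstanceGroup({"name_prefix": "web", "machine_type": "e2-micro"}, target_size=3)
mig.crash("web-2")
replacements = mig.reconcile()
print(replacements, sorted(mig.instances))
```

The replacement VM gets a new name rather than reusing the old one, which mirrors the idea that a crashed VM is recreated from the template instead of being repaired.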

5. Types of Cloud Load Balancers

GCP offers several types of Load Balancers depending on your architecture:
  1. Global HTTP(S) Load Balancer: The most popular. It provides a single, global "Anycast" IP address. If a user in Tokyo hits the IP, they are routed to your VMs in Tokyo. If a user in New York hits the *exact same IP*, they are routed to your VMs in New York (assuming you run healthy backends in both regions).
  2. Regional Network Load Balancer: Operates at the TCP/UDP layer (Layer 4) rather than at the HTTP layer (Layer 7). Used for non-web traffic, like a multiplayer gaming server or a database proxy.
  3. Internal Load Balancer: Balances traffic *inside* your VPC, totally hidden from the internet. Used to balance traffic between your Frontend VMs and your Backend API VMs.

6. Health Checks

A Load Balancer is useless if it sends a customer to a crashed server. You configure a Health Check to probe your servers at a fixed interval, for example an HTTP GET /health request every 5 seconds.
  • If a server responds with HTTP 200 OK, it is considered healthy and receives traffic.
  • If it responds with an error such as HTTP 500 (or times out) for a configured number of consecutive probes, the Load Balancer marks it "Unhealthy" and silently reroutes new customer traffic to the surviving servers.
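
The healthy/unhealthy decision can be modeled as a small state machine. The sketch below assumes the balancer flips a backend's state only after a number of consecutive probe results disagree with the current state; the `HealthTracker` class and the threshold values of 2 are illustrative assumptions, not values taken from this chapter.

```python
class HealthTracker:
    """Tracks one backend's health from a stream of probe results."""

    def __init__(self, unhealthy_threshold=2, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that disagree with the current state

    def record(self, status_code):
        """Feed one probe result (None means the probe timed out).

        Returns the backend's health state after this probe.
        """
        ok = status_code == 200
        if ok == self.healthy:
            self._streak = 0  # result agrees with current state, nothing changes
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if ok else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = ok
            self._streak = 0
        return self.healthy


tracker = HealthTracker()
states = [tracker.record(code) for code in [200, 500, None, 200, 200]]
print(states)
```

Requiring more than one failed probe trades a slightly slower reaction for protection against removing a server because of a single dropped packet.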

7. Mini Project: Conceptual Load Balancing Architecture

Setting up a Global Load Balancer in the console takes about 15 minutes of clicking. Let's outline the conceptual steps required:

Step-by-Step Overview:

  1. The Blueprint: Create an *Instance Template* defining your e2-micro VM with an Apache web server startup script.
  2. The Group: Create a *Managed Instance Group (MIG)* using that template. Set it to deploy 3 VMs across 3 different Zones in us-central1.
  3. The Balancer: Navigate to Network Services > Load balancing. Click Create.
  4. Backend Configuration: Point the Load Balancer to your MIG. Attach a Health Check (probing Port 80).
  5. Frontend Configuration: Assign the Load Balancer a single public IP address. Reserve a static address so that it does not change; the default is an ephemeral one.
  6. The Result: You give the Load Balancer's IP address to your customers. They hit the Load Balancer, and it seamlessly distributes the traffic across your 3 VMs. If you manually delete one VM, the MIG replaces it, and the Load Balancer starts sending traffic to the replacement once it passes the Health Check.
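
To see what the Health Check actually talks to, here is a minimal stand-in for one backend VM's web server, written with Python's standard library instead of Apache. It is a local sketch only: the `/health` path, the `BackendHandler` class, and the helper functions are assumptions for illustration, and the server binds to localhost on a random free port rather than Port 80.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class BackendHandler(BaseHTTPRequestHandler):
    """Answers the health probe path and a default page, both with HTTP 200."""

    def do_GET(self):
        body = b"ok" if self.path == "/health" else b"Hello from a backend VM\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def start_backend(port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), BackendHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(server, path="/health"):
    """Send one GET request, the way a health checker would."""
    host, port = server.server_address
    with urllib.request.urlopen(f"http://{host}:{port}{path}", timeout=2) as resp:
        return resp.status, resp.read()


server = start_backend()
status, body = probe(server)
print(status, body)
server.shutdown()
server.server_close()
```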

8. Real-World Scenarios

A global streaming service launches a new movie. They use a Global HTTP(S) Load Balancer. A user in Paris opens the app. The Load Balancer routes them to a server in the europe-west9 (Paris) region, the closest one, to keep latency as low as possible. Suddenly, a fiber optic cable is cut, and the entire Paris data center goes offline. The Load Balancer's Health Checks fail. Within seconds (the exact time depends on the Health Check interval and failure threshold), it automatically reroutes the Parisian user to the europe-west1 (Belgium) data center. The user experiences a slight stutter, but the movie keeps playing.
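
The failover in this scenario boils down to "pick the lowest-latency region that is still healthy." The sketch below illustrates that rule only; the latency figures in `LATENCY_MS` are made-up numbers, and `pick_region` is a hypothetical helper, not part of any Google API.

```python
# Hypothetical latencies (milliseconds) from a user location to each region.
LATENCY_MS = {
    "paris": {"europe-west9": 5, "europe-west1": 12, "us-central1": 110},
}


def pick_region(user_location, healthy_regions):
    """Return the healthy region with the lowest latency for this user."""
    candidates = {
        region: ms
        for region, ms in LATENCY_MS[user_location].items()
        if region in healthy_regions
    }
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)


healthy = {"europe-west9", "europe-west1", "us-central1"}
before = pick_region("paris", healthy)

healthy.discard("europe-west9")  # the Paris region fails its health checks
after = pick_region("paris", healthy)
print(before, "->", after)
```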

9. Best Practices

  • Cloud CDN: When configuring a Global HTTP(S) Load Balancer that serves static content, consider checking the box to enable Cloud CDN (Content Delivery Network). Google will cache your cacheable static assets (images, CSS) at the edge of their network, closest to the user. This speeds up page loads and reduces the load on your actual VMs, because cached requests never reach them.

10. Common Mistakes

  • Firewalling the Health Check: The Load Balancer itself does not live in your VPC; it lives on Google's edge network. To perform Health Checks, Google's probe servers must reach your VMs. You MUST create a VPC Firewall Rule allowing ingress traffic on Port 80 from Google's specific probe IP ranges (130.211.0.0/22 and 35.191.0.0/16). If you block these, the Load Balancer considers all your servers unhealthy, and your users receive errors (typically 502 Bad Gateway) instead of your website!
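
A quick way to sanity-check a planned firewall rule is to test whether its source ranges cover both probe ranges named above. The sketch below uses Python's standard `ipaddress` module; the function names are hypothetical, and it checks CIDR coverage only, not ports, priorities, or target tags.

```python
import ipaddress

# Google's health check probe ranges, as listed in this chapter.
PROBE_RANGES = [
    ipaddress.ip_network("130.211.0.0/22"),
    ipaddress.ip_network("35.191.0.0/16"),
]


def is_probe_source(ip):
    """True if the address falls inside one of the probe ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in PROBE_RANGES)


def firewall_allows_probes(allowed_source_ranges):
    """True if every probe range is fully covered by some allowed source range."""
    allowed = [ipaddress.ip_network(r) for r in allowed_source_ranges]
    return all(
        any(probe.subnet_of(rule) for rule in allowed)
        for probe in PROBE_RANGES
    )


print(is_probe_source("130.211.2.15"))
print(firewall_allows_probes(["130.211.0.0/22", "35.191.0.0/16"]))
print(firewall_allows_probes(["10.0.0.0/8"]))
```

A rule that allows only internal ranges such as 10.0.0.0/8 fails this check, which is exactly the misconfiguration described above.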

11. Exercises

  1. What is a Single Point of Failure (SPOF)? How does an architecture utilizing a Managed Instance Group mitigate it?
  2. Why is it advantageous to use a single Global Anycast IP address instead of multiple Regional IP addresses for a worldwide application?

12. FAQs

Q: Does a Load Balancer cost money? A: Yes. You pay an hourly rate for the Load Balancer itself (roughly $18/month; check current pricing for your region), plus a small per-gigabyte fee for the data processed through it. For production availability, it is usually well worth the cost.
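
As a rough back-of-the-envelope estimate, the monthly figure can be reproduced from an hourly rate. Both rates below are assumptions chosen for illustration (an hourly rate of $0.025 and a data fee of $0.008 per GB); they are not quoted from this chapter, so check the current pricing page before budgeting.

```python
HOURS_PER_MONTH = 730             # average hours in a month
ASSUMED_HOURLY_RATE = 0.025       # USD per hour, illustrative assumption
ASSUMED_DATA_RATE_PER_GB = 0.008  # USD per GB processed, illustrative assumption


def monthly_cost(gb_processed):
    """Estimated monthly bill: fixed hourly charge plus per-GB data fee."""
    fixed = ASSUMED_HOURLY_RATE * HOURS_PER_MONTH
    data = ASSUMED_DATA_RATE_PER_GB * gb_processed
    return round(fixed + data, 2)


print(monthly_cost(0))    # fixed charge only
print(monthly_cost(500))  # fixed charge plus 500 GB of traffic
```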

13. Interview Questions

  • Q: Describe the architectural relationship between an Instance Template, a Managed Instance Group (MIG), a Health Check, and an HTTP(S) Load Balancer.
  • Q: A newly deployed HTTP Load Balancer is returning 502 Bad Gateway errors, despite the backend VMs running perfectly when accessed directly via their internal IPs. Identify the most likely networking misconfiguration causing this failure. *(Hint: Health Check Firewalls!)*

14. Summary

In Chapter 7, we achieved Enterprise High Availability. We eliminated Single Points of Failure by transitioning from standalone VMs to resilient Managed Instance Groups (MIGs). We introduced the Cloud Load Balancer as the intelligent traffic cop, utilizing Global Anycast IPs to route users to the geographically closest server. Finally, we established autonomous self-healing via Health Checks, ensuring our applications remain online even during catastrophic server crashes.

15. Next Chapter Recommendation

We have a Load Balancer with a public IP address (e.g., 34.120.45.67). But humans do not memorize IP addresses; they type google.com. Proceed to Chapter 8: Google Cloud DNS.
