CHAPTER 16
Beginner
Scaling WebSocket Applications
Updated: May 14, 2026
20 min read
1. Introduction
Building a WebSocket server for 50 users is simple. Building one for 500,000 users is an engineering marvel. Because WebSockets require persistent, open connections, they consume a significant amount of server RAM. When a single server is full, you must add more servers. But how do users on Server A talk to users on Server B? In this chapter, we will explore the concepts of horizontal scaling, load balancing, and Pub/Sub systems.
2. Learning Objectives
By the end of this chapter, you will be able to:
- Explain why scaling WebSockets is harder than scaling HTTP.
- Differentiate between Vertical and Horizontal scaling.
- Understand the role of a Load Balancer with "Sticky Sessions."
- Describe how Redis Pub/Sub connects multiple WebSocket servers.
3. Beginner-Friendly Explanation
Imagine a single telephone operator (Server A). She can connect calls for 100 people at once. If 200 people need to make calls, she will be overwhelmed (Server Crash). To fix this, we hire a second operator (Server B). We also put a manager (Load Balancer) at the front door. The manager sends the first 100 people to Operator A, and the next 100 to Operator B.
The Problem: What if Alice (talking to Operator A) wants to chat with Bob (talking to Operator B)? Operator A doesn't know Bob!
The Solution (Redis Pub/Sub): The operators put on headsets and talk to a central dispatcher. When Alice sends a message, Operator A shouts to the dispatcher, "Message for Bob!" The dispatcher tells Operator B, who passes it to Bob.
4. Real-World Examples
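- WhatsApp Architecture: Millions of users are connected across many servers globally. A messaging service at this scale needs a routing layer between its servers, playing the role that Pub/Sub plays in this chapter, so that a message reaches the recipient's server with very low latency.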
5. Vertical vs Horizontal Scaling
- Vertical Scaling (Scaling Up): Buying a bigger server with more RAM and CPU. It is easy to do, but eventually, you hit the limit of physical hardware.
- Horizontal Scaling (Scaling Out): Adding *more* servers (Server 1, Server 2, Server 3). Capacity grows with every server you add, so there is no single-machine ceiling, but it requires a more complex architecture to sync data between the servers.
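As a rough illustration of scaling out, the number of servers can be estimated by dividing the expected connections by what one server can hold. This is only a sketch: `serversNeeded` is a hypothetical helper, and the per-server capacity of 50,000 is an assumed figure, not a measurement. Real capacity depends on your hardware and message volume.

```javascript
// Rough capacity estimate for horizontal scaling.
// perServerCapacity is an assumed figure; measure your own servers under load.
function serversNeeded(totalConnections, perServerCapacity) {
  if (perServerCapacity <= 0) {
    throw new RangeError("perServerCapacity must be positive");
  }
  // Round up: a partially full server is still a whole server.
  return Math.ceil(totalConnections / perServerCapacity);
}

// The operator analogy: 200 callers, 100 per operator.
console.log(serversNeeded(200, 100)); // 2

// 500,000 users at an assumed 50,000 connections per server.
console.log(serversNeeded(500000, 50000)); // 10
```

In practice you would usually add headroom on top of this number so that one server failing does not overload the rest.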
6. The Load Balancer
To scale out, you place a Load Balancer (like AWS ALB, HAProxy, or Nginx) in front of your servers: Client -> Load Balancer -> (Server A or Server B). Because a WebSocket connection starts as an HTTP request that is then upgraded, the Load Balancer must support that upgrade and keep the resulting long-lived connection open instead of closing it like a finished HTTP request.
7. The Cross-Server Communication Problem
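Each server keeps its own list of connected clients in memory. If Alice, connected to Server A, sends {"type": "chat", "to": "Bob", "msg": "Hi"}, Server A loops through its clients array. It doesn't find Bob, because Bob is connected to Server B, so the message is dropped.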
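The dropped message can be reproduced in a few lines. This is a minimal sketch, not a real server: `FakeSocket` and `ChatServer` are hypothetical stand-ins, and a `Map` keyed by username is used in place of the clients array for brevity.

```javascript
// Each server process only knows about its own sockets.
// FakeSocket is a hypothetical stand-in for a real WebSocket connection.
class FakeSocket {
  constructor() { this.received = []; }
  send(data) { this.received.push(data); }
}

class ChatServer {
  constructor(name) {
    this.name = name;
    this.clients = new Map(); // username -> socket, local to this server only
  }
  connect(username) {
    const socket = new FakeSocket();
    this.clients.set(username, socket);
    return socket;
  }
  // Returns true if delivered, false if the recipient is not connected here.
  handleMessage(raw) {
    const message = JSON.parse(raw);
    const target = this.clients.get(message.to);
    if (!target) return false; // recipient is on another server: dropped
    target.send(raw);
    return true;
  }
}

const serverA = new ChatServer("A");
const serverB = new ChatServer("B");
serverA.connect("Alice");
const bobSocket = serverB.connect("Bob");

const delivered = serverA.handleMessage(
  JSON.stringify({ type: "chat", to: "Bob", msg: "Hi" })
);
console.log(delivered);                 // false
console.log(bobSocket.received.length); // 0
```

Nothing in Server A's memory points at Bob's socket, so without a shared channel between the servers there is no path for the message to take.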
8. Redis Pub/Sub
To solve this, we introduce Redis, a lightning-fast, in-memory database that supports a pattern called "Publish/Subscribe" (Pub/Sub).
- 1. When Server A and Server B boot up, they both *Subscribe* to a Redis channel called global_chat.
- 2. Alice sends a message to Server A.
- 3. Server A takes that message and *Publishes* it to Redis.
- 4. Redis instantly blasts that message out to all subscribed servers (A and B).
- 5. Server B receives it from Redis, sees it is for Bob, and sends it down Bob's WebSocket.
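The five steps above can be sketched without a running Redis instance. In this sketch, `InMemoryBroker` is a hypothetical stand-in that imitates publish/subscribe behavior inside one process; it is not the Redis client API. `RelayServer` and the inbox arrays (standing in for sockets) are likewise illustrative names.

```javascript
// InMemoryBroker is a hypothetical stand-in for Redis Pub/Sub so the sketch
// runs without a Redis server. It is not the Redis client API.
class InMemoryBroker {
  constructor() { this.channels = new Map(); } // channel -> array of handlers
  subscribe(channel, handler) {
    if (!this.channels.has(channel)) this.channels.set(channel, []);
    this.channels.get(channel).push(handler);
  }
  publish(channel, payload) {
    const handlers = this.channels.get(channel) || [];
    for (const handler of handlers) handler(payload);
    return handlers.length; // number of subscribers that received the payload
  }
}

class RelayServer {
  constructor(name, broker) {
    this.name = name;
    this.broker = broker;
    this.clients = new Map(); // username -> inbox (stand-in for a socket)
    // Step 1: subscribe to the shared channel at boot.
    broker.subscribe("global_chat", (raw) => this.deliverLocally(raw));
  }
  connect(username) {
    const inbox = [];
    this.clients.set(username, inbox);
    return inbox;
  }
  // Steps 2-3: a message from a local client is published, not delivered directly.
  handleClientMessage(raw) {
    return this.broker.publish("global_chat", raw);
  }
  // Step 5: each server checks whether the recipient is one of its own clients.
  deliverLocally(raw) {
    const inbox = this.clients.get(JSON.parse(raw).to);
    if (inbox) inbox.push(raw);
  }
}

const broker = new InMemoryBroker();
const relayA = new RelayServer("A", broker);
const relayB = new RelayServer("B", broker);
relayA.connect("Alice");
const bobInbox = relayB.connect("Bob");

// Step 4 happens inside publish(): both subscribed servers receive the message.
const receivers = relayA.handleClientMessage(
  JSON.stringify({ type: "chat", to: "Bob", msg: "Hi" })
);
console.log(receivers);       // 2
console.log(bobInbox.length); // 1
```

With a real Redis deployment, each server would typically hold one connection for subscribing and a separate one for publishing, since a connection in subscriber mode cannot issue ordinary commands.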
9. Architecture Diagram (Conceptual)
Clients (Alice, Bob) -> Load Balancer -> Server A / Server B <-> Redis Pub/Sub (channel: global_chat)
10. Best Practices
- Offload logic: WebSocket servers should be "dumb." They should only be responsible for holding open connections and passing messages to/from Redis. Do heavy database reads and writes elsewhere, in your regular HTTP services or in background workers.
- Sticky Sessions: Configure your Load Balancer for "Sticky Sessions" or IP Hashing. If a client's WebSocket drops and they reconnect 1 second later, the load balancer should ideally route them back to the exact same server they were just on.
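IP Hashing can be pictured as a deterministic function from the client's address to a server. The sketch below assumes a hypothetical list of three servers and a simple non-cryptographic string hash; real load balancers implement their own hashing, so treat `hashString` and `pickServer` as illustrations only.

```javascript
// Hypothetical server list; a real load balancer does this internally.
const servers = ["server-a", "server-b", "server-c"];

// Simple deterministic string hash (not cryptographic),
// kept in the unsigned 32-bit range.
function hashString(text) {
  let hash = 0;
  for (let i = 0; i < text.length; i++) {
    hash = (Math.imul(hash, 31) + text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// The same IP always maps to the same index while the list is unchanged.
function pickServer(clientIp, serverList) {
  return serverList[hashString(clientIp) % serverList.length];
}

const first = pickServer("203.0.113.7", servers);
const afterReconnect = pickServer("203.0.113.7", servers);
console.log(first === afterReconnect); // true
```

One consequence of the modulo step is that adding or removing a server changes the mapping for many clients, so stickiness only holds while the server list stays the same.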
11. Common Mistakes
- Using MySQL for Cross-Server Syncing: Do not write chat messages to a MySQL database on Server A and have Server B query the database every second to look for new ones. Every server polling every second puts constant query load on the database even when nothing is happening, and messages still arrive up to a second late. Use a messaging system built for this job, such as Redis Pub/Sub or RabbitMQ.
12. Mini Exercises
- 1. Review the architecture diagram in Section 9.
- 2. Imagine 10 Servers instead of 2. If Alice sends a message to a massive Chat Room, how many copies of it does Redis deliver? *(Answer: 10. Alice's server publishes the message once, and Redis delivers one copy to each of the 10 subscribed servers. Each server then distributes it to its locally connected users.)*
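To make the counting in exercise 2 concrete, the fan-out can be written as a small calculation. `fanOut` is a hypothetical helper, and the per-server user counts are made-up numbers chosen only for illustration.

```javascript
// usersPerServer: how many room members are connected to each server
// (made-up numbers for illustration).
function fanOut(usersPerServer) {
  return {
    // The sending server publishes the message to the channel once.
    publishes: 1,
    // The broker delivers one copy to each subscribed server.
    brokerDeliveries: usersPerServer.length,
    // Each server then sends the message down its own local sockets.
    socketSends: usersPerServer.reduce((sum, n) => sum + n, 0),
  };
}

const result = fanOut([120, 80, 95, 110, 70, 130, 60, 100, 90, 85]);
console.log(result.brokerDeliveries); // 10
console.log(result.socketSends);      // 940
```

The broker's work grows with the number of servers, while the per-user work is spread across those servers.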
13. Coding Challenges
Challenge 1: Research "Pusher" or "Ably" online. Write a brief paragraph explaining how using a managed WebSocket-as-a-Service provider solves the scaling problem for you.
14. MCQs with Answers
Question 1
Why is horizontal scaling more difficult for WebSockets than for standard HTTP APIs?
Question 2
What is the primary role of Redis Pub/Sub in a scaled WebSocket architecture?
15. Interview Questions
- Q: Walk me through the architecture required to support 100,000 concurrent WebSocket connections.
- Q: Explain the Pub/Sub pattern and why it is critical for cross-server communication.