Scaling WebSockets: How to Handle 1 Million Concurrent Connections
The "C10K problem" is ancient history. Today, we talk about the C1M problem: handling 1 million concurrent connections on a single server.
If you are building a real-time application—whether it's a chat app, a live sports feed, or a collaborative editing tool—you will eventually hit a wall. Your server will crash, new connections will time out, and your latency will spike. And the culprit is rarely your application code. It's usually the OS configuration.
In this deep dive, we will walk through the exact steps to tune a Linux server to handle 1 million concurrent WebSocket connections. We will cover file descriptors, ephemeral ports, and the architecture required to scale beyond a single node.
The Bottleneck is Not Node.js (Usually)
Many developers assume that Node.js (or Python/Ruby) is too slow to handle millions of connections. While it's true that a single thread has limits, the first bottleneck you will hit is almost always the Operating System.
By default, Linux is configured for general-purpose computing, not for maintaining millions of persistent TCP connections.
1. The "Too Many Open Files" Error
In Linux, everything is a file. A socket is a file. A file on disk is a file. When a process opens a connection, it consumes a file descriptor (FD).
Check your current limit:
```shell
ulimit -n
# Output: 1024 (usually)
```
Once the process exhausts its descriptors (stdin, stdout, and stderr count too, so you hit the wall a few connections early), new sockets fail and the application crashes with EMFILE: too many open files.
The Fix:
You need to increase both the system-wide limit and the per-process limit.
Edit /etc/sysctl.conf:
```
fs.file-max = 2097152
```
Edit /etc/security/limits.conf:
```
* soft nofile 1048576
* hard nofile 1048576
root soft nofile 1048576
root hard nofile 1048576
```
Apply the sysctl change with sysctl -p; the limits.conf change takes effect the next time you log in. (For services managed by systemd, limits.conf is bypassed entirely: set LimitNOFILE in the unit file instead.) Now the OS allows enough file descriptors. But we are just getting started.
2. The Ephemeral Port Exhaustion Trap
This is the most common "silent killer" of high-scale WebSocket architectures.
When a client connects to a server, it uses a source IP and a source port. When your server connects to a backend database or another service, it becomes the client and uses a local port.
A TCP port is a 16-bit number, so there are at most 65,535 usable ports per source IP. If you have a load balancer or proxy in front of your WebSocket server, it can run out of source ports for its outgoing connections to the backends.
The TCP Tuple
A TCP connection is identified by a 4-tuple:
{Source IP, Source Port, Destination IP, Destination Port}
If your load balancer connects to your WebSocket server and both sit on fixed IPs, every connection shares the same {Source IP, Destination IP, Destination Port} triple, so you are limited by the number of available source ports (about 28k with the default Linux ephemeral range, roughly 64k after tuning).
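The back-of-the-envelope math is simple enough to sketch; the function name here is illustrative:

```javascript
// With a fixed {Source IP, Destination IP, Destination Port}, each extra
// connection needs its own source port, so the ephemeral range is the cap.
function maxConnsPerPair(rangeStart, rangeEnd) {
  return rangeEnd - rangeStart + 1;
}

console.log(maxConnsPerPair(32768, 60999)); // 28232 (default Linux range)
console.log(maxConnsPerPair(1024, 65535)); // 64512 (after widening the range)
```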
The Fix:
- Increase the local port range in /etc/sysctl.conf:

```
net.ipv4.ip_local_port_range = 1024 65535
```

- Enable TCP reuse:

```
net.ipv4.tcp_tw_reuse = 1
```

This lets the OS reuse ports stuck in the TIME_WAIT state for new outgoing connections instead of waiting out the full timeout.
3. Memory Usage: The Real Cost of a Socket
How much RAM does 1 million connections consume?
A kernel socket still costs a few kilobytes for its structures and TCP buffers, but at this scale, application memory is usually the bigger factor.
If you use Node.js, every Socket object takes up heap space.
- Empty Socket: ~2-4 KB
- With Metadata (User ID, Channel info): ~10 KB
Math Time:
1,000,000 connections * 10 KB = 10 GB of RAM.
This is feasible on a modern server, but you must be careful.
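A quick sanity check on that arithmetic (the helper name is illustrative):

```javascript
// Estimate heap usage for N connections at a given per-socket cost.
function estimatedHeapGB(connections, kbPerSocket) {
  return (connections * kbPerSocket) / (1024 * 1024); // KB -> GiB
}

console.log(estimatedHeapGB(1_000_000, 10)); // ~9.54, i.e. the "10 GB" above in binary units
```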
Optimization Tip:
Don't store the entire User object on the socket. Store only the userId and fetch details from Redis when needed.
```javascript
// BAD
socket.user = { id: 1, name: "Alice", email: "...", bio: "..." };

// GOOD
socket.userId = 1;
```
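Here is a minimal sketch of the "store only the ID" pattern end to end. A plain Map stands in for Redis; in production you would swap in a real client call (for example, hgetall with a library like ioredis), and all names here are illustrative.

```javascript
// Stand-in for Redis: userId -> user record.
const userStore = new Map([
  [1, { id: 1, name: "Alice", email: "alice@example.com" }],
]);

async function getUser(userId) {
  // Production equivalent: await redis.hgetall(`user:${userId}`)
  return userStore.get(userId);
}

// The socket carries only the ID...
const socket = { userId: 1 };

// ...and full details are fetched on demand.
getUser(socket.userId).then((user) => console.log(user.name)); // Alice
```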
4. The Event Loop Lag
Node.js is single-threaded. If you have 1 million connected users, and you try to broadcast a message to all of them, you will block the event loop.
```javascript
// DO NOT DO THIS
users.forEach(user => {
  user.socket.send("Hello!");
});
```
Sending 1 million packets takes time. If it takes 0.01ms per send, that's 10 seconds of blocking time. Your server will be unresponsive for 10 seconds.
The Fix: Batching and Workers
- Don't broadcast to everyone at once. Chunk your broadcasts.
- Use multiple processes. Use the Node.js cluster module or run multiple container instances.
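The chunking idea can be sketched in a few lines: split the socket list into batches and yield to the event loop between batches with setImmediate, so pending I/O and timers keep running during a large fan-out. Function names and the default batch size are illustrative.

```javascript
// Split an array into fixed-size batches.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Broadcast in batches, yielding between them; resolves when done.
function broadcastInBatches(sockets, message, batchSize = 1000) {
  return new Promise((resolve) => {
    const batches = chunk(sockets, batchSize);
    let i = 0;
    const next = () => {
      if (i === batches.length) return resolve();
      for (const socket of batches[i++]) socket.send(message);
      setImmediate(next); // give the event loop a breath between batches
    };
    next();
  });
}
```

Each batch still blocks briefly, but the pauses between batches let the server keep accepting connections and answering pings mid-broadcast.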
5. Architecture for Infinite Scale
A single server can handle 1M connections with tuning. But what about 10M? 100M?
You need a Distributed WebSocket Architecture.
The Layered Approach
- Load Balancer (Nginx/HAProxy): Terminates SSL, distributes connections to backend nodes.
- WebSocket Nodes (Node.js/Go): Hold the active connections. They are "stateful" in the sense that they hold the socket, but "stateless" regarding business logic.
- Pub/Sub Layer (Redis/NATS): The glue that holds it all together.
How Pub/Sub Works
When User A (connected to Server 1) wants to send a message to User B (connected to Server 2):
- User A sends a message to Server 1.
- Server 1 publishes the message to a Redis channel: publish('user:B', payload).
- Server 2 (and all other servers) are subscribed to Redis.
- Server 2 receives the message, checks if User B is connected locally.
- Server 2 sends the message to User B.
This architecture allows you to add servers horizontally without limits.
6. Testing Your Limits
Don't wait for production to fail. You need to test this.
Tools like Artillery or k6 are great, but for massive concurrency, you might need a fleet of client machines.
Tsung is an Erlang-based distributed load testing tool that is excellent for simulating millions of concurrent users.
Conclusion
Handling 1 million connections is a badge of honor for system engineers. It requires leaving the comfort zone of "npm install" and diving into Linux internals.
Summary Checklist:
- Increase ulimit (Open Files).
- Tune sysctl (Port range, TCP reuse).
- Optimize Application Memory (Store IDs, not Objects).
- Use a Pub/Sub backend (Redis) for horizontal scaling.
The next time your server crashes under load, don't just upgrade the instance size. Look at the kernel. The answer is usually there.