Introduction
Nginx is widely known as a high-performance web server and reverse proxy. One of its most powerful roles in server environments is to act as a load balancer, distributing client traffic across multiple backend servers. This chapter focuses on using Nginx as a load balancer: how to configure it, and which options and strategies it offers.
Core Load Balancing Concepts in Nginx
When Nginx acts as a load balancer, it typically listens for client connections on a public interface, then proxies each request to one of several backend servers. Nginx calls these backends upstream servers. The configuration uses an upstream block to define the backend pool, and a server block with location directives to route incoming requests to that pool.
At a high level, the flow is: a client connects to Nginx, Nginx chooses a backend from the upstream group using a balancing algorithm, then forwards the HTTP request to that backend and relays the response back to the client.
Nginx supports several load balancing methods, both built-in and available through optional modules. These methods determine how Nginx selects a backend for each request.
Basic Upstream Configuration
To configure Nginx as a load balancer, you first define an upstream group. This creates a named pool of backend servers that can be reused in one or more server blocks.
A very simple configuration might look like this:
http {
    upstream backend_app {
        server 10.0.0.2:8080;
        server 10.0.0.3:8080;
        server 10.0.0.4:8080;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_pass http://backend_app;
        }
    }
}
Here the upstream group backend_app contains three application servers. The proxy_pass directive instructs Nginx to forward matching traffic to that group. By default, Nginx uses a round-robin method to cycle through the servers.
Nginx upstream groups are defined inside the http context. Each server directive inside an upstream block should specify both an address and a port; if the port is omitted, Nginx defaults to port 80, which is rarely what you want for an application backend. Placing an upstream block inside a server block is a configuration error. You can catch such mistakes before they take effect by running nginx -t, which validates the configuration without applying it.
Built-in Load Balancing Methods
Nginx supports several built-in balancing strategies. The most important ones are round robin, least connections, and IP hash.
Round Robin
If you simply list servers in an upstream block without any special parameters, Nginx uses round robin. Each new request is sent to the next server in the list, distributing load roughly evenly over time.
You do not need to specify any extra directives for this:
upstream backend_rr {
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
    server 10.0.0.4:8080;
}
Round robin works best when all backend servers are similar in capacity and performance, and when requests are roughly equal in cost.
Least Connections
The least_conn method sends each new connection to the server that currently has the fewest active connections. This is useful when some requests take longer than others, since it adapts to varying request durations.
You enable it by adding least_conn; inside the upstream block:
upstream backend_least_conn {
    least_conn;
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
    server 10.0.0.4:8080;
}
With least_conn, Nginx tracks how many connections each server is currently handling and assigns the next request to the server with the lowest count. This can improve utilization when workloads are uneven.
IP Hash
The ip_hash method aims to send all requests from the same client IP address to the same backend server. This is a basic form of session stickiness.
You enable it with:
upstream backend_ip_hash {
    ip_hash;
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
    server 10.0.0.4:8080;
}
Nginx hashes the client IP address (for IPv4, the first three octets) to pick a specific backend. As long as the upstream list does not change and the client IP remains the same, the client will usually hit the same backend server.
IP hash can help when the backend application stores sessions in local memory and you cannot use a shared session store. However, it does not work well when many users share the same IP, for example behind NAT, because they will all be sent to the same backend.
Weighting and Server Parameters
Not all backend servers are equal. Some may have more CPU or memory. Nginx allows you to assign weights to servers so that more powerful nodes receive more traffic.
To control weights, you add a weight parameter to each server directive:
upstream backend_weighted {
    server 10.0.0.2:8080 weight=3;
    server 10.0.0.3:8080 weight=2;
    server 10.0.0.4:8080 weight=1;
}
With this configuration, Nginx will send roughly 3 parts of traffic to 10.0.0.2, 2 parts to 10.0.0.3, and 1 part to 10.0.0.4: out of every six requests, three go to the first server, two to the second, and one to the third. In round-robin mode, Nginx cycles through servers according to these weights.
You can also adjust how Nginx treats problematic or slow servers. Common parameters include max_fails, fail_timeout, and backup.
For example:
upstream backend_advanced {
    server 10.0.0.2:8080 weight=3 max_fails=2 fail_timeout=30s;
    server 10.0.0.3:8080 weight=2;
    server 10.0.0.4:8080 backup;
}
Here:
max_fails sets how many failed attempts are allowed before the server is temporarily marked as down.
fail_timeout defines the time window for counting failures and how long the server remains considered down.
backup marks a server as a backup that only receives traffic when all primary servers are unavailable.
Nginx applies defaults of max_fails=1 and fail_timeout=10s, but what counts as a failed attempt depends on how the proxy is configured, and an unhealthy server that still accepts connections may keep receiving traffic. Always test and tune max_fails, fail_timeout, and related parameters to match your environment and health expectations; the conditions Nginx treats as failures are controlled by proxy_next_upstream, sketched below.
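By default, only errors and timeouts while talking to a backend count as failed attempts. The proxy_next_upstream directive broadens or narrows this; a minimal sketch, reusing the backend_advanced pool from above:
location / {
    proxy_pass http://backend_advanced;
    # Count connection errors, timeouts, and 502/503 responses
    # as failed attempts, and retry the request on the next server.
    proxy_next_upstream error timeout http_502 http_503;
    # Stop retrying after 3 attempts or 10 seconds in total.
    proxy_next_upstream_tries 3;
    proxy_next_upstream_timeout 10s;
}
Be careful with retries of non-idempotent requests such as POST: by default Nginx does not pass them to the next server, and the non_idempotent flag of proxy_next_upstream overrides that deliberately conservative behavior.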
HTTP Proxy Configuration Details
To use Nginx as an HTTP load balancer, you combine upstream definitions with proxy_pass and appropriate proxy headers. A typical configuration for proxying HTTP looks like this:
upstream backend_app {
    least_conn;
    server app1.internal:8080;
    server app2.internal:8080;
}

server {
    listen 80;
    server_name app.example.com;

    location / {
        proxy_pass http://backend_app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 5s;
        proxy_send_timeout 30s;
        proxy_read_timeout 30s;
    }
}
In this example, Nginx passes through important client information using HTTP headers. The backend app can then log the original client IP and protocol instead of seeing only the Nginx address.
Connection and read timeouts protect clients from hanging indefinitely if a backend is slow or unresponsive. These values should be tuned to the expected performance and latency of your application.
Health Checks and Failure Handling
Nginx has a built-in notion of failure based on connection errors and timeouts. If a server fails too many times according to max_fails and fail_timeout, Nginx temporarily stops sending traffic there. This is a simple passive health checking mechanism.
A typical configuration to use passive checks might look like:
upstream backend_passive {
    server app1.internal:8080 max_fails=3 fail_timeout=10s;
    server app2.internal:8080 max_fails=3 fail_timeout=10s;
}
With this setup, if a server fails three times within a 10-second window, Nginx marks it as down for the duration of fail_timeout. Once that period expires and the server responds successfully again, Nginx brings it back into rotation automatically.
More advanced active health checks, where Nginx proactively polls servers with HTTP requests, require additional modules or Nginx Plus. In many open source deployments, passive health checks combined with monitoring tools are used instead.
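For illustration, here is a minimal sketch of an active health check as offered by Nginx Plus; it is not available in open source Nginx, and the /healthz URI is a placeholder for whatever health endpoint your application exposes:
upstream backend_active {
    # Active health checks require the group to live in shared memory.
    zone backend_active 64k;
    server app1.internal:8080;
    server app2.internal:8080;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend_active;
        # Nginx Plus only: poll each server every 5s at /healthz,
        # marking it down after 3 failures and up after 2 passes.
        health_check interval=5s fails=3 passes=2 uri=/healthz;
    }
}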
Session Persistence Options
Certain applications rely on sticky sessions where subsequent requests from the same client must reach the same backend. Nginx provides several approaches for session persistence.
The simplest native method is ip_hash, as described earlier, which uses the client IP address as a key. While simple, it can be inaccurate when many users share an IP or when clients switch networks.
A more flexible approach works at the application layer: store session data in a shared database or cache, which removes the need for load balancer stickiness. When you must enforce stickiness at the Nginx level, cookie-based approaches are available through community modules (and through the sticky directive in Nginx Plus), but they sit outside the open source built-in feature set and require additional components; a sketch follows.
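As a reference point, this is roughly what cookie-based stickiness looks like with the Nginx Plus sticky directive (Plus only; the cookie name and hostnames are placeholders):
upstream backend_sticky {
    server app1.internal:8080;
    server app2.internal:8080;
    # Nginx Plus only: issue a cookie named srv_id that pins each
    # client to the backend that handled its first request.
    sticky cookie srv_id expires=1h path=/;
}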
In situations where you do not have shared session storage and cannot change the application, ip_hash remains the primary built in option. When using it, you must understand its limitations: changes to the upstream server list can reassign users, and multi IP environments can break the mapping.
SSL Termination and Load Balancing
Nginx can terminate TLS connections from clients, then forward plain HTTP traffic to backend servers. This reduces encryption workload on the backend and simplifies certificate management. You configure Nginx to listen on port 443 with SSL parameters and still use upstream balancing behind it.
An example configuration for HTTPS termination with load balancing is:
upstream backend_secure {
    least_conn;
    server app1.internal:8080;
    server app2.internal:8080;
}

server {
    listen 443 ssl;
    server_name secure.example.com;

    ssl_certificate /etc/nginx/ssl/secure.crt;
    ssl_certificate_key /etc/nginx/ssl/secure.key;

    location / {
        proxy_pass http://backend_secure;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}
In this setup, Nginx receives encrypted traffic, decrypts it, and forwards the request unencrypted to the backend. The X-Forwarded-Proto header allows the backend to know that the original connection from the client used HTTPS.
If you need end-to-end encryption, you can instead configure Nginx to proxy to HTTPS backends by using proxy_pass https://backend_name; and pointing it at upstream servers that speak TLS. A minimal sketch follows.
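This sketch assumes the backends present certificates signed by an internal CA; the CA bundle path is a placeholder:
location / {
    proxy_pass https://backend_secure;
    # Verify the backend certificate against the internal CA bundle.
    proxy_ssl_verify on;
    proxy_ssl_trusted_certificate /etc/nginx/ssl/internal-ca.crt;
    # Send SNI so the backend can select the correct certificate.
    proxy_ssl_server_name on;
}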
TCP/UDP Load Balancing with Stream
Although Nginx is often used for HTTP, it can also load balance arbitrary TCP and UDP protocols through the stream context. This is useful for databases, mail protocols, or custom binary services.
A basic TCP load balancing sketch (the hostnames and the PostgreSQL-style port 5432 are placeholders) looks like:
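stream {
    upstream backend_tcp {
        server db1.internal:5432;
        server db2.internal:5432;
    }

    server {
        listen 5432;
        proxy_pass backend_tcp;
    }
}
Note that the stream block lives at the top level of nginx.conf, alongside http rather than inside it, and that proxy_pass in the stream context takes a bare upstream name with no scheme prefix.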