Overview
Nginx is widely used as a high‑performance HTTP/TCP/UDP load balancer. In this chapter you will focus on:
- Nginx load‑balancing primitives (`upstream`, `proxy_pass`, `stream`)
- Load‑balancing algorithms and their Nginx specifics
- Health checks (basic and “active”)
- Session persistence with Nginx
- SSL termination and passing client information
- Advanced patterns and tuning considerations
Basic web server concepts, reverse proxy concepts, and general load‑balancing theory are covered elsewhere; here we focus on Nginx’s role and configuration.
Basic HTTP Load Balancing with `upstream`
The core element for HTTP load balancing in Nginx is the `upstream` block. It defines a group of backend servers and how Nginx distributes requests across them.
Minimal example:
```nginx
http {
    upstream app_backend {
        server 10.0.0.11;
        server 10.0.0.12;
    }

    server {
        listen 80;
        server_name example.com;

        location / {
            proxy_pass http://app_backend;
        }
    }
}
```

Key points:

- `upstream app_backend { ... }` defines the group of backend nodes.
- `proxy_pass http://app_backend;` sends matching requests to that group.
- By default, Nginx uses a round‑robin algorithm over the listed servers.
Server directives in `upstream`
Each `server` line inside an `upstream` block can have parameters that influence balancing behavior:
```nginx
upstream app_backend {
    # weight – relative share of traffic
    server 10.0.0.11 weight=3;
    server 10.0.0.12 weight=1;

    # Mark a server as down (for maintenance)
    server 10.0.0.13 down;

    # Passive health check tuning
    server 10.0.0.14 max_fails=3 fail_timeout=30s;
}
```

Common parameters:

- `weight=N` – the server receives N times more requests than a server with weight 1.
- `max_fails=N` – how many failed attempts (timeouts, 5xx responses) before Nginx temporarily marks the server as failed.
- `fail_timeout=TIME` – the window during which `max_fails` failures are counted, and the duration for which the server is considered “down” after exceeding `max_fails`.
- `down` – the server is disabled in this configuration; useful for draining traffic.
These are passive health controls: they react to failed requests from real clients.
HTTP Load‑Balancing Methods
Nginx supports several load‑balancing algorithms (methods). Some are built into the open‑source version; a few require Nginx Plus (commercial), which we’ll note explicitly.
Round Robin (default)
No directive is needed; Nginx cycles through servers in order, respecting `weight` (a weighted sketch follows the use cases below).
Use cases:
- Most general scenario where backends are “equivalent.”
- Works well with many stateless web apps.
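
As a small illustration (addresses and weights are placeholders), weighted round robin under the default method might look like this:

```nginx
upstream app_backend {
    # Round robin is the default; no directive is needed.
    # With these weights, 10.0.0.11 receives roughly two of
    # every three requests.
    server 10.0.0.11 weight=2;
    server 10.0.0.12 weight=1;
}
```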
Least Connections
Directs traffic to the server with the fewest active connections:
```nginx
upstream app_backend {
    least_conn;
    server 10.0.0.11;
    server 10.0.0.12;
}
```

Useful when:
- Backend requests vary significantly in duration.
- You want to avoid overloading a server that already has many long‑running requests.
IP Hash
Tries to send requests from the same client IP to the same backend, providing basic session affinity:
```nginx
upstream app_backend {
    ip_hash;
    server 10.0.0.11;
    server 10.0.0.12;
    # server 10.0.0.13 down;  # allowed with ip_hash
}
```

Notes:
- Good when you can’t or don’t want to store sessions centrally.
- Breaks if clients share IPs (NAT, corporate proxies).
- Adding/removing servers changes hash distribution, which can disrupt sessions.
Hash (Key‑Based Balancing)
More flexible than `ip_hash`; you define the key to hash on:
```nginx
upstream app_backend {
    hash $request_uri consistent;
    server 10.0.0.11;
    server 10.0.0.12;
}
```

Key options:

- `hash $request_uri;` – the same URI always goes to the same backend.
- `hash $cookie_sessionid;` – consistent routing by session cookie.
- `consistent` – enables consistent hashing, which minimizes key movement when servers change.
Useful for:
- Cache sharding across backends.
- Controlling affinity based on cookies, headers, or any Nginx variable (see the sketch below)
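
Because the hash key can be any Nginx variable, affinity can also be driven by a request header. A minimal sketch, assuming a hypothetical `X-Tenant-ID` header set by clients:

```nginx
upstream app_backend {
    # All requests carrying the same X-Tenant-ID value land on the
    # same backend. $http_x_tenant_id is empty when the header is
    # missing, so those requests all hash together; guard for that
    # in practice.
    hash $http_x_tenant_id consistent;
    server 10.0.0.11;
    server 10.0.0.12;
}
```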
Proxying HTTP Traffic
Once an upstream is defined, `proxy_pass` does the forwarding. Important directives in `location` blocks:
```nginx
location / {
    proxy_pass http://app_backend;

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    proxy_connect_timeout 5s;
    proxy_send_timeout 30s;
    proxy_read_timeout 30s;

    proxy_http_version 1.1;
    proxy_set_header Connection "";
}
```

Key considerations:

- Preserve the original `Host` header for virtual‑host‑aware apps.
- Pass the original client IP and protocol to backends.
- Tune timeouts to avoid holding connections too long.
- Use `proxy_http_version 1.1` and `Connection ""` to support keepalive connections to upstreams.
Health Checks
Passive Health Checks (Open Source)
Passive checks are enabled by default; failures increment the failure count:
```nginx
upstream app_backend {
    server 10.0.0.11 max_fails=3 fail_timeout=30s;
    server 10.0.0.12 max_fails=3 fail_timeout=30s;
}
```

Behavior:

- If a backend fails `max_fails` times within `fail_timeout`, Nginx marks it as failed for `fail_timeout`.
- After `fail_timeout` elapses, Nginx tries the server again.
Limitations:
- Backends are only probed when clients send requests; no background probing.
- The first few clients that hit a failing backend experience errors (one mitigation is sketched below).
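
A common way to soften this limitation, covered in more depth later in this chapter, is to let Nginx transparently retry a failed request on another backend:

```nginx
location / {
    proxy_pass http://app_backend;

    # On connection errors, timeouts, or selected 5xx responses,
    # retry the request on the next backend instead of surfacing
    # the error to the client.
    proxy_next_upstream error timeout http_502 http_503 http_504;
    proxy_next_upstream_tries 2;
}
```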
Active Health Checks (Nginx Plus)
Nginx Plus adds true active health monitoring:
- Dedicated health check locations
- HTTP status/content checks
- Interval and jitter controls
Conceptually:
```nginx
upstream app_backend {
    zone app_backend 64k;
    server 10.0.0.11;
    server 10.0.0.12;
}

# health_check is configured in the proxying location, not the upstream
location / {
    proxy_pass http://app_backend;
    health_check interval=5s fails=3 passes=2 uri=/healthz;
}
```

In open source Nginx, similar functionality can be approximated using:
- External monitors that rewrite the upstream configuration and reload Nginx (sketched below)
- Upstreams managed by service discovery (e.g., Consul, etcd) and templates
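
For the external‑monitor approach, one common pattern (paths and tool names are illustrative) is to keep the upstream definition in a file that the monitor or template engine rewrites, then reload:

```nginx
# /etc/nginx/conf.d/upstreams.conf -- rendered by an external tool
# such as consul-template; the monitor runs `nginx -s reload`
# whenever the healthy backend set changes.
upstream app_backend {
    server 10.0.0.11 max_fails=3 fail_timeout=30s;
    server 10.0.0.12 max_fails=3 fail_timeout=30s;
}
```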
Session Persistence (Sticky Sessions)
Session persistence is needed when your application relies on backend‑local state (e.g., in‑memory sessions).
Approaches in Open Source Nginx
- `ip_hash` – very simple, works only by client IP; not ideal when IPs are shared.
- Consistent hash on a cookie (preferred):
```nginx
upstream app_backend {
    hash $cookie_sessionid consistent;
    server 10.0.0.11;
    server 10.0.0.12;
}
```

Workflow:

- The app sets a session cookie like `sessionid=abc123`.
- Nginx hashes `$cookie_sessionid` and routes the request to a backend.
- As long as the cookie is stable and servers don’t change often, affinity is preserved.
- Sticky module (3rd‑party): some distro and package builds ship `sticky` or `sticky_cookie_insert` modules. Usage depends on the package; verify documentation and stability before production use.
Nginx Plus Sticky Sessions
Nginx Plus includes official sticky directives that enable:
- Cookie‑based persistence
- Route‑based persistence
- Time‑limited lifetime
If you need fully supported, advanced session persistence, this is one of the major Plus drivers.
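
For reference, cookie‑based persistence in Nginx Plus looks roughly like the following (cookie name and lifetime are illustrative):

```nginx
upstream app_backend {
    server 10.0.0.11;
    server 10.0.0.12;

    # Nginx Plus only: insert a cookie identifying the chosen backend;
    # subsequent requests carrying the cookie stick to that server.
    sticky cookie srv_id expires=1h path=/;
}
```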
SSL/TLS Termination and Load Balancing
Nginx is commonly used for SSL termination in front of non‑TLS backends.
Example: HTTPS at the edge, HTTP to backends:
```nginx
upstream app_backend {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 443 ssl http2;
    server_name example.com;

    ssl_certificate /etc/nginx/ssl/example.crt;
    ssl_certificate_key /etc/nginx/ssl/example.key;
    # (tune ssl_protocols, ciphers, etc. in practice)

    location / {
        proxy_pass http://app_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```

Key points:
- Offload CPU‑intensive TLS from backends to Nginx.
- Backends can run HTTP only, which simplifies their config.
- Use `X-Forwarded-Proto` and possibly `X-Forwarded-Host` so the app can reconstruct the original URL.
- Combine with HSTS, OCSP stapling, and modern ciphers for security.
TLS re‑encryption (HTTPS to backends) is also possible by using `proxy_pass https://...` and configuring `proxy_ssl_*` directives.
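
A minimal re‑encryption sketch, assuming the upstream servers listen for HTTPS and with an illustrative CA bundle path:

```nginx
location / {
    proxy_pass https://app_backend;

    # Verify the backend certificate against an internal CA bundle
    # and send SNI so the backend can select the right certificate.
    proxy_ssl_verify on;
    proxy_ssl_trusted_certificate /etc/nginx/ssl/internal-ca.crt;
    proxy_ssl_server_name on;
}
```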
TCP/UDP Load Balancing with the `stream` Module
Nginx can load balance generic TCP and UDP traffic (e.g., databases, mail, custom protocols) via the `stream` context.
Example: Load‑balancing MySQL traffic:
```nginx
stream {
    upstream mysql_backend {
        least_conn;
        server db1.internal:3306 max_fails=3 fail_timeout=30s;
        server db2.internal:3306 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 3306;
        proxy_pass mysql_backend;
    }
}
```

For UDP (e.g., DNS):
```nginx
stream {
    upstream dns_servers {
        server 10.0.0.11:53;
        server 10.0.0.12:53;
    }

    server {
        listen 53 udp;
        proxy_pass dns_servers;
    }
}
```

Considerations:

- No HTTP awareness; it’s pure L4 proxying.
- Some balancing methods (e.g., `least_conn`) are also applicable in `stream`.
- Timeouts and tuning use `proxy_connect_timeout`, `proxy_timeout`, etc., but in the `stream` context (see the sketch below).
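
A minimal sketch of those timeouts, reusing the `mysql_backend` upstream from above (values are illustrative):

```nginx
stream {
    server {
        listen 3306;
        proxy_pass mysql_backend;

        # Fail fast if a backend TCP connection cannot be established.
        proxy_connect_timeout 3s;
        # Close the session after 10 minutes with no data in either direction.
        proxy_timeout 10m;
    }
}
```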
Advanced Configuration Patterns
Per‑Location Upstreams
You can route different URL paths to different upstreams:
```nginx
upstream api_backend {
    server 10.0.0.21;
    server 10.0.0.22;
}

upstream web_backend {
    server 10.0.0.31;
    server 10.0.0.32;
}

server {
    listen 80;
    server_name example.com;

    location /api/ {
        proxy_pass http://api_backend;
    }

    location / {
        proxy_pass http://web_backend;
    }
}
```

Useful for:

- Splitting monolith vs. microservice traffic
- Versioned APIs (`/v1/`, `/v2/`) going to different clusters
Circuit‑Breaker‑Like Behavior with `proxy_next_upstream`
`proxy_next_upstream` can help skip failing servers and retry on others:
```nginx
location / {
    proxy_pass http://app_backend;
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
    proxy_next_upstream_tries 2;
}
```

Behavior:

- On the specified errors, Nginx retries the request on a different backend, up to `proxy_next_upstream_tries` attempts.
- Be careful with non‑idempotent methods (e.g., POST) to avoid duplicate actions on the backend; see the note after this list.
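
Note that since Nginx 1.9.13, requests with non‑idempotent methods (POST, LOCK, PATCH) are not retried by default; retrying them is an explicit opt‑in:

```nginx
location / {
    proxy_pass http://app_backend;

    # non_idempotent opts POST/LOCK/PATCH into retries -- only safe
    # if the backend can deduplicate, since the first attempt may
    # already have taken effect.
    proxy_next_upstream error timeout http_502 http_503 http_504 non_idempotent;
    proxy_next_upstream_tries 2;
}
```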
Connection Reuse and Keepalive to Upstreams
Reduce latency and resource usage by reusing TCP connections to backends:
```nginx
upstream app_backend {
    server 10.0.0.11;
    server 10.0.0.12;

    keepalive 64;
}

server {
    listen 80;

    location / {
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_pass http://app_backend;
    }
}
```

Notes:

- `keepalive N` sets the maximum number of idle keepalive connections to upstream servers that each worker process keeps open.
- `Connection ""` clears the header so keepalive is used.
- Good for high‑traffic environments with many short‑lived HTTP requests.
Observability and Metrics
Nginx can surface load‑balancer‑related statistics that are critical in production.
Stub Status (Open Source)
Basic connections and request stats:
```nginx
location /nginx_status {
    stub_status;

    allow 127.0.0.1;
    deny all;
}
```

These counters are per‑instance; combine them with external monitoring to build cluster‑wide dashboards.
Extended Metrics (Nginx Plus)
Nginx Plus includes a status API (`/api`), enabled as sketched after the list below, exposing:
- Per‑upstream, per‑server connection counts and health
- Request, error, and bandwidth statistics
- JSON output for integration with Prometheus, Grafana, etc.
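
Enabling the API is roughly the following (Nginx Plus only; the listen port and access controls are illustrative):

```nginx
server {
    listen 8080;

    location /api {
        # Read-only JSON metrics API (Nginx Plus).
        api write=off;
        allow 127.0.0.1;
        deny all;
    }
}
```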
In open source, similar visibility can be approximated by:
- Access logs with custom formats
- `error_log` levels and log parsing
- External TCP/HTTP health monitors
Common Pitfalls and Tuning Tips
- Timeouts too high: clients may hang for a long time when a backend fails. Set realistic `proxy_connect_timeout`, `proxy_send_timeout`, and `proxy_read_timeout` values.
- Incorrect client IP: if you don’t set `X-Real-IP` and `X-Forwarded-For`, backends only see Nginx’s IP, breaking rate limiting and logging.
- Session persistence vs. scaling: IP‑ or cookie‑based stickiness can fight autoscaling; combine it with centralized session storage when possible.
- Large request bodies: tune `client_max_body_size` and buffering (`client_body_buffer_size`, `proxy_request_buffering`) to avoid unexpected 413 errors or memory pressure; see the sketch after this list.
- Dynamic backends: if the backend list changes frequently (Kubernetes, autoscaling groups), use templating or service‑discovery integrations to regenerate `upstream` blocks and reload Nginx gracefully.
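
A minimal sketch of the request‑body settings mentioned above (limits and paths are illustrative):

```nginx
server {
    # Reject bodies larger than 20 MB with 413 instead of buffering them.
    client_max_body_size 20m;

    location /upload/ {
        proxy_pass http://app_backend;
        # Stream request bodies to the backend as they arrive instead
        # of buffering them to disk first -- useful for large uploads.
        proxy_request_buffering off;
    }
}
```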
Minimal End‑to‑End Example
A simple, production‑adjacent HTTP load‑balancer config tying concepts together:
```nginx
user nginx;
worker_processes auto;

events {
    worker_connections 1024;
}

http {
    # Logging
    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" '
                    'upstream:$upstream_addr rt=$request_time urt=$upstream_response_time';

    access_log /var/log/nginx/access.log main;

    upstream app_backend {
        least_conn;
        server 10.0.0.11 max_fails=3 fail_timeout=30s;
        server 10.0.0.12 max_fails=3 fail_timeout=30s;
        keepalive 64;
    }

    server {
        listen 80;
        server_name example.com;

        location /health {
            return 200 "OK\n";
        }

        location / {
            proxy_pass http://app_backend;

            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            proxy_http_version 1.1;
            proxy_set_header Connection "";

            proxy_connect_timeout 5s;
            proxy_send_timeout 30s;
            proxy_read_timeout 30s;

            proxy_next_upstream error timeout http_502 http_503 http_504;
            proxy_next_upstream_tries 2;
        }
    }
}
```

This configuration:
- Defines an upstream with least‑connections balancing and passive health checks
- Reuses connections to backends
- Preserves client information
- Has basic retry behavior on upstream errors
- Logs enough data to reason about load‑balancer performance and issues
From here, you can extend with HTTPS termination, session persistence, or TCP/UDP stream balancing according to your application’s needs.