Core Ideas of a Reverse Proxy
A reverse proxy is a server that receives client requests and forwards them to one or more backend servers, then returns the responses to the client as if the proxy itself had produced them.
Key points that distinguish a reverse proxy from direct access to an application:
- Clients connect only to the reverse proxy, never directly to the backend.
- The backend servers can be hidden (private IPs, non-public networks).
- The proxy can modify requests and responses (headers, URLs, compression, caching).
- The same proxy can serve multiple sites and applications.
Do not confuse this with a forward proxy (used by clients to reach arbitrary sites on the internet); a reverse proxy sits in front of your servers, not in front of the client.
Typical technologies used as reverse proxies:
- Nginx
- Apache HTTPD (`mod_proxy`, `mod_proxy_http`, etc.)
- HAProxy
- Envoy, Traefik, etc.
(Their specific configuration is covered in their own chapters.)
Common Use Cases
1. Central Entry Point and Load Balancing
A reverse proxy is often the “front door” to an application cluster.
Typical pattern:
- Public DNS: `app.example.com` → reverse proxy IP
- Reverse proxy forwards to one of several backend servers: `10.0.0.11:8080`, `10.0.0.12:8080`, `10.0.0.13:8080`
Benefits:
- Single public IP / hostname, multiple backends.
- Horizontal scaling: add/remove backend nodes without changing DNS.
- Health checks: remove unhealthy backends from the pool.
- Different load-balancing strategies (round robin, least connections, etc.).
Load balancing logic is part of the reverse proxy; “Load Balancing” has its own section later, but conceptually:
- Reverse proxy is the component that can implement HTTP load balancing.
- It decides which backend to send a request to, based on policy and health.
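The selection policies mentioned above can be sketched in a few lines. The following Python class is purely illustrative (backend addresses and the health-check mechanism are assumed, not taken from any real proxy):

```python
from itertools import cycle

class BackendPool:
    """Minimal sketch of reverse-proxy backend selection (illustrative only)."""

    def __init__(self, backends):
        self.backends = list(backends)               # e.g. ["10.0.0.11:8080", ...]
        self.healthy = set(self.backends)            # updated by health checks
        self.active = {b: 0 for b in self.backends}  # in-flight request counts
        self._rr = cycle(self.backends)

    def mark_down(self, backend):
        """Health check failed: remove the backend from rotation."""
        self.healthy.discard(backend)

    def pick_round_robin(self):
        # Cycle through backends, skipping unhealthy ones.
        for _ in range(len(self.backends)):
            b = next(self._rr)
            if b in self.healthy:
                return b
        raise RuntimeError("no healthy backends")

    def pick_least_connections(self):
        # Choose the healthy backend with the fewest in-flight requests.
        candidates = [b for b in self.backends if b in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy backends")
        return min(candidates, key=lambda b: self.active[b])
```

A real proxy tracks the `active` counts and health state itself; the point here is only that the policy is a small, swappable decision function.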
2. TLS Termination (SSL Offload)
The reverse proxy often handles TLS encryption/decryption:
- Client ↔ Reverse proxy: HTTPS (TLS)
- Reverse proxy ↔ Backend: HTTP (plain) or HTTPS (internal CA)
Advantages:
- Centralized certificate management:
- One place to install/renew certificates (Let’s Encrypt, corporate CA, etc.).
- Offload CPU-heavy TLS operations from application servers.
- Unified TLS configuration (cipher suites, protocols, HSTS).
Some deployments also use end-to-end encryption:
- Client → Proxy: HTTPS
- Proxy → Backend: HTTPS
In that case the reverse proxy may re-encrypt traffic after inspecting/rewriting it.
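As a sketch of what "unified TLS configuration" can mean in code, the snippet below builds the kind of server-side TLS context a Python-based proxy could use to terminate client HTTPS. The certificate and key paths are hypothetical placeholders; a real deployment would point at certificates from Let's Encrypt or a corporate CA:

```python
import ssl

def make_edge_tls_context(certfile=None, keyfile=None):
    """Server-side TLS context for terminating client HTTPS at the proxy.

    certfile/keyfile are placeholder paths; when omitted, the context is
    built without a certificate (useful only for inspecting the policy).
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # disable legacy protocols
    if certfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx
```

The value of doing this at the proxy is that protocol minimums and cipher policy live in exactly one place, instead of in every application server.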
3. HTTP Routing and Path-Based Dispatch
Reverse proxies can route traffic to different backends based on:
- Hostname (virtual hosts): `api.example.com` vs `www.example.com`
- URL path: `/api/` vs `/static/` vs `/admin/`
- HTTP method: `GET` vs `POST` (less common, but possible)
- Headers: `X-Requested-With`, `User-Agent`, etc.
Examples:
- `/api/` → backend API service (e.g. on port 9000)
- `/static/` → static file server or object storage proxy
- `/app1/` vs `/app2/` → different applications behind the same domain
This kind of “smart routing” is a main reason modern applications centralize all traffic through a reverse proxy.
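The dispatch logic above boils down to a routing function from (host, path) to an upstream. A minimal sketch, with all hostnames and backend addresses being illustrative assumptions:

```python
def route(host: str, path: str) -> str:
    """Pick an upstream address for a request: host first, then path prefix.

    All addresses and hostnames here are made-up examples.
    """
    if host == "api.example.com":
        return "127.0.0.1:9000"   # dedicated API service (virtual host)
    if path.startswith("/api/"):
        return "127.0.0.1:9000"   # same API service via path prefix
    if path.startswith("/static/"):
        return "127.0.0.1:9100"   # static file server
    return "127.0.0.1:8080"       # default web application
```

Real proxies express the same idea declaratively (server blocks, `location` rules, router labels), but the evaluation order — most specific match first — is the same concern.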
4. Application Firewalling and Security Layer
A reverse proxy acts as a security layer:
- Hides topology: clients see only the proxy, not internal IPs and port structure.
- IP-based and header-based access controls:
- Allow admin interface only from certain subnets.
- Block suspicious user agents or bad bots.
- Integration with Web Application Firewalls (WAF):
- Request inspection, rate limiting, signature-based attack filtering.
- Request normalization:
- Strip/modify dangerous headers.
- Enforce consistent URL encodings.
Example protective patterns:
- Block large request bodies or very long URLs to avoid certain DoS styles.
- Force HTTPS by redirecting all HTTP requests to HTTPS.
- Implement global rate limiting (per IP/user) at the edge rather than in each app.
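The edge rate-limiting pattern is commonly a token bucket keyed by client IP. A minimal sketch (the rate and burst numbers are illustrative defaults, not recommendations):

```python
import time
from collections import defaultdict

class PerIpRateLimiter:
    """Token-bucket limiter of the kind a reverse proxy applies at the edge."""

    def __init__(self, rate: float = 10.0, burst: float = 20.0):
        self.rate = rate    # tokens refilled per second
        self.burst = burst  # bucket capacity (allowed burst size)
        self.buckets = defaultdict(
            lambda: {"tokens": burst, "ts": time.monotonic()})

    def allow(self, client_ip: str) -> bool:
        b = self.buckets[client_ip]
        now = time.monotonic()
        # Refill tokens proportionally to elapsed time, capped at burst.
        b["tokens"] = min(self.burst, b["tokens"] + (now - b["ts"]) * self.rate)
        b["ts"] = now
        if b["tokens"] >= 1.0:
            b["tokens"] -= 1.0
            return True
        return False  # caller would answer 429 Too Many Requests
```

Doing this once at the proxy means every application behind it gets the same protection without implementing its own limiter.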
5. Caching and Performance Optimization
Reverse proxies can greatly improve performance:
- Static caching: Images, CSS, JS, and other static assets.
- Dynamic caching: Cache responses from dynamic endpoints with appropriate cache headers.
- Compression:
- Gzip/Brotli compression on responses.
- Optionally decompress and recompress between backend and client.
- Connection pooling / keep-alive:
- Fewer TCP connections to backends.
- Reuse persistent connections to reduce latency.
Effects:
- Decreased load on application servers.
- Reduced bandwidth usage.
- Faster page load times, especially over slow networks.
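Proxy-side compression, for instance, is a small decision around the response body. A sketch, where the `min_size` threshold is an assumption (compressing tiny responses usually costs more than it saves):

```python
import gzip

def maybe_compress(body: bytes, accept_encoding: str, min_size: int = 256):
    """Compress a response at the proxy if the client supports gzip.

    Returns (body, content_encoding); content_encoding is None when the
    body is passed through unchanged.
    """
    if "gzip" in accept_encoding and len(body) >= min_size:
        return gzip.compress(body), "gzip"
    return body, None
```

Production proxies additionally consult the response `Content-Type` (compressing already-compressed images is wasted CPU) and set `Vary: Accept-Encoding` on the response.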
6. Protocol Translation and Normalization
Reverse proxies can sit between clients and backends that “speak” slightly different HTTP dialects or even other protocols.
Examples:
- HTTP/2 on the public side, HTTP/1.1 to the backend.
- WebSocket upgrades:
- Proxy handles `Upgrade: websocket` and forwards the raw TCP stream.
- gRPC over HTTP/2 on the edge, various internal microservices behind it.
- Terminate non-HTTP protocols at a proxy that supports them and convert to HTTP internally (depends on software).
They can also add or normalize headers, such as:
- `X-Forwarded-For` – chain of client IPs through proxies.
- `X-Forwarded-Proto` – original scheme (`http` or `https`).
- `X-Forwarded-Host` – original `Host` header.
- Standardized `Forwarded` header as defined by RFC 7239.
Backends must often be configured to trust these headers (or only trust them from known proxies) if they use them to determine client IP or scheme.
Reverse Proxy vs. Other Architectural Components
Reverse Proxy vs. Load Balancer
Many products are both; conceptually:
- Load Balancer:
- Focus on distributing traffic among multiple backends.
- Often layer 4 (TCP) or layer 7 (HTTP) devices.
- Reverse Proxy:
- Focus on application-level features: routing, caching, rewriting, compression, WAF.
In practice:
- A reverse proxy implements HTTP load balancing.
- A network load balancer can send traffic to a fleet of reverse proxies.
Reverse Proxy vs. API Gateway
An API gateway is essentially a specialized reverse proxy for APIs, with extra concerns:
- Authentication and authorization (API keys, OAuth2, JWT).
- Request/response transformation, versioning, quotas, and analytics.
- Multi-tenant routing between microservices.
Conceptually:
- API gateways are domain-specific reverse proxies with additional features tailored to web APIs.
Reverse Proxy vs. Forward Proxy
Core differences:
- Forward proxy:
- Deployed near the client.
- Client is usually configured to use it.
- Used for outbound access control, caching web browsing, anonymization.
- Reverse proxy:
- Deployed near the server.
- Client is often unaware; uses normal DNS and HTTP.
- Used to protect, accelerate, and route to backend applications.
Typical Reverse Proxy Topologies
1. Single Reverse Proxy in Front of One Application
Simple pattern:
client → reverse proxy → app server
Use cases:
- TLS termination.
- Basic request logging and rate limiting.
- Simplest step to introduce caching and unified access logs.
2. Single Reverse Proxy in Front of Multiple Applications
- One proxy, many backends:
- Different virtual hosts: `app1.example.com`, `app2.example.com`
- Or path-based mapping: `/app1/`, `/app2/`
Advantages:
- Centralized cross-cutting concerns: TLS, compression, headers, security.
- Easier operations: one place to update domain-level settings.
3. Reverse Proxies Behind a Layer-4 Load Balancer
For high availability:
- L4 load balancer distributes TCP connections across multiple reverse proxy instances.
- Each reverse proxy in turn does HTTP-level routing/load balancing to backends.
Benefits:
- Redundancy at the proxy tier.
- Scaling of the reverse proxy layer itself.
4. Reverse Proxies in Microservices Environments
Patterns:
- Edge reverse proxy (or API gateway):
- Single public entry, routes to many microservices.
- Sidecar proxies:
- One proxy per service instance (used by service meshes like Istio/Envoy).
- Manage service-to-service communication, metrics, mTLS.
From the server-admin perspective, the “edge proxy” is usually the focus.
Core Concepts: Headers, IPs, and Identity
Client IP and `X-Forwarded-For`
Because all traffic arrives at the backend from the reverse proxy, the backend’s direct `remote_addr` is the proxy IP.
To preserve the real client IP, proxies add a header:
X-Forwarded-For: client_ip, proxy1_ip, proxy2_ip, ...
Best practices:
- Only trust `X-Forwarded-For` on internal networks or from known proxies.
- Configure your web server/framework to log and use the right IP.
- Some proxies now support the standardized `Forwarded` header, but `X-Forwarded-*` is still common.
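The "only trust known proxies" rule can be made concrete: walk the `X-Forwarded-For` chain from the right, skipping trusted proxy hops, and take the first untrusted address as the client. A sketch (the trusted-proxy set is an assumption you must supply from your own topology):

```python
def real_client_ip(remote_addr: str, xff: str, trusted: set) -> str:
    """Derive the client IP from X-Forwarded-For, trusting only known proxies.

    If the direct peer is not a trusted proxy, the header is ignored
    entirely, since anyone on the internet can set it to any value.
    """
    if remote_addr not in trusted or not xff:
        return remote_addr
    hops = [h.strip() for h in xff.split(",")]
    # Rightmost entries were appended by our own proxies; skip them.
    for ip in reversed(hops):
        if ip not in trusted:
            return ip
    return remote_addr
```

Most frameworks implement exactly this behind a "trusted proxies" or "forwarded allow list" setting; the sketch just shows why that setting matters.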
Original Scheme and Host
With TLS termination and host-based routing:
- Backends often need to know if the original request was HTTPS.
- They might need the “public” hostname to generate absolute URLs.
Common headers:
- `X-Forwarded-Proto: http|https`
- `X-Forwarded-Host: original_host`
- `X-Forwarded-Port: 80|443|...`
Backends use these values for:
- Correct redirect locations.
- Building absolute URLs in responses.
- Applying security decisions like “only allow if HTTPS”.
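Rebuilding the client-facing URL from these headers looks roughly like this sketch (it assumes the headers arrive only from trusted proxies, per the earlier caveat):

```python
def public_url(headers: dict, path: str) -> str:
    """Reconstruct the original client-facing URL behind a TLS-terminating proxy."""
    proto = headers.get("X-Forwarded-Proto", "http")
    host = headers.get("X-Forwarded-Host", headers.get("Host", "localhost"))
    port = headers.get("X-Forwarded-Port")
    default = {"http": "80", "https": "443"}[proto]
    # Omit the port when it is the scheme's default.
    netloc = host if (port is None or port == default) else f"{host}:{port}"
    return f"{proto}://{netloc}{path}"
```

Without this, a backend that only sees plain HTTP on port 8080 would emit redirects and absolute links pointing at the wrong scheme and host.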
Path and Header Rewriting
Reverse proxies frequently modify (rewrite) requests or responses.
URI Rewriting
Typical scenarios:
- Public URL: `https://example.com/app/`
- Backend application expects to be served from `/` (root context).
The proxy can:
- Strip `/app` from the path when forwarding to the backend.
- Optionally rewrite `Location` headers or HTML content in responses so that links still work from `/app`.
You must be careful with:
- Trailing slashes and double slashes.
- Relative vs absolute URLs.
- Redirect loops (e.g. backend redirects to `/`, proxy adds `/app/`, etc.).
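The two directions of rewriting can be sketched as a pair of functions; the `/app` prefix is the illustrative example from above:

```python
def strip_prefix(path: str, prefix: str = "/app") -> str:
    """Request direction: /app/x on the proxy becomes /x on the backend."""
    if path == prefix or path.startswith(prefix + "/"):
        stripped = path[len(prefix):]
        return stripped or "/"   # bare /app maps to the backend root
    return path

def rewrite_location(location: str, prefix: str = "/app") -> str:
    """Response direction: re-add the prefix to backend redirects so the
    client stays under /app instead of entering a redirect loop."""
    if location.startswith("/"):
        return prefix + location
    return location  # absolute URLs are left alone in this sketch
```

Note how `strip_prefix("/app")` must map to `/` rather than the empty string, and how a backend redirect to `/` comes back out as `/app/` — exactly the trailing-slash and loop pitfalls listed above.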
Header Injection/Filtering
The proxy can:
- Add security headers:
- `Strict-Transport-Security`
- `X-Content-Type-Options`
- `Content-Security-Policy` (if applied globally).
- Set caching headers: `Cache-Control`, `Expires`, `ETag`, `Last-Modified`.
- Strip sensitive internal headers from backend responses before they reach the client.
This separation allows backends to focus on application logic, while the proxy enforces consistent edge policies.
Caching Behavior and Considerations
A reverse proxy acting as a cache observes HTTP caching headers:
- `Cache-Control`
- `Expires`
- `ETag`
- `Last-Modified`
Important concepts:
- Shared cache: reverse proxy is shared by many users; “private” content should not be cached there.
- Varying on headers:
- `Vary: Accept-Encoding` for gzip vs plain text.
- `Vary: Authorization` or `Cookie` (often means content should not be cached in a shared proxy, or should be carefully segmented).
- Stale-while-revalidate / stale-if-error:
- Some proxies support serving slightly outdated content while fetching new content, or serving stale content on backend errors to improve availability.
Tune caching carefully; misconfiguration can leak private data or cause users to see each other’s responses.
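The shared-cache rule above can be sketched as a cacheability check. This is a deliberate simplification of the real HTTP caching rules (RFC 9111), shown only to make the `private`/`no-store`/`Vary` interactions concrete:

```python
def shared_cache_ok(response_headers: dict) -> bool:
    """Rough check: may a shared (proxy) cache store this response?

    A simplification of RFC 9111, not a complete implementation.
    """
    cc = response_headers.get("Cache-Control", "").lower()
    directives = {d.strip().split("=")[0] for d in cc.split(",") if d.strip()}
    if {"private", "no-store"} & directives:
        return False  # per-user or explicitly uncacheable content
    vary = response_headers.get("Vary", "")
    if any(h.strip().lower() in ("authorization", "cookie")
           for h in vary.split(",")):
        return False  # effectively per-user responses
    # Require an explicit opt-in rather than caching by default.
    return bool({"public", "max-age", "s-maxage"} & directives)
```

Erring on the side of "not cacheable", as the final line does, is the safe default: a missed cache hit costs latency, while wrongly caching a private response leaks data between users.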
Security Implications and Pitfalls
Benefits
- Reduced attack surface: no direct access to backends.
- Centralized enforcement of:
- TLS policies.
- IP allow/deny lists.
- Rate limiting.
- WAF rules.
- Easier patching: edge software and configuration can mitigate some vulnerabilities quickly (e.g. path filtering, header removal).
Common Mistakes
- Incorrect trust of headers:
- Treating `X-Forwarded-For` as authoritative even when it can be spoofed (e.g. if you accept traffic from the open internet directly).
- Leaking internal details:
- Misconfigured error pages exposing backend hostnames or paths.
- Open proxies:
- A reverse proxy misconfigured as a generic web proxy can be abused to relay traffic.
- Bypass of security checks:
- Separate paths or hostnames not properly filtered or authenticated.
Always combine reverse proxy rules with a clear threat model and, when possible, defense in depth on the backend.
Operational Concerns
Observability: Logs and Metrics
Reverse proxies are prime sources for:
- Access logs:
- Central record of all HTTP requests.
- Can be fed into log aggregation / SIEM.
- Error logs:
- Edge-level errors (bad requests, upstream failures).
- Metrics:
- Request rates, latency distributions, error percentages.
- Backend health statuses.
Good practice:
- Log key fields: client IP, request line, status, bytes, referer, user agent, upstream time, upstream address.
- Use consistent formats across proxy instances.
Configuration Management
Reverse proxy configurations can grow complex:
- Many virtual hosts.
- Dozens of routing rules.
- Different TLS certs, ciphers, and policies.
Managing them reliably often involves:
- Configuration management tools (Ansible, Puppet, Chef).
- Template-driven configuration and environment-specific parameterization.
- Staging environments to test config changes before production.
- Automated reloads with zero downtime (supported by most modern proxies).
High Availability
To avoid a single point of failure:
- Run multiple instances of your reverse proxy.
- Put them behind:
- A layer-4 load balancer, or
- VRRP/keepalived with floating IPs, or
- Cloud-native load balancers.
Also consider:
- Session persistence requirements (if any) and how to manage them (covered under load balancing).
- Shared configuration and certificates across instances.
When to Use a Reverse Proxy
A reverse proxy is especially appropriate when:
- You have multiple backend services or applications to expose.
- You need TLS termination and centralized certificate management.
- You want to introduce caching, compression, and performance optimization at the edge.
- You require WAF functionality and consistent security policies.
- You are planning or running a microservices architecture.
- You need to hide internal infrastructure structure from the public internet.
In modern server administration, a reverse proxy is almost always part of the web stack; understanding its concepts is key to designing robust, secure, and scalable services.