Welcome to our deep dive into NGINX caching. In the landscape of web performance, NGINX stands out as a versatile, high-performance server. Originally created to solve the "C10k problem"—handling ten thousand concurrent connections on a single machine—NGINX has evolved from a simple web server into a reverse proxy, a load balancer, and, for our focus today, a highly effective caching server. Its event-driven, asynchronous architecture makes it incredibly efficient with system resources, particularly CPU and memory. For context, a minimal NGINX caching setup for a small on-premise environment can comfortably run on a single CPU core with just 512 MB of RAM (Jainandunsing, 2025).
It's crucial to distinguish NGINX's role from other caching solutions you may have studied, such as Redis or Memcached. While Redis and Memcached are in-memory key-value stores, excelling at caching application data, database query results, and user session objects, NGINX operates at the HTTP level. As a reverse proxy, it sits between the client (the user's browser) and your backend application servers. This strategic position allows it to intercept HTTP requests and serve responses directly from its cache without ever needing to contact the backend server, dramatically reducing response times and server load. Jainandunsing (2025) notes that while NGINX doesn't directly cache user sessions like Redis, it excels at caching HTTP session-related content, such as authenticated pages and API responses, by inspecting headers or cookies.
NGINX's caching capabilities are primarily provided by the `ngx_http_proxy_module`. To enable caching, you must understand a set of core directives that define how the cache operates. Let's dissect the most important ones.
proxy_cache_path

This is the foundational directive; it defines the cache itself. It must be declared in the `http` context of your NGINX configuration (i.e., outside of any `server` or `location` block). It configures the physical path on disk where cached files will be stored and sets up a shared memory zone to hold the cache keys and metadata.
A typical `proxy_cache_path` declaration looks like this:
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m use_temp_path=off;
Let's break down its parameters:
/var/cache/nginx: This is the local filesystem path for storing cached content. You must ensure this directory exists and that the NGINX worker process user (commonly `www-data`) has read and write permissions.

levels=1:2: This is a critical performance parameter. It defines the directory structure for the cache. NGINX hashes the cache key (by default, a string built from the scheme, proxied host, and request URI) to create a filename, and the `levels` parameter tells NGINX to create a tiered directory structure based on that hash. For `levels=1:2`, NGINX takes the last character of the hash for the top-level directory name and the next two characters for the subdirectory name. For example, a file with a hash ending in `...def` would be stored at `/var/cache/nginx/f/de/filename`. This prevents having millions of files in a single directory, which can severely degrade filesystem performance.

keys_zone=my_cache:10m: This creates a shared memory zone named `my_cache` with a size of 10 megabytes. This zone stores all active cache keys and metadata about the cached data; it does not store the response data itself. The size is crucial: if it's too small, NGINX will start removing older cache entries to make space for new ones, even if they haven't expired, a process known as cache eviction. A 1 MB zone can hold approximately 8,000 keys, so a 10 MB zone can hold roughly 80,000 keys. Size this based on your expected number of cached items.

max_size=10g: This sets the upper limit on the size of the on-disk cache, in this case 10 gigabytes. If the cache size exceeds this limit, a special process called the "cache manager" activates to remove the least recently used items and bring the size back under the limit.

inactive=60m: This specifies how long an item can remain in the cache without being accessed. If a cached item isn't requested for 60 minutes, it will be removed by the cache manager, regardless of whether it has expired according to its `Cache-Control` header. This is useful for clearing unpopular content out of the cache.

use_temp_path=off: An important optimization. By default, NGINX first writes responses to a temporary file location and then moves them into the cache directory. Setting this to `off` tells NGINX to write files directly into the cache directory, avoiding an unnecessary disk I/O operation (copying the file) and improving performance.

proxy_cache

This directive, used within a `server` or `location` block, enables caching and specifies which cache zone to use. The name must match the name defined in `keys_zone` in the `proxy_cache_path` directive.
proxy_cache my_cache;
Once this is set, NGINX will start caching eligible responses for requests matching this location.
proxy_cache_valid

This directive sets the default caching time for different HTTP response codes. You can use multiple instances of this directive.
proxy_cache_valid 200 302 60m;
proxy_cache_valid 404 1m;
In this example, responses with a `200 OK` or `302 Found` status will be cached for 60 minutes, and `404 Not Found` responses will be cached for 1 minute to prevent repeatedly hitting the backend for non-existent resources. Note that, by default, `Cache-Control`, `Expires`, and `X-Accel-Expires` headers sent by the backend take precedence over `proxy_cache_valid`; if you want these values to apply regardless, you must tell NGINX to ignore those headers.
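For example, assuming you want `proxy_cache_valid` to be authoritative even when the backend sends its own caching headers, you could add:

# Ignore the backend's caching headers so proxy_cache_valid controls lifetimes
proxy_ignore_headers Cache-Control Expires;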
proxy_cache_bypass and proxy_no_cache

These directives give you fine-grained control over when to skip caching. They both take one or more string parameters. If any parameter is not empty and not "0", the condition is met.
proxy_cache_bypass: If the condition is met, NGINX will go directly to the backend server for a response, effectively treating it as a `MISS`. However, if the response is cacheable, it will still be saved to the cache for subsequent requests. This is useful for "refresh" actions.

proxy_no_cache: If the condition is met, NGINX will not save the response to the cache, even if it's fetched from the backend and would otherwise be considered cacheable.

A common use case is to bypass the cache for logged-in users, identified by a session cookie:
proxy_cache_bypass $cookie_sessionid;
proxy_no_cache $cookie_sessionid;
Here, if the `sessionid` cookie is present in the request, NGINX will bypass the cache to get a fresh response from the backend (`proxy_cache_bypass`), and it will not save that personalized response to the cache (`proxy_no_cache`).
add_header X-Cache-Status

This is not a caching directive per se, but it is indispensable for debugging. It adds a custom header to the response sent to the client, indicating the cache status.
add_header X-Cache-Status $upstream_cache_status;
The `$upstream_cache_status` variable can have several values:
HIT: The response was served directly from the cache. This is what you want to see!

MISS: The response was not in the cache and was fetched from the backend server. The response may now be cached.

EXPIRED: The entry in the cache had expired and was re-fetched from the backend.

BYPASS: The response was fetched from the backend because the request matched a `proxy_cache_bypass` condition.

STALE: The response was served from a stale cache entry because `proxy_cache_use_stale` was in effect.

Let's put theory into practice. We'll set up a basic NGINX caching reverse proxy on an Ubuntu/Debian system. First, install NGINX:
sudo apt update
sudo apt install nginx -y
Next, create the cache directory and give the NGINX worker user ownership of it:
sudo mkdir -p /var/cache/nginx
sudo chown -R www-data:www-data /var/cache/nginx
Failure to set the correct permissions is one of the most common setup errors.
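To double-check, inspect the directory's ownership and confirm the worker user can write to it (the output below is illustrative and assumes the default `www-data` user):

ls -ld /var/cache/nginx
drwxr-xr-x 2 www-data www-data 4096 Jan  1 12:00 /var/cache/nginx

With the directory in place, open the main NGINX configuration file to define the cache zone: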
sudo nano /etc/nginx/nginx.conf
Inside the `http` block, but outside any `server` blocks, add:
http {
# ... other http settings ...
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=1g inactive=60m use_temp_path=off;
# ... include sites-enabled/*; ...
}
Next, configure your specific site to use the cache. Edit your site's configuration file:
sudo nano /etc/nginx/sites-available/default
Modify the `server` block to act as a caching proxy. Assume your backend application is running on `http://127.0.0.1:8080`.
server {
listen 80;
server_name your_domain.com;
location / {
proxy_pass http://127.0.0.1:8080;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Caching directives
proxy_cache my_cache;
proxy_cache_valid 200 302 10m;
proxy_cache_valid 404 1m;
# Bypass for logged-in users
proxy_cache_bypass $cookie_sessionid;
proxy_no_cache $cookie_sessionid;
# Add cache status header for debugging
add_header X-Cache-Status $upstream_cache_status;
}
}
Test the configuration for syntax errors:
sudo nginx -t
If it reports success, restart NGINX to apply the changes.
sudo systemctl restart nginx
Now request a page through the proxy and inspect the response headers:
curl -I http://your_domain.com/some-page
The first time you run this command, you should see `X-Cache-Status: MISS`. Run it a second time, and you should see `X-Cache-Status: HIT`. Success! Your cache is working.
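You can also confirm the bypass behaviour for logged-in users. Assuming the `sessionid` cookie name used in the configuration above, sending any value for that cookie should skip the cache and report `X-Cache-Status: BYPASS`:

curl -I --cookie "sessionid=test123" http://your_domain.com/some-page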
Here is a consolidated view of the necessary configuration snippets for a simple caching setup.
In `/etc/nginx/nginx.conf` (inside the `http` block):
# Defines the cache storage path, memory zone, size, and other parameters.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=1g inactive=60m use_temp_path=off;
In `/etc/nginx/sites-available/default` (or your site's config):
server {
listen 80;
server_name example.com;
location / {
# Backend application server address
proxy_pass http://127.0.0.1:8080;
# Pass essential headers to the backend
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
# Enable caching using the 'my_cache' zone
proxy_cache my_cache;
# Define cache validity for different response codes
proxy_cache_valid 200 302 10m; # Cache successful responses for 10 minutes
proxy_cache_valid any 1m; # Cache other responses (like 404s) for 1 minute
# Add a header to see the cache status (HIT, MISS, etc.)
add_header X-Cache-Status $upstream_cache_status;
}
}
Verification Commands:
# First request - should be a MISS
$ curl -I http://example.com/
HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
...
X-Cache-Status: MISS
...
# Second request - should be a HIT
$ curl -I http://example.com/
HTTP/1.1 200 OK
Server: nginx/1.18.0 (Ubuntu)
...
X-Cache-Status: HIT
...
NGINX was created by a Russian software engineer, Igor Sysoev, in the early 2000s. He was working at Rambler, a popular Russian web portal, and was frustrated with the performance of existing web servers like Apache under heavy load. The primary challenge he aimed to solve was the "C10k problem"—how to architect a server to handle ten thousand concurrent client connections. NGINX's event-driven, non-blocking architecture was his solution, and it proved so effective that it was open-sourced in 2004 and quickly became a cornerstone of high-performance web infrastructure worldwide.
Key takeaways:

The proxy_cache_path directive is the foundation, defining the cache's location, memory zone, and size limits. It is configured in the http context.

The proxy_cache directive enables caching within a specific `server` or `location` block, referencing the zone defined by `proxy_cache_path`.

Use proxy_cache_valid to set default cache durations for different HTTP status codes.

The add_header X-Cache-Status $upstream_cache_status; directive is essential for verifying and debugging your cache's behavior.

As your NGINX instance becomes a critical part of your infrastructure, serving content to users, its security becomes paramount. A compromised caching server can lead to several severe issues, including session hijacking, data leakage between users, and cache poisoning attacks, where an attacker injects malicious content into your cache, which is then served to legitimate users. Furthermore, since the cache often sits at the edge of your network, it's a prime target for attackers. Therefore, securing it involves multiple layers: encrypting data in transit, hardening the server configuration, controlling network access, and carefully managing how authenticated or personalized content is handled.
The first and most fundamental step in securing your web traffic is encrypting it with SSL/TLS, enabling HTTPS. In the past, this was a costly and complex process involving purchasing certificates from a Certificate Authority (CA). Today, thanks to Let's Encrypt, it's free and automated. We'll use the `certbot` tool, which automates the process of obtaining, installing, and renewing Let's Encrypt certificates.
Step-by-Step SSL/TLS Setup with Certbot:

First, install Certbot from the snap package and make sure the core snap is up to date:
sudo snap install core; sudo snap refresh core
sudo snap install --classic certbot
sudo ln -s /snap/bin/certbot /usr/bin/certbot
The snap package already includes the Certbot NGINX plugin, which allows Certbot to automatically read and modify your NGINX configuration to set up HTTPS. (If you use your distribution's packages instead of snap, the plugin is installed separately with `sudo apt install python3-certbot-nginx`.) Now run Certbot with the NGINX plugin:
sudo certbot --nginx
The tool will guide you through a few prompts: an email address for renewal and security notices, agreement to the terms of service, which domain names to enable HTTPS for, and whether to redirect all HTTP traffic to HTTPS. Certbot also sets up automatic renewal of your certificates; you can confirm that renewal will work with a dry run:
sudo certbot renew --dry-run
After Certbot completes, it will have modified your site's NGINX configuration file. It will add a new `server` block for listening on port 443 (HTTPS) and will include `ssl_certificate` and `ssl_certificate_key` directives pointing to the newly obtained certificate files. It will also handle the redirect from port 80 if you selected that option.
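The result looks roughly like the sketch below; the domain and paths are illustrative, and Certbot tags the lines it adds with "# managed by Certbot" comments:

server {
listen 443 ssl; # managed by Certbot
server_name your_domain.com;
ssl_certificate /etc/letsencrypt/live/your_domain.com/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/your_domain.com/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
# ... your existing location blocks ...
}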
While Certbot provides a good default configuration, you can further harden your server's security by specifying stronger protocols, ciphers, and enabling important security headers. You can create a separate snippet file for these settings to keep your main configuration clean.
Create a file, e.g., /etc/nginx/snippets/ssl-params.conf, and add the following:
# Use modern TLS protocols
ssl_protocols TLSv1.2 TLSv1.3;
# Use a strong set of cipher suites
ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384';
ssl_prefer_server_ciphers off;
# Enable HSTS (HTTP Strict Transport Security)
# Tells browsers to only connect via HTTPS for the next 6 months.
add_header Strict-Transport-Security "max-age=15768000; includeSubDomains; preload" always;
# Other security headers
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
Then, include this snippet in your `server` block for port 443:
server {
listen 443 ssl http2;
server_name your_domain.com;
# ... ssl_certificate and ssl_certificate_key from certbot ...
include /etc/nginx/snippets/ssl-params.conf;
# ... rest of your server configuration ...
}
A firewall is a critical security layer that controls network traffic to and from your server. We'll use UFW (Uncomplicated Firewall), a user-friendly frontend for `iptables`. Set sensible default policies, then explicitly allow SSH (so you don't lock yourself out) and NGINX traffic:
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw allow 'Nginx Full'
The `'Nginx Full'` profile allows traffic on both port 80 (HTTP) and 443 (HTTPS).
Finally, enable the firewall:
sudo ufw enable
Confirm with 'y' when prompted. You can check the status at any time with `sudo ufw status`.
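The status output should look roughly like this (formatting varies slightly between versions):

$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
Nginx Full                 ALLOW       Anywhere
22/tcp (v6)                ALLOW       Anywhere (v6)
Nginx Full (v6)            ALLOW       Anywhere (v6)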
Caching content for logged-in users is powerful but fraught with risk. If you cache a page containing personal user information and serve it to another user, you have a major data breach. The key to doing this safely is to ensure that the cache key is unique for each user.
By default, NGINX's cache key is based on variables like `$scheme`, `$proxy_host`, and `$request_uri`. For a public page, this is fine. But for an authenticated page, this key is the same for all users, which is dangerous.
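Concretely, the default is equivalent to writing:

proxy_cache_key $scheme$proxy_host$request_uri;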
We can customize the cache key using the proxy_cache_key directive to include a user-specific identifier, such as a session cookie.
# Customize the cache key to include the session ID cookie
proxy_cache_key "$scheme$host$request_uri$cookie_sessionid";
With this directive, a request for `/my-account` from a user with `sessionid=abc` will generate a different cache entry than a request from a user with `sessionid=xyz`. This effectively creates a private cache for each user session. While this prevents data leakage, be aware of the implications: you will now store a separate copy of the page for every user who visits it, which can rapidly consume cache storage.
For highly sensitive pages (e.g., "edit payment info"), it's often best to bypass the cache entirely using the techniques we discussed in Section 1:
location /account/billing {
# This content is too sensitive to cache
proxy_cache_bypass 1;
proxy_no_cache 1;
proxy_pass http://127.0.0.1:8080;
# ... other proxy settings ...
}
A balanced approach is to use a unique cache key for personalized but non-sensitive content (like a user's dashboard) and completely bypass the cache for critical sections.
This example combines an SSL configuration from Certbot, our hardened SSL parameters, and a location block that uses a custom cache key for authenticated content.
# Redirect HTTP to HTTPS
server {
listen 80;
server_name example.com www.example.com;
return 301 https://$host$request_uri;
}
server {
listen 443 ssl http2;
server_name example.com www.example.com;
# SSL Certificate paths provided by Certbot
ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
# Include hardened SSL/TLS parameters and security headers
include /etc/nginx/snippets/ssl-params.conf;
root /var/www/html;
index index.html;
location / {
# General caching for anonymous users
proxy_pass http://127.0.0.1:8080;
proxy_cache my_cache;
proxy_cache_valid 200 10m;
# Bypass for authenticated users (identified by sessionid cookie)
proxy_cache_bypass $cookie_sessionid;
proxy_no_cache $cookie_sessionid;
add_header X-Cache-Status $upstream_cache_status;
}
location /dashboard {
# Cache this section per-user
proxy_pass http://127.0.0.1:8080;
proxy_cache my_cache;
proxy_cache_valid 200 5m;
# This is the crucial part for per-user caching
proxy_cache_key "$scheme$host$request_uri$cookie_sessionid";
add_header X-Cache-Status $upstream_cache_status;
}
}
Let's Encrypt, the free Certificate Authority, was founded in 2014 by the Internet Security Research Group (ISRG) with a mission to encrypt the entire web. Before its launch, obtaining an SSL/TLS certificate was often a manual, expensive process, creating a barrier for many website owners. By providing free, automated certificates, Let's Encrypt has been a major driving force behind the massive increase in HTTPS adoption, with over 300 million active certificates, making the web a significantly safer place for everyone.
Having a functional cache is the first step; making it exceptionally fast and resilient is the next. Performance optimization in NGINX caching isn't about a single magic setting. It's about understanding the interplay between memory, disk I/O, and network traffic, and then tuning specific directives to best suit your application's access patterns. Key performance indicators (KPIs) to monitor are the cache hit ratio (the percentage of requests served from the cache), the server's response time (latency), and resource utilization (CPU, memory, disk I/O). Our goal is to maximize the hit ratio while minimizing latency and resource usage.
We introduced `proxy_cache_path` in Section 1, but its parameters have a profound impact on performance that warrants a deeper look.
proxy_cache_path /data/nginx_cache levels=1:2 keys_zone=my_cache:100m max_size=50g inactive=24h use_temp_path=off;
keys_zone sizing: Sizing this correctly is paramount. The shared memory zone holds the metadata for every item in the cache. If this zone fills up, NGINX will be forced to evict items from the cache using an LRU (Least Recently Used) algorithm, even if the items are not expired and you have plenty of disk space left in `max_size`. This can tank your hit ratio. As a rule of thumb, budget 1 MB for every 8,000 keys. If you anticipate caching 500,000 different objects, you would need approximately 500,000 / 8,000 = 62.5 MB, so a `keys_zone` of `64m` or more would be appropriate. Always monitor your error logs for warnings about the cache zone being full.

loader_files, loader_threshold, loader_sleep: These are less common but powerful parameters for managing cache re-population after a server restart. When NGINX starts, it needs to load the cache key metadata from disk back into the `keys_zone`. For a very large cache, this can be an I/O-intensive process that slows down startup. These directives allow you to throttle this loading process. For example, `loader_files=500 loader_threshold=300ms loader_sleep=50ms` tells the "cache loader" process to iterate through the cache, loading 500 files at a time; if an iteration takes longer than 300 ms, it pauses for 50 ms before continuing, preventing it from monopolizing disk I/O. A full declaration using these parameters is sketched below.
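As an illustration only (the values here are examples, not recommendations), such a declaration might look like this:

proxy_cache_path /data/nginx_cache levels=1:2 keys_zone=my_cache:64m max_size=50g inactive=24h use_temp_path=off loader_files=500 loader_threshold=300ms loader_sleep=50ms;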
Once your cache is tuned, you can implement more sophisticated strategies to improve resilience and performance under heavy load.

What happens when your backend server goes down? By default, NGINX will return an error (e.g., `502 Bad Gateway`) to the user. However, with the proxy_cache_use_stale directive, you can configure NGINX to serve an expired (stale) version of the content from its cache if it's unable to get a fresh version from the backend. This is a massive win for user experience and site availability.
proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
This configuration tells NGINX to use a stale item if it encounters a communication error with the backend, a timeout, or one of the specified 5xx error codes. The `updating` parameter is particularly interesting: while one request is already refreshing an expired item, other clients are served the stale version instead of waiting. Combined with `proxy_cache_background_update on;`, NGINX can serve the stale version to a client immediately while it refreshes the cache in the background with a single subrequest, so subsequent requests receive the fresh content once it's available.
Consider a scenario where a very popular but uncached page is requested by thousands of users at once (e.g., after the cache expires). This is known as the "thundering herd" or "cache stampede" problem. All thousands of requests will result in a `MISS`, and NGINX will forward all of them to your backend server simultaneously, potentially overwhelming it.
The proxy_cache_lock directive solves this elegantly. When enabled, if multiple clients request a file that is not in the cache, only the first request is allowed through to the backend server. The other requests are held (they "wait") until that first request has populated the cache. Once the item is in the cache, the waiting requests are all served from the cache. This ensures only one request hits your backend for any given resource at a time.
location /popular-articles/ {
proxy_pass http://backend;
proxy_cache my_cache;
# Enable cache locking
proxy_cache_lock on;
# Set a timeout for how long a request will wait for the lock to be released
proxy_cache_lock_timeout 5s;
}
proxy_cache_lock_timeout is a safety net. If the first request doesn't populate the cache within this time, the waiting requests are passed through to the backend server to prevent them from timing out; their responses are not added to the cache.
Not all content is created equal. Small, frequently accessed HTML files have very different caching characteristics from large, infrequently accessed video files. You can create multiple cache zones to handle them differently. For instance, you could have one small, fast cache on an SSD for API responses and HTML, and another larger cache on a traditional HDD for media assets.
# In http block
proxy_cache_path /mnt/ssd/nginx_cache levels=1:2 keys_zone=fast_cache:50m max_size=5g;
proxy_cache_path /mnt/hdd/nginx_cache levels=1:2 keys_zone=large_cache:20m max_size=500g inactive=30d;
# In server block
server {
# ...
location ~ \.(html|json)$ {
proxy_pass http://backend;
proxy_cache fast_cache;
proxy_cache_valid 200 5m;
}
location ~ \.(mp4|zip|iso)$ {
proxy_pass http://backend;
proxy_cache large_cache;
proxy_cache_valid 200 7d;
}
}
You cannot optimize what you cannot measure. Enhancing your logs to include cache status is the first step. You can define a custom log format that includes the `$upstream_cache_status` variable.
In your `http` block in `nginx.conf`:
log_format cache_log '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'Cache-Status: $upstream_cache_status';
Then, use this format in your `server` block's `access_log` directive:
access_log /var/log/nginx/access.log cache_log;
With this in place, you can easily analyze your logs with tools like `grep`, `awk`, or dedicated log analysis software to calculate your cache hit ratio. For real-time monitoring, tools like Netdata or a combination of Prometheus with the `nginx-prometheus-exporter` can provide detailed dashboards showing hit/miss ratios, cache sizes, and other critical metrics.
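As a minimal sketch, assuming the `cache_log` format above and the default access log path, a quick hit-ratio estimate can be pulled straight from the log with standard shell tools:

# Count occurrences of each cache status (the status is the last field in cache_log)
awk '{print $NF}' /var/log/nginx/access.log | sort | uniq -c | sort -rn

# Rough hit ratio: HITs as a percentage of all requests that passed through the cache
awk '$NF == "HIT" {hit++} $NF != "-" {total++} END {if (total) printf "Hit ratio: %.1f%%\n", 100*hit/total}' /var/log/nginx/access.log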
This configuration snippet demonstrates a combination of advanced techniques: serving stale content, cache locking, and a custom log format for monitoring.
In `/etc/nginx/nginx.conf` (inside the `http` block):
# Define a custom log format that includes the cache status
log_format cache_log_format '$remote_addr - [$time_local] "$request" $status '
'($body_bytes_sent) "$http_referer" "$http_user_agent" '
'Cache: $upstream_cache_status';
# Define the cache path and parameters
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m;
In `/etc/nginx/sites-available/default`:
server {
listen 80;
server_name example.com;
# Use the custom log format
access_log /var/log/nginx/access.log cache_log_format;
location / {
proxy_pass http://127.0.0.1:8080;
proxy_cache my_cache;
# Performance and Resilience Settings
proxy_cache_lock on; # Prevent thundering herd
proxy_cache_lock_timeout 5s;
# Serve stale content if backend is down or slow
proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
# Update cache in the background while serving stale content
proxy_cache_background_update on;
proxy_cache_valid 200 302 10m;
proxy_cache_valid 404 1m;
add_header X-Cache-Status $upstream_cache_status;
}
}
The "thundering herd problem" is not unique to web caches. It's a classic computer science problem that occurs in any system where multiple processes or threads wait for an event. When the event occurs, they all "wake up" and stampede towards the same resource, overwhelming it. This can happen with database connections, file locks, and network sockets. Cache locking in NGINX is a specific and highly effective implementation of a general solution pattern known as "request coalescing" or "request collapsing."
Jainandunsing, K. (2025). Caching Servers Hardware Requirements & Software Configurations. (Version 1.0). [Internal Course Document].
NGINX, Inc. (2023). NGINX Docs: Module ngx_http_proxy_module. Retrieved from https://nginx.org/en/docs/http/ngx_http_proxy_module.html
Sysoev, I. (2004). [nginx-devel] Nginx-0.1.0. Mail-Archive.com. Retrieved from https://www.mail-archive.com/nginx-devel@sysoev.ru/msg00000.html