Caching Strategies
Caching trades memory for speed: instead of recomputing a result or hitting a slow database on every request, you store the answer somewhere fast and serve it again. A well-placed cache can cut response times from hundreds of milliseconds to single digits and shield downstream systems from load. The catch is that cached data goes stale, and deciding when to refresh or evict it is one of the genuinely hard problems in computing.
Caching layers
Caches exist at several levels, and a mature Node.js service usually combines a few of them. Each trades speed against capacity and sharing: an in-process cache is the fastest but the smallest and least shared, while Redis is slower but visible to every instance.
| Layer | Scope | Latency | Survives restart | Shared across instances |
|---|---|---|---|---|
| In-process (Map / LRU) | Single process | ~0.001 ms | No | No |
| Distributed (Redis) | Whole cluster | ~1 ms | Yes (configurable) | Yes |
| HTTP / CDN | Client & edge | varies | Yes | Yes |
The rule of thumb is to cache as close to the consumer as the data’s freshness requirements allow. Per-process caching is unbeatably fast but duplicates data and cannot be invalidated across instances; Redis is shared and durable but adds a network hop.
In-memory caching
The simplest cache is a plain Map. It works, but it grows without bound and never expires entries, so memory use climbs for as long as the process runs. For anything real, bound the size with an LRU (least-recently-used) cache, which evicts the entries that have gone longest without being accessed, and give entries a time-to-live (TTL).
import { LRUCache } from "lru-cache";
const cache = new LRUCache({
  max: 5000, // hard cap on entries
  ttl: 1000 * 60 * 5, // entries expire after 5 minutes
  updateAgeOnGet: true, // each get() resets the entry's TTL clock
});
cache.set("user:42", { id: 42, name: "Ada" });
console.log(cache.get("user:42"));
console.log(cache.get("user:99")); // not present
Output:
{ id: 42, name: 'Ada' }
undefined
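To make the eviction rule concrete, here is a minimal sketch of LRU behaviour built on the insertion order of a plain Map. TinyLRU is a hypothetical name used only for illustration; it has no TTL support and is not how lru-cache is implemented internally.

```javascript
// Illustrative only: a tiny LRU built on Map's insertion order.
// TinyLRU is a hypothetical class, not part of the lru-cache package.
class TinyLRU {
  constructor(max) {
    this.max = max;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key); // no-op if absent; keeps order correct on overwrite
    this.map.set(key, value);
    if (this.map.size > this.max) {
      // The first key in iteration order is the least recently used.
      const lruKey = this.map.keys().next().value;
      this.map.delete(lruKey);
    }
  }
}

const lru = new TinyLRU(2);
lru.set("a", 1);
lru.set("b", 2);
lru.get("a"); // "a" is now more recent than "b"
lru.set("c", 3); // over capacity: evicts "b", not "a"
```

Note that "b" is evicted even though "a" was inserted first, which is the difference between least-recently-used and oldest-first eviction.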
In-process caches are invisible to other instances. If one server updates a record and evicts its local copy, the other servers keep serving stale data until their own TTLs expire. Use short TTLs for per-process caches, or move shared state to Redis.
Distributed caching with Redis
Redis gives every instance a single shared cache with built-in expiry. The SET command’s EX option attaches a TTL in seconds, so expired keys are reclaimed automatically without any bookkeeping on your side.
import { createClient } from "redis";
const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();
// Store JSON with a 10-minute TTL
await redis.set("product:7", JSON.stringify({ id: 7, price: 19.99 }), {
  EX: 600,
});
const raw = await redis.get("product:7");
const product = raw ? JSON.parse(raw) : null;
console.log(product);
Output:
{ id: 7, price: 19.99 }
Because Redis is a separate process (often on another host), treat every call as fallible: a cache that is down should slow you down, not take you down. Wrap reads in try/catch and fall through to the source of truth on error.
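One way to apply that advice is a small wrapper around cache reads. The sketch below assumes a client with an async get(key) method, as node-redis provides; getWithFallback and loadFromSource are hypothetical names, not part of any library.

```javascript
// Hypothetical helper: read through a cache client, but never let a cache
// failure fail the request. `client` is assumed to expose an async get(key)
// like node-redis; `loadFromSource` is whatever reads the source of truth.
async function getWithFallback(client, key, loadFromSource) {
  try {
    const raw = await client.get(key);
    if (raw !== null && raw !== undefined) return JSON.parse(raw);
  } catch (err) {
    // Cache is unreachable or returned unparseable data: log and fall through.
    console.warn(`cache read failed for ${key}: ${err.message}`);
  }
  return loadFromSource();
}
```

The helper deliberately does not write back to the cache, so it can be combined with whatever population strategy the caller uses. In a real service a client-side timeout is also worth considering, so that a hung connection degrades the same way as a refused one.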
The cache-aside pattern
The most common strategy is cache-aside (also called lazy loading): the application checks the cache first, and on a miss it loads from the database, populates the cache, and returns. The cache only ever holds data that was actually requested.
async function getUser(id) {
  const key = `user:${id}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached); // cache hit

  // Assumes `db` is a node-postgres-style client whose query() resolves to { rows }
  const { rows } = await db.query("SELECT * FROM users WHERE id = $1", [id]);
  const user = rows[0] ?? null;
  if (user) {
    await redis.set(key, JSON.stringify(user), { EX: 300 });
  }
  return user; // cache miss, now warmed
}
This keeps the cache and database loosely coupled, and the application can keep serving from the database during a cache outage as long as cache errors are caught. Its weaknesses are that the first request after expiry always pays full latency and that, under heavy concurrency, many requests can stampede the database for the same missing key. A short lock or a “single-flight” wrapper that deduplicates concurrent loads mitigates the stampede.
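A single-flight wrapper can be sketched in a few lines by sharing one in-flight promise per key. The names singleFlight and inFlight are hypothetical, and this version only deduplicates within one process; coordinating across instances would need something like a short-lived lock in Redis.

```javascript
// Hypothetical single-flight helper: concurrent callers asking for the same
// key share one in-flight promise instead of each hitting the database.
const inFlight = new Map();

function singleFlight(key, loader) {
  const pending = inFlight.get(key);
  if (pending) return pending;

  const promise = Promise.resolve()
    .then(loader) // also turns a synchronous throw into a rejection
    .finally(() => inFlight.delete(key)); // allow a fresh load next time

  inFlight.set(key, promise);
  return promise;
}
```

In the cache-aside function above, the database query and the cache write would go inside the loader, for example singleFlight(key, () => loadAndCacheUser(id)), where loadAndCacheUser stands in for that miss path. If the loader rejects, every waiting caller sees the same rejection and the next call retries.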
HTTP caching
For responses served over HTTP, let the protocol do the work. The Cache-Control header tells browsers and CDNs how long a response may be reused, and ETag enables cheap revalidation: the client sends If-None-Match, and the server replies 304 Not Modified with an empty body when nothing changed.
import express from "express";
const app = express();
app.get("/api/config", (req, res) => {
  res.set("Cache-Control", "public, max-age=60, stale-while-revalidate=30");
  res.json({ theme: "dark", version: "18" });
});
stale-while-revalidate lets caches serve a slightly stale response instantly while fetching a fresh one in the background, which removes the latency spike at expiry.
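The ETag revalidation flow described earlier can be sketched as a pure function, which keeps the decision logic visible without a framework. Using the config version as the validator is an assumption made for this example, and matchesEtag and respondToConfigRequest are hypothetical names.

```javascript
// Sketch of ETag revalidation. Any validator that changes whenever the body
// changes would work; the config version is used here as an assumption.
function matchesEtag(ifNoneMatch, etag) {
  if (!ifNoneMatch) return false;
  if (ifNoneMatch.trim() === "*") return true;
  // If-None-Match uses weak comparison: ignore the W/ prefix on both sides.
  const strip = (tag) => tag.trim().replace(/^W\//, "");
  // Simplification: assumes the tags themselves contain no commas.
  return ifNoneMatch.split(",").some((tag) => strip(tag) === strip(etag));
}

function respondToConfigRequest(requestHeaders, config) {
  const etag = `"config-${config.version}"`;
  const headers = { ETag: etag, "Cache-Control": "public, max-age=60" };
  if (matchesEtag(requestHeaders["if-none-match"], etag)) {
    return { status: 304, headers, body: null }; // client copy is still valid
  }
  return { status: 200, headers, body: JSON.stringify(config) };
}
```

Express generates an ETag for res.json responses and answers matching conditional requests with 304 by default, so hand-written logic like this is mostly useful when the validator should come from application data, such as a version or an updated-at timestamp, rather than from a hash of the body.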
Cache invalidation
Invalidation is the hard part. TTLs handle the easy case: data that may safely be stale for a bounded window simply expires. For data that must be fresh immediately after a write, invalidate explicitly by deleting the key when the underlying record changes.
async function updateUser(id, changes) {
  await db.update("users", id, changes);
  await redis.del(`user:${id}`); // drop the stale entry
}
| Strategy | When to use | Trade-off |
|---|---|---|
| TTL expiry | Tolerable staleness window | Simple, but serves stale data until expiry |
| Delete on write | Must be fresh after writes | Couples writes to cache; misses on next read |
| Tag / prefix purge | Many keys depend on one entity | Needs key bookkeeping or SCAN |
| Versioned keys | Avoid deletes entirely | Old versions linger until evicted |
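The versioned-keys row is the least obvious of the four, so here is a rough sketch of one way it can work. It assumes a client exposing async get, set and incr methods, as node-redis does; the key layout ("user:42:ver", "user:42:v3") and all function names are made up for illustration.

```javascript
// Sketch of versioned keys. `client` is assumed to expose async get, set and
// incr like node-redis; the key layout is hypothetical.
async function versionedKey(client, entity) {
  const version = (await client.get(`${entity}:ver`)) ?? "0";
  return `${entity}:v${version}`;
}

async function readVersioned(client, entity) {
  const raw = await client.get(await versionedKey(client, entity));
  return raw ? JSON.parse(raw) : null;
}

async function writeVersioned(client, entity, value, ttlSeconds = 300) {
  const key = await versionedKey(client, entity);
  await client.set(key, JSON.stringify(value), { EX: ttlSeconds });
}

// "Invalidate" by bumping the version: old keys are never read again and
// simply age out through their TTL.
async function invalidate(client, entity) {
  await client.incr(`${entity}:ver`);
}
```

The cost of this design is an extra round trip per read to fetch the current version, plus the memory held by superseded entries until they expire.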
Prefer expiring keys over deleting them where correctness allows. A missed deletion on a key with no TTL serves stale data indefinitely, whereas a TTL guarantees that the cache eventually heals itself.
Best Practices
- Always set a TTL. An unbounded cache is a memory leak waiting to happen.
- Bound in-process caches by entry count with an LRU; never let a raw Map grow forever.
- Treat the cache as optional: on any cache error, fall back to the source of truth instead of failing the request.
- Cache the right granularity — whole serialized objects, not chatty per-field lookups.
- Deduplicate concurrent misses (single-flight) to avoid stampeding the database on popular keys.
- Invalidate on write for must-be-fresh data, and lean on short TTLs everywhere else.
- Measure your hit ratio; as a rough rule of thumb, a cache below ~80% hits may cost more in complexity and memory than it saves.
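Measuring the hit ratio can start as simply as counting hits and misses at the point where the cache is read. The HitRatio class below is a hypothetical in-memory counter; a production service would more likely export these counts to its metrics system.

```javascript
// Hypothetical counter for tracking cache hit ratio in one process.
class HitRatio {
  hits = 0;
  misses = 0;

  record(wasHit) {
    if (wasHit) {
      this.hits += 1;
    } else {
      this.misses += 1;
    }
  }

  // Fraction of lookups served from cache; 0 when nothing was recorded yet.
  get ratio() {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

const stats = new HitRatio();
stats.record(true);
stats.record(true);
stats.record(true);
stats.record(false);
console.log(stats.ratio); // 0.75
```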