Caching Strategies

Caching trades memory for speed: instead of recomputing a result or hitting a slow database on every request, you store the answer somewhere fast and serve it again. A well-placed cache can cut response times from hundreds of milliseconds to single digits and shield downstream systems from load. The catch is that cached data goes stale, and deciding when to refresh or evict it is one of the genuinely hard problems in computing.

Caching layers

Caches exist at several levels, and a mature Node.js service usually combines a few of them. The closer a layer sits to your code, the faster it is, but also the smaller and the less widely shared.

| Layer | Scope | Latency | Survives restart | Shared across instances |
| --- | --- | --- | --- | --- |
| In-process (Map / LRU) | Single process | ~0.001 ms | No | No |
| Distributed (Redis) | Whole cluster | ~1 ms | Yes (configurable) | Yes |
| HTTP / CDN | Client & edge | varies | Yes | Yes |

The rule of thumb is to cache as close to the consumer as the data’s freshness requirements allow. Per-process caching is unbeatably fast but duplicates data and cannot be invalidated across instances; Redis is shared and can survive restarts when persistence is enabled, but every lookup costs a network hop.
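
As a rough illustration of combining layers, the sketch below checks a small in-process Map first, then a shared store, and only then the source of truth. The names `createTieredCache`, `createFakeSharedStore` and `loadFromSource` are hypothetical, and the Map-backed shared store is only a stand-in so the example runs without Redis.

```javascript
// Two-tier read: in-process Map first, then a shared store, then the source.
// `shared` is any object with async get(key) / set(key, value).
// Note: the local Map here is unbounded; real code should cap it (e.g. an LRU).
function createTieredCache({ shared, loadFromSource, localTtlMs = 5000 }) {
  const local = new Map(); // key -> { value, expiresAt }

  return async function get(key) {
    const hit = local.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value; // tier 1: in-process

    let value = await shared.get(key); // tier 2: shared cache
    if (value == null) {
      value = await loadFromSource(key); // source of truth
      await shared.set(key, value);
    }
    local.set(key, { value, expiresAt: Date.now() + localTtlMs });
    return value;
  };
}

// Hypothetical stand-in for a shared cache such as Redis.
function createFakeSharedStore() {
  const data = new Map();
  return {
    get: async (key) => data.get(key),
    set: async (key, value) => void data.set(key, value),
  };
}
```

The short local TTL bounds how long one instance can keep serving a value after another instance has changed it.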

In-memory caching

The simplest cache is a plain Map. It works, but it grows without bound and never expires entries, which leaks memory. For anything real, bound the size with an LRU (least-recently-used) cache, which evicts the entries that have gone longest without being accessed and supports a time-to-live.

import { LRUCache } from "lru-cache";

const cache = new LRUCache({
  max: 5000,            // hard cap on entries
  ttl: 1000 * 60 * 5,   // entries expire 5 minutes after they were last set or read
  updateAgeOnGet: true, // reading an entry resets its TTL clock
});

cache.set("user:42", { id: 42, name: "Ada" });
console.log(cache.get("user:42"));
console.log(cache.get("user:99")); // not present

Output:

{ id: 42, name: 'Ada' }
undefined
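
To make the eviction behaviour concrete, here is a minimal hand-rolled LRU with a TTL, built on the fact that a Map iterates its keys in insertion order. It is only an illustration of the idea (the class name `TinyLru` is made up) and is not how lru-cache is implemented; prefer the library in real code.

```javascript
// Minimal LRU + TTL cache built on Map's insertion order (oldest key first).
class TinyLru {
  constructor({ max, ttlMs }) {
    this.max = max;
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // expired: dropped lazily on read
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value, now = Date.now()) {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.entries.size > this.max) {
      // The first key in iteration order is the least recently used.
      const lruKey = this.entries.keys().next().value;
      this.entries.delete(lruKey);
    }
  }
}

const tiny = new TinyLru({ max: 2, ttlMs: 1000 });
tiny.set("a", 1);
tiny.set("b", 2);
tiny.get("a");    // "a" is now more recent than "b"
tiny.set("c", 3); // over capacity: evicts "b", the least recently used
```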

In-process caches are invisible to other instances. If one server updates a record and evicts its local copy, the other servers keep serving stale data until their own TTLs expire. Use short TTLs for per-process caches (and be wary of refreshing the TTL on every read, since a frequently read key may then never expire), or move shared state to Redis.

Distributed caching with Redis

Redis gives every instance a single shared cache with built-in expiry. The SET command’s EX option attaches a TTL in seconds, so expired keys are reclaimed automatically without any bookkeeping on your side.

import { createClient } from "redis";

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

// Store JSON with a 10-minute TTL
await redis.set("product:7", JSON.stringify({ id: 7, price: 19.99 }), {
  EX: 600,
});

const raw = await redis.get("product:7");
const product = raw ? JSON.parse(raw) : null;
console.log(product);

Output:

{ id: 7, price: 19.99 }

Because Redis is a separate process (often on another host), treat every call as fallible: a cache that is down should slow you down, not take you down. Wrap reads in try/catch and fall through to the source of truth on error.
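
One way to express that fall-through is sketched below. `getWithFallback` and `loadFromSource` are hypothetical names; `cache` can be any client exposing an async `get(key)` that resolves to a JSON string or null, which is how the redis client above behaves.

```javascript
// Fail-open cache read: any cache error is treated as a miss.
async function getWithFallback(cache, key, loadFromSource) {
  try {
    const raw = await cache.get(key);
    if (raw != null) return JSON.parse(raw); // cache hit
  } catch (err) {
    // Cache unreachable or entry corrupt: log it and carry on to the source.
    console.warn(`cache read failed for ${key}: ${err.message}`);
  }
  return loadFromSource(key);
}
```

The request gets slower when the cache is unavailable, but it still succeeds. In production you would likely also set connection and command timeouts so a hung cache cannot stall the request.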

The cache-aside pattern

The most common strategy is cache-aside (also called lazy loading): the application checks the cache first, and on a miss it loads from the database, populates the cache, and returns. The cache only ever holds data that was actually requested.

async function getUser(id) {
  const key = `user:${id}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached); // cache hit

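  // Assumes a node-postgres-style client: query() resolves to { rows: [...] }
  const { rows } = await db.query("SELECT * FROM users WHERE id = $1", [id]);
  const user = rows[0] ?? null;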
  if (user) {
    await redis.set(key, JSON.stringify(user), { EX: 300 });
  }
  return user; // cache miss, now warmed
}

This keeps the cache and database loosely coupled and, as long as cache errors are caught and treated as misses, resilient to cache outages. Its weakness is that the first request after expiry always pays full latency, and under heavy concurrency many requests can stampede the database for the same missing key. A short lock or a “single-flight” wrapper that deduplicates concurrent loads mitigates the stampede.
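
A single-flight wrapper can be quite small. The sketch below is one possible shape, with `singleFlight` and `inFlight` as illustrative names; it deduplicates loads within one Node.js process only, so several instances can still each issue one load for the same key.

```javascript
// Single-flight: concurrent callers for the same key share one in-flight load.
const inFlight = new Map(); // key -> Promise of the load in progress

function singleFlight(key, load) {
  const pending = inFlight.get(key);
  if (pending) return pending; // join the load that is already running

  const promise = Promise.resolve()
    .then(load)
    .finally(() => inFlight.delete(key)); // allow a fresh load next time
  inFlight.set(key, promise);
  return promise;
}
```

In a cache-aside function, the miss path would call something like `singleFlight(key, () => loadAndCache(id))`, where `loadAndCache` is whatever queries the database and populates the cache. If the load rejects, every waiting caller sees the same rejection.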

HTTP caching

For responses served over HTTP, let the protocol do the work. The Cache-Control header tells browsers and CDNs how long a response may be reused, and ETag enables cheap revalidation: the client sends If-None-Match, and the server replies 304 Not Modified with an empty body when nothing changed.

import express from "express";

const app = express();

app.get("/api/config", (req, res) => {
  res.set("Cache-Control", "public, max-age=60, stale-while-revalidate=30");
  res.json({ theme: "dark", version: "18" });
});

stale-while-revalidate lets caches serve a slightly stale response instantly while fetching a fresh one in the background, which removes the latency spike at expiry.
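
The ETag revalidation step described earlier boils down to a small comparison. The framework-free sketch below shows the decision; `conditionalResponse` is a hypothetical helper, and header names are assumed to be lower-cased as Node.js delivers them. Express normally performs this check for you when you call `res.json` or `res.send` with ETags enabled, so treat this as an explanation rather than something you need to write.

```javascript
// Decide between 200 and 304 for a resource whose current ETag is known.
function conditionalResponse(requestHeaders, currentEtag, body) {
  const ifNoneMatch = requestHeaders["if-none-match"];
  if (ifNoneMatch) {
    const candidates = ifNoneMatch.split(",").map((tag) => tag.trim());
    // If-None-Match uses weak comparison: ignore a leading W/ on either side.
    const strip = (tag) => tag.replace(/^W\//, "");
    const matches =
      candidates.includes("*") ||
      candidates.some((tag) => strip(tag) === strip(currentEtag));
    if (matches) {
      // Client copy is still valid: no body, just confirm the validator.
      return { status: 304, headers: { ETag: currentEtag }, body: null };
    }
  }
  return { status: 200, headers: { ETag: currentEtag }, body };
}

const etag = '"v18"';
conditionalResponse({ "if-none-match": '"v18"' }, etag, "{}").status; // 304
conditionalResponse({ "if-none-match": '"v17"' }, etag, "{}").status; // 200
```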

Cache invalidation

Invalidation is the hard part. TTLs handle the easy case: data that may safely be stale for a bounded window simply expires. For data that must be fresh immediately after a write, invalidate explicitly by deleting the key when the underlying record changes.

async function updateUser(id, changes) {
  await db.update("users", id, changes);
  await redis.del(`user:${id}`); // drop the stale entry
}

| Strategy | When to use | Trade-off |
| --- | --- | --- |
| TTL expiry | Tolerable staleness window | Simple, but serves stale data until expiry |
| Delete on write | Must be fresh after writes | Couples writes to cache; misses on next read |
| Tag / prefix purge | Many keys depend on one entity | Needs key bookkeeping or SCAN |
| Versioned keys | Avoid deletes entirely | Old versions linger until evicted |

Prefer expiring keys over deleting them where correctness allows. A missed deletion serves stale data indefinitely, whereas a TTL guarantees the cache eventually heals itself.
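
The versioned-keys strategy is the least obvious of these, so here is a minimal sketch of it. The helper names are hypothetical and a Map stands in for the shared cache; with Redis, the version counter could live under its own key and be bumped with INCR.

```javascript
// Versioned keys: bump a per-entity version instead of deleting cached entries.
const store = new Map(); // stand-in for a shared cache

function currentVersion(entity, id) {
  return store.get(`${entity}:${id}:version`) ?? 0;
}

function versionedKey(entity, id) {
  return `${entity}:${id}:v${currentVersion(entity, id)}`;
}

function bumpVersion(entity, id) {
  store.set(`${entity}:${id}:version`, currentVersion(entity, id) + 1);
}

store.set(versionedKey("user", 42), { id: 42, name: "Ada" }); // cached under v0
bumpVersion("user", 42);             // a write happened
store.get(versionedKey("user", 42)); // undefined: readers now look under v1
```

The v0 entry is never deleted, only orphaned, which is why this approach relies on a TTL or a size-bounded cache to reclaim old versions.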

Best practices

  • Always set a TTL. An unbounded cache is a memory leak waiting to happen.
  • Bound in-process caches by entry count with an LRU; never let a raw Map grow forever.
  • Treat the cache as optional: on any cache error, fall back to the source of truth instead of failing the request.
  • Cache at the right granularity: whole serialized objects, not chatty per-field lookups.
  • Deduplicate concurrent misses (single-flight) to avoid stampeding the database on popular keys.
  • Invalidate on write for must-be-fresh data, and lean on short TTLs everywhere else.
  • Measure your hit ratio; as a rough rule of thumb, a cache with fewer than ~80% hits often costs more than it saves.
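
Measuring the hit ratio does not need much machinery. The sketch below wraps any cache that has synchronous `get` and `set` methods (a Map here, but an LRU instance has the same shape) and counts hits and misses; `withHitStats` is a hypothetical helper, and it assumes `undefined` means a miss.

```javascript
// Count hits and misses around a synchronous get() to derive a hit ratio.
function withHitStats(cache) {
  const stats = { hits: 0, misses: 0 };
  return {
    get(key) {
      const value = cache.get(key);
      if (value === undefined) stats.misses++;
      else stats.hits++;
      return value;
    },
    set: (key, value) => cache.set(key, value),
    hitRatio() {
      const total = stats.hits + stats.misses;
      return total === 0 ? 0 : stats.hits / total;
    },
  };
}

const measured = withHitStats(new Map());
measured.set("a", 1);
measured.get("a"); // hit
measured.get("a"); // hit
measured.get("b"); // miss
console.log(measured.hitRatio().toFixed(2)); // 0.67
```

In a real service you would more likely export the two counters to your metrics system and compute the ratio there.
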
Last updated June 14, 2026