Performance Best Practices
Express is fast by default, but a production app lives or dies by what you put around it. Most latency and wasted CPU come from a handful of avoidable mistakes: shipping uncompressed responses, recomputing results on every request, opening a fresh database connection per query, blocking the event loop, and forcing Node to do work a reverse proxy or CDN could do for free. This page covers the practices that have the largest impact on throughput and tail latency.
Compress responses
Text-based responses (JSON, HTML, CSS, JS) compress extremely well. Enabling gzip or Brotli typically cuts payload size by 70-90%, and on slow connections transfer time dominates response time, so the saving is felt directly. Use the compression middleware and mount it early: middleware only applies to routes registered after it.
import express from "express";
import compression from "compression";

const app = express();

// Compress responses above the threshold; skip if the client opts out via a header
app.use(
  compression({
    threshold: 1024, // only compress responses larger than 1 KB
    filter: (req, res) => {
      if (req.headers["x-no-compression"]) return false;
      return compression.filter(req, res); // fall back to the default filter
    },
  })
);
In high-traffic deployments, prefer offloading compression to your reverse proxy (Nginx/Cloudflare). Compressing in Node consumes CPU that could be serving requests, and proxies do it more efficiently.
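To get a feel for the savings quoted above, the following standalone script compresses a synthetic JSON payload with Node's built-in zlib module. It is only a rough sketch: the product records are invented for illustration and are highly repetitive, so real responses will usually compress somewhat less well.

```javascript
import { gzipSync, brotliCompressSync } from "node:zlib";

// Hypothetical payload: 500 product records shaped like a typical JSON API response.
const products = Array.from({ length: 500 }, (_, i) => ({
  id: i,
  name: `Product ${i}`,
  description: "A sample description that repeats across records.",
  inStock: i % 2 === 0,
}));
const body = Buffer.from(JSON.stringify(products));

// Sync variants are acceptable in a one-off script; avoid them in a request path.
const gzipped = gzipSync(body);
const brotli = brotliCompressSync(body);

// Percentage of bytes removed by compression.
const saved = (compressed) =>
  (100 * (1 - compressed.length / body.length)).toFixed(1);

console.log(`original: ${body.length} bytes`);
console.log(`gzip:     ${gzipped.length} bytes (${saved(gzipped)}% smaller)`);
console.log(`brotli:   ${brotli.length} bytes (${saved(brotli)}% smaller)`);
```

Running the same measurement against a captured response from your own API gives a more honest number than any synthetic payload.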
Cache expensive work
Caching is usually the highest-leverage optimization. Cache at three levels: HTTP responses (so clients and CDNs skip the round trip), computed values (so you avoid recomputing), and database reads (so you avoid the query entirely).
Set explicit cache headers on responses that are safe to reuse:
app.get("/api/products/:id", async (req, res) => {
  const product = await getProduct(req.params.id);
  // Browsers may reuse the response for 5 minutes, shared caches (CDNs) for 10
  res.set("Cache-Control", "public, max-age=300, s-maxage=600");
  res.json(product);
});
For server-side caching, an in-memory store works for a single instance; use Redis when you run multiple instances so the cache is shared.
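import { createClient } from "redis";

const redis = createClient();
redis.on("error", (err) => console.error("Redis error:", err));
await redis.connect();

// Cache-aside: try Redis first, fall back to the database, then populate the cache.
// `db` is the application's data-access layer (not shown).
async function getProductCached(id) {
  const key = `product:${id}`;
  const hit = await redis.get(key);
  if (hit) return JSON.parse(hit);

  const product = await db.products.findById(id);
  // Skip caching misses: JSON.stringify(undefined) is not a valid Redis value
  if (product) await redis.set(key, JSON.stringify(product), { EX: 300 }); // expire after 5 minutes
  return product;
}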
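For the single-instance case mentioned above, an in-memory store can be as small as a Map with expiry timestamps. The sketch below is one possible shape, not a library API: TtlCache, getProductCachedLocal, and the loadProduct callback are hypothetical names introduced for illustration.

```javascript
// Minimal in-memory TTL cache for a single process (illustrative helper).
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

const productCache = new TtlCache(300_000); // 5 minutes, matching the Redis TTL above

// loadProduct stands in for the real database call.
async function getProductCachedLocal(id, loadProduct) {
  const key = `product:${id}`;
  const hit = productCache.get(key);
  if (hit !== undefined) return hit;

  const product = await loadProduct(id);
  if (product !== undefined) productCache.set(key, product);
  return product;
}
```

This version never bounds its size, so it only suits a small, known key space. For anything larger, a bounded store such as lru-cache is the safer choice.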
| Cache layer | Stores | Best for | Tool |
|---|---|---|---|
| Client / CDN | HTTP responses | Public, cacheable GETs | Cache-Control, CDN |
| Application | Computed values | Hot paths, shared state | Redis, lru-cache |
| Database | Query results | Repeated reads | Redis, query cache |
Pool database connections
Opening a TCP connection and authenticating per query is slow and exhausts the database’s connection limit under load. Create one pool at startup and reuse it for every request.
import pg from "pg";

const pool = new pg.Pool({
  host: process.env.DB_HOST,
  max: 20, // tune to your DB's connection limit
  idleTimeoutMillis: 30_000,
  connectionTimeoutMillis: 2_000,
});

app.get("/api/users", async (req, res, next) => {
  try {
    const { rows } = await pool.query("SELECT id, name FROM users LIMIT 100");
    res.json(rows);
  } catch (err) {
    next(err);
  }
});
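To make the idea of a pool concrete, here is a teaching sketch of the core mechanism: a fixed number of resources, reused across callers, with extra callers queued until one is released. SimplePool is a hypothetical class, not how pg.Pool is implemented; a real pool also handles timeouts, broken connections, and idle eviction.

```javascript
// Illustrative pool: at most `max` resources exist; extra callers wait in a queue.
class SimplePool {
  constructor(create, max) {
    this.create = create; // async factory, e.g. opens a connection
    this.max = max;
    this.idle = [];       // resources ready for reuse
    this.waiters = [];    // resolve callbacks of callers waiting for a resource
    this.size = 0;        // resources created (or being created) so far
  }

  async acquire() {
    if (this.idle.length > 0) return this.idle.pop();
    if (this.size < this.max) {
      this.size += 1;
      try {
        return await this.create();
      } catch (err) {
        this.size -= 1; // creation failed, free the slot
        throw err;
      }
    }
    // Pool is exhausted: wait until release() hands over a resource.
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  release(resource) {
    const next = this.waiters.shift();
    if (next) next(resource); // hand straight to the next waiter
    else this.idle.push(resource);
  }

  // Run fn with a resource and always return it to the pool.
  async use(fn) {
    const resource = await this.acquire();
    try {
      return await fn(resource);
    } finally {
      this.release(resource);
    }
  }
}
```

The try/finally in use() is the part worth copying into real code: a connection that is never released is lost to the pool for good.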
Keep the event loop free
Node runs your JavaScript on a single thread. Any synchronous work — large JSON parsing, crypto, image processing, or a tight loop — blocks every other request until it finishes. Keep handlers async and never call the *Sync variants of fs or crypto in a request path.
import { readFile } from "node:fs/promises";

// Good: non-blocking
app.get("/config", async (req, res, next) => {
  try {
    const data = await readFile("./config.json", "utf8");
    res.type("json").send(data);
  } catch (err) {
    next(err);
  }
});
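One way to observe the blocking effect described in this section is Node's built-in event-loop delay monitor. The sketch below simulates synchronous work with a busy-wait and reports the worst delay seen; blockFor and maxLoopDelayMs are hypothetical helpers, and the 200 ms figure is an arbitrary choice for the demonstration.

```javascript
import { monitorEventLoopDelay } from "node:perf_hooks";
import { setTimeout as sleep } from "node:timers/promises";

// Busy-wait to simulate synchronous work such as parsing a large JSON document.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread
  }
}

// Measure the worst event-loop delay observed while `work` runs.
async function maxLoopDelayMs(work) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  await sleep(50); // let the monitor take a few baseline samples
  work();
  await sleep(50); // give the monitor a chance to record the stall
  histogram.disable();
  return histogram.max / 1e6; // histogram values are in nanoseconds
}

const delay = await maxLoopDelayMs(() => blockFor(200));
console.log(`max event-loop delay: ${delay.toFixed(0)} ms`);
```

While the busy-wait runs, every other request in the process waits at least as long. Sampling the same histogram in production is a cheap signal that a handler is doing too much synchronous work.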
For genuinely CPU-heavy tasks (PDF rendering, image resizing, hashing), move them off the main thread with a worker pool or a background job queue. Separately, Express 5 simplifies async handlers: a rejected promise returned from a handler is forwarded to your error middleware automatically, so the try/catch wrappers that call next(err) are only required on Express 4.
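As a minimal sketch of moving CPU-bound work off the main thread, the example below runs a key-derivation hash in a worker thread using node:worker_threads. The hashInWorker helper, the inline worker source, and the salt value are assumptions made for illustration. It spawns one worker per call to stay short, whereas a production setup would more likely reuse workers through a pool.

```javascript
import { Worker } from "node:worker_threads";

// Worker body as a string so the example fits in one file; a real app would
// usually keep this in its own module. Code run with `eval: true` is CommonJS.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  const { pbkdf2Sync } = require("node:crypto");
  // Blocking here is fine: this thread serves no requests.
  const { password, salt } = workerData;
  const key = pbkdf2Sync(password, salt, 100000, 32, "sha256");
  parentPort.postMessage(key.toString("hex"));
`;

// Hash a password on a separate thread and resolve with the hex digest.
function hashInWorker(password, salt) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { password, salt },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

const hex = await hashInWorker("correct horse", "hypothetical-salt");
console.log(hex.length); // 64 hex characters for a 32-byte key
```

The main thread only awaits a message, so it stays free to serve other requests while the hash is computed.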
Scale across cores with clustering
A single Node process runs your JavaScript on one CPU core. To use all cores, run multiple instances behind a load balancer. In production, prefer a process manager such as PM2 over hand-written code on the raw cluster module; it handles restarts, zero-downtime reloads, and log aggregation.
# Launch one worker per CPU core, restart on crash
pm2 start app.js -i max --name api
pm2 reload api # zero-downtime redeploy
Output (abridged):
[PM2] Starting /srv/app/app.js in cluster_mode (8 instances)
[PM2] Done.
┌────┬──────┬─────────┬───────────┬──────┬───────────┐
│ id │ name │ mode │ status │ cpu │ memory │
├────┼──────┼─────────┼───────────┼──────┼───────────┤
│ 0 │ api │ cluster │ online │ 0% │ 48.2 MB │
│ 1 │ api │ cluster │ online │ 0% │ 47.9 MB │
└────┴──────┴─────────┴───────────┴──────┴───────────┘
Each cluster worker has its own memory, so in-process caches and rate-limit counters are not shared. Move that state into Redis when you scale beyond one process.
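The command-line flags shown earlier in this section can also be kept in a PM2 ecosystem file, so the process layout is versioned with the code. The following is a sketch assuming the same app.js entry point and api name; the memory limit is a made-up value to tune per application.

```javascript
// ecosystem.config.js (PM2 loads this file as CommonJS)
module.exports = {
  apps: [
    {
      name: "api",
      script: "app.js",
      instances: "max",           // one worker per CPU core, same as `-i max`
      exec_mode: "cluster",       // workers share the listening port
      max_memory_restart: "300M", // hypothetical limit: restart a worker above this
      env: { NODE_ENV: "production" },
    },
  ],
};
```

Start it with `pm2 start ecosystem.config.js` and redeploy with `pm2 reload ecosystem.config.js`. If the project's package.json sets "type": "module", the file needs the .cjs extension so it still loads as CommonJS.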
Offload static assets and TLS
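Node is rarely the best tool for serving static files or terminating TLS. Put a reverse proxy (Nginx) or CDN in front of your app to serve static files, terminate HTTPS, and cache responses at the edge. This frees the event loop to handle dynamic requests. If Express must serve static files, let express.static send strong cache headers, and set trust proxy so Express honors the X-Forwarded-* headers from the proxy. A one-year immutable lifetime is only safe for files whose names change with their content (for example, hashed bundle names); otherwise clients keep serving stale copies.
app.set("trust proxy", 1); // trust one proxy hop and honor its X-Forwarded-* headers
app.use(
  express.static("public", {
    maxAge: "1y", // long-lived: only for fingerprinted file names
    immutable: true,
    etag: true,
  })
);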
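A one-year, immutable cache lifetime like the one above only works when a file's URL changes whenever its content changes. Build tools normally take care of this. The sketch below shows the underlying idea with a hypothetical fingerprint helper that embeds a short content hash in the file name.

```javascript
import { createHash } from "node:crypto";
import path from "node:path";

// Build a content-addressed file name, e.g. "app.js" -> "app.<hash>.js".
// fingerprint() is a hypothetical helper; bundlers normally do this for you.
function fingerprint(fileName, content) {
  const hash = createHash("sha256").update(content).digest("hex").slice(0, 8);
  const { name, ext } = path.parse(fileName);
  return `${name}.${hash}${ext}`;
}

const v1 = fingerprint("app.js", "console.log('v1');");
const v2 = fingerprint("app.js", "console.log('v2');");
console.log(v1 !== v2); // true: changed content yields a new URL
```

Because an updated file gets a new name, clients never need to revalidate the old one, which is what makes the immutable directive safe.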
Best Practices
- Mount compression early, or offload it to a reverse proxy in high-traffic systems.
- Add Cache-Control headers to cacheable GETs and use Redis for shared server-side caching.
- Create one connection pool at startup; never open a connection per request.
- Keep handlers async and move CPU-bound work to workers or a job queue so the event loop stays free.
- Run one process per core with PM2 (-i max) and store shared state (cache, sessions, rate limits) in Redis.
- Serve static files and terminate TLS at a CDN or proxy, and set trust proxy so client IPs and protocols are correct.
- Always measure before and after with a load test (autocannon, k6); optimize the slow path, not a guess.