Response Compression
Express sends responses uncompressed by default, which means every byte of JSON, HTML, or CSS travels over the wire exactly as your handler produced it. Text payloads compress extremely well — gzip routinely shrinks them by 60-80% — so enabling compression is one of the highest-leverage, lowest-effort wins for any text-heavy API. The trade-off is CPU: compressing a response costs a small amount of work per request, so the goal is to compress the payloads that benefit while skipping the ones that don’t. This page covers the compression middleware, when compression pays off, how to tune the threshold and algorithm, and why you often want to push the work onto a reverse proxy or CDN instead.
Adding the compression middleware
Express does not ship compression in core, so install the official compression middleware. Mount it early — before your routes and before any middleware that writes a response — so it can wrap the response stream.
npm install compression
const express = require('express');
const compression = require('compression');
const app = express();
app.use(compression()); // compress responses above the default 1 KB threshold
app.get('/users', async (req, res) => {
  const users = await db.query('SELECT id, name, email FROM users');
  res.json(users); // large JSON arrays compress dramatically
});
app.listen(3000);
The middleware inspects the client’s Accept-Encoding header on each request, picks a supported algorithm, compresses the response body as it streams out, and sets the Content-Encoding and Vary: Accept-Encoding response headers for you. Clients that don’t advertise support simply receive the uncompressed body.
Output:
$ curl -s -H "Accept-Encoding: gzip" -D - http://localhost:3000/users -o /dev/null
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Transfer-Encoding: chunked
Tip: In Express 5 the middleware works the same way, but make sure compression is mounted before res.json/res.send is called downstream. If another middleware has already started writing the response, compression can’t wrap it.
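To make the negotiation described above concrete, here is a minimal sketch that uses only Node's built-in zlib module. The `pickEncoding` and `encodeBody` helpers are hypothetical names chosen for illustration. The real middleware does considerably more (q-value handling, streaming, threshold and content-type checks), so treat this as a simplified model rather than its implementation.

```javascript
const zlib = require('zlib');

// Hypothetical helper: choose an encoding from the Accept-Encoding header.
// Simplification: ignores q-values and only knows gzip and deflate.
function pickEncoding(acceptEncoding = '') {
  const offered = acceptEncoding
    .split(',')
    .map((token) => token.split(';')[0].trim().toLowerCase());
  if (offered.includes('gzip')) return 'gzip';
  if (offered.includes('deflate')) return 'deflate';
  return 'identity'; // client did not advertise a supported encoding
}

// Hypothetical helper: compress a body and build the matching headers.
function encodeBody(body, acceptEncoding) {
  const encoding = pickEncoding(acceptEncoding);
  const raw = Buffer.from(body);
  const headers = { Vary: 'Accept-Encoding' };
  if (encoding === 'identity') return { headers, payload: raw };
  headers['Content-Encoding'] = encoding;
  const payload = encoding === 'gzip' ? zlib.gzipSync(raw) : zlib.deflateSync(raw);
  return { headers, payload };
}

const body = JSON.stringify({ users: Array(200).fill({ name: 'Ada', role: 'admin' }) });
const gz = encodeBody(body, 'gzip, deflate, br');
const plain = encodeBody(body, '');
console.log(gz.headers, gz.payload.length, 'bytes vs', plain.payload.length, 'bytes');
```

A client that sends no Accept-Encoding header gets the `identity` branch: the body is returned untouched and no Content-Encoding header is set.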
When to compress
Compression helps text and hurts already-compressed binary. JSON, HTML, XML, CSS, JavaScript, and SVG shrink a lot; JPEG, PNG, WebP, MP4, and ZIP files are already compressed, so running them through gzip burns CPU for a payload that barely shrinks and can even grow by a few bytes. The middleware uses the response Content-Type and size to decide, and you can refine that with a filter function.
app.use(compression({
  filter: (req, res) => {
    // honor an explicit opt-out header
    if (req.headers['x-no-compression']) return false;
    // fall back to the library's default content-type check
    return compression.filter(req, res);
  }
}));
Tiny responses aren’t worth compressing either — the gzip header overhead and CPU cost outweigh the few bytes saved. That’s what the threshold controls.
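The size effects described here are easy to check with Node's built-in zlib. The sketch below is illustrative only: the sample payloads are made up, and random bytes stand in for already-compressed binary, which is an approximation rather than a real JPEG or ZIP.

```javascript
const zlib = require('zlib');
const crypto = require('crypto');

// Report raw vs. gzipped size for a payload.
function gzipStats(label, buffer) {
  const compressed = zlib.gzipSync(buffer);
  return { label, raw: buffer.length, gzipped: compressed.length };
}

// Large, repetitive JSON: the kind of body that benefits most.
const rows = Array.from({ length: 500 }, (_, i) => ({
  id: i,
  name: `user-${i}`,
  email: `user-${i}@example.com`,
}));
const largeJson = gzipStats('large JSON', Buffer.from(JSON.stringify(rows)));

// Tiny JSON: gzip's fixed header and trailer outweigh any savings.
const tinyJson = gzipStats('tiny JSON', Buffer.from(JSON.stringify({ ok: true })));

// Random bytes approximate already-compressed binary (assumption for the demo).
const binary = gzipStats('random bytes', crypto.randomBytes(50000));

console.table([largeJson, tinyJson, binary]);
```

On a run of this script the large JSON row should shrink substantially, while the tiny and random rows come out no smaller than they went in, which is the behavior the threshold and the content-type filter are there to avoid.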
Tuning the threshold and level
The two options you’ll reach for most are threshold (the minimum response size, in bytes, before compression kicks in) and level (the gzip effort, 0-9, trading CPU for ratio).
| Option | Default | Meaning | When to change |
|---|---|---|---|
| threshold | 1024 | Skip bodies smaller than this many bytes | Raise it to avoid compressing small payloads; lower for mostly-small JSON |
| level | -1 (zlib default, ~6) | Compression effort, 0-9 | Lower (1-3) for high-throughput APIs; higher only if CPU is idle |
| memLevel | 8 | Memory used by zlib, 1-9 | Rarely changed |
| filter | content-type check | Decide per-response whether to compress | Skip binary or opt-out responses |
| chunkSize | 16384 | zlib output chunk size | Rarely changed |
app.use(compression({
  threshold: 1024, // don't bother with sub-1 KB responses
  level: 6         // balanced CPU vs. ratio; drop to 1-3 under heavy load
}));
Higher levels give diminishing returns: jumping from level 6 to level 9 might shave a few extra percent off the body while costing noticeably more CPU per request. For a busy API, a lower level often gives better overall latency because each request spends less time compressing.
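One way to see the level trade-off on your own data is to compress a representative payload at a few levels and compare size and time. The sketch below uses Node's built-in zlib directly rather than the middleware; the payload is a made-up list of user records, and the timings depend on your machine, so read the numbers as a rough guide only.

```javascript
const zlib = require('zlib');

// Sample payload: a hypothetical list of user records.
const payload = Buffer.from(JSON.stringify(
  Array.from({ length: 2000 }, (_, i) => ({
    id: i,
    name: `user-${i}`,
    active: i % 3 === 0,
  }))
));

// Compress the same payload at several levels and record size and time.
const results = [1, 6, 9].map((level) => {
  const start = process.hrtime.bigint();
  const compressed = zlib.gzipSync(payload, { level });
  const micros = Number(process.hrtime.bigint() - start) / 1000;
  return { level, bytes: compressed.length, micros: Math.round(micros) };
});

console.log(`raw: ${payload.length} bytes`);
console.table(results);
```

Swap in a real response body from your API before drawing conclusions; a single synchronous run like this is noisy and does not replace a load test.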
What about Brotli?
Brotli typically beats gzip by another 10-20% on text and is supported by every modern browser via Accept-Encoding: br. Releases of the compression middleware before 1.8.0 emit only gzip/deflate; 1.8.0 and later can also emit Brotli on Node versions that support it. Even so, the cleanest path for Brotli is usually to let a reverse proxy or CDN handle it. Node does expose Brotli through zlib (zlib.brotliCompressSync, zlib.createBrotliCompress) if you need to compress a specific payload manually, but for general response compression a proxy is the right layer.
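If you do need to compress one specific payload by hand, Node's zlib Brotli API can be sketched as follows. The sample HTML and the quality value of 5 are assumptions for illustration; zlib's default Brotli quality is 11, which gives the best ratio but is slow for on-the-fly use.

```javascript
const zlib = require('zlib');

// Made-up sample payload: a repetitive HTML fragment.
const html = '<li class="item">Hello, compression!</li>\n'.repeat(500);
const raw = Buffer.from(html);

const gzipped = zlib.gzipSync(raw);
const brotli = zlib.brotliCompressSync(raw, {
  params: {
    // Quality ranges 0-11; 5 is an assumed middle ground for dynamic responses.
    [zlib.constants.BROTLI_PARAM_QUALITY]: 5,
    // Hint that the input is text.
    [zlib.constants.BROTLI_PARAM_MODE]: zlib.constants.BROTLI_MODE_TEXT,
  },
});

console.log({ raw: raw.length, gzip: gzipped.length, brotli: brotli.length });

// Round-trip to confirm the payload survives intact.
const restored = zlib.brotliDecompressSync(brotli).toString();
console.log('round trip ok:', restored === html);
```

A response sent this way also needs `Content-Encoding: br` and `Vary: Accept-Encoding`, and should only go to clients whose Accept-Encoding header includes `br`.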
Offloading to a reverse proxy or CDN
Compressing in Node consumes event-loop and CPU time that could be serving requests. In production it’s common — and usually preferable — to terminate compression at the edge: Nginx, a CDN like Cloudflare or Fastly, or a cloud load balancer compresses responses (including Brotli) using a dedicated, highly optimized implementation, and caches the compressed result so it isn’t recomputed on every hit.
# nginx.conf — compress text responses before they reach the client
gzip on;
gzip_types text/plain application/json application/javascript text/css image/svg+xml;
gzip_min_length 1024;
gzip_comp_level 5;
# Brotli (requires the ngx_brotli module)
brotli on;
brotli_types application/json text/css application/javascript;
When a proxy or CDN is doing compression, disable the compression middleware in Express so you don’t pay the cost twice or double-compress. Keep the Node middleware only when you have no proxy in front of the app (for example, a service called directly by other internal services).
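One way to keep both deployment modes in a single codebase is to mount the middleware behind a switch. In this sketch `mountCompression` is a hypothetical helper and `COMPRESS_IN_NODE` is an assumed environment variable name, not an Express or compression convention. The middleware factory is passed in as an argument so the helper can be exercised without the package installed.

```javascript
// Hypothetical helper: mount compression only when no proxy/CDN handles it.
// COMPRESS_IN_NODE is an assumed environment variable, chosen for this example.
function mountCompression(app, compression, env = process.env) {
  const enabled = env.COMPRESS_IN_NODE === 'true';
  if (enabled) app.use(compression());
  return enabled;
}

// Usage in a real app:
//   const express = require('express');
//   const compression = require('compression');
//   const app = express();
//   mountCompression(app, compression);
```

Behind Nginx or a CDN you would leave the variable unset, and for a service called directly by other internal services you would set it to `true`.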
Warning: Compressing responses that reflect secret data alongside attacker-controlled input can expose you to BREACH-style side-channel attacks. For sensitive endpoints (auth tokens, CSRF-protected forms), consider disabling compression or separating secrets from user-supplied content.
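Disabling compression for sensitive endpoints can be done with the same filter option shown earlier. The following is a sketch under stated assumptions: the path prefixes are hypothetical, `makeBreachAwareFilter` is an illustrative name, and skipping compression is only one mitigation; it does not replace a review of what those responses contain.

```javascript
// Hypothetical list of endpoints whose responses carry secrets.
const SENSITIVE_PREFIXES = ['/login', '/account/token'];

// Build a function for compression({ filter }). `defaultFilter` is expected to
// be compression.filter, passed in so the fallback stays the library's own check.
function makeBreachAwareFilter(defaultFilter) {
  return (req, res) => {
    const sensitive = SENSITIVE_PREFIXES.some((prefix) => req.path.startsWith(prefix));
    if (sensitive) return false; // never compress these responses
    return defaultFilter(req, res);
  };
}

// Usage in a real app:
//   app.use(compression({ filter: makeBreachAwareFilter(compression.filter) }));
```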
Best Practices
- Mount compression() early, before your routes and any middleware that writes the response body.
- Keep the default threshold (around 1 KB) so you don’t waste CPU compressing tiny payloads.
- Don’t compress already-compressed binary (images, video, and archives); use a filter or rely on the content-type default.
- Prefer a lower compression level for high-throughput APIs; the ratio gain from level 9 rarely justifies the extra CPU.
- Offload compression (and Brotli) to Nginx, a load balancer, or a CDN in production, and turn off the Node middleware when you do.
- Be mindful of BREACH-style risks: avoid compressing responses that mix secrets with reflected user input.
- Measure before and after with curl -H "Accept-Encoding: gzip" and real load tests to confirm the latency win.