Node.js Performance Overview
Node.js is fast for the workloads it was designed for — high-concurrency, I/O-bound services — and surprisingly slow for the ones it wasn’t, like number-crunching on a single thread. Understanding why comes down to one architectural fact: a single event loop thread runs all of your JavaScript. Every performance decision in Node ultimately revolves around keeping that thread free to do useful work. This page maps out the major performance concerns — the event loop, CPU-bound work, memory, and I/O — so the deep-dive pages in this section have a shared frame of reference.
Throughput vs latency
Two metrics dominate any performance conversation, and they are not the same thing.
- Throughput is how many operations you complete per second (requests/sec, messages/sec). Node excels here for I/O-bound work because thousands of in-flight requests can wait on sockets and disk simultaneously without one thread per request.
- Latency is how long a single operation takes from start to finish, including any time it spent queued behind other work.
The catch: in Node, high tail latency (p99) is very often a symptom of the event loop being blocked, although slow downstream dependencies can produce it too. One slow synchronous operation delays every other pending callback, so a single CPU-heavy request can spike latency for hundreds of unrelated ones.
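To make the gap between averages and tail latency concrete, here is a minimal sketch that computes nearest-rank percentiles over a set of latency samples. The `percentile` helper and the sample numbers are hypothetical, chosen only to show how a few stalled requests barely move the median yet dominate p99.

```javascript
// Nearest-rank percentile: the smallest sample that has at least p% of
// all samples at or below it.
function percentile(samples, p) {
  if (samples.length === 0) throw new RangeError("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical latencies in ms: 97 fast requests plus 3 that queued
// behind a blocked event loop.
const latencies = [...Array(97).fill(5), 900, 950, 1000];

const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length;
console.log({
  mean,
  p50: percentile(latencies, 50),
  p99: percentile(latencies, 99),
});
// { mean: 33.35, p50: 5, p99: 950 }
```

The median says every request took 5 ms, the mean hints that something is off, and only p99 shows the stall that some callers actually experienced.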
The event loop is the bottleneck that matters
Node runs your JavaScript on a single thread driven by the event loop. While that thread is executing a function, nothing else happens — no new connections are accepted, no timers fire, no I/O callbacks run. This is why “blocking the event loop” is the cardinal sin of Node performance.
import { createServer } from "node:http";
createServer((req, res) => {
  if (req.url === "/blocking") {
    // Synchronous CPU work: freezes the WHOLE server until the loop finishes
    let sum = 0;
    for (let i = 0; i < 5_000_000_000; i++) sum += i;
    res.end(`sum=${sum}`);
  } else {
    res.end("fast");
  }
}).listen(3000);
Output:
$ curl localhost:3000/fast # returns instantly...
fast
$ curl localhost:3000/blocking & # ...but while this runs (several seconds)
$ curl localhost:3000/fast # every other request is stalled
The fix: never do long synchronous work on the main thread. Offload it (worker threads, child processes, or a separate service), or break it into chunks that yield back to the loop.
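As a rough illustration of the chunking option, the sketch below splits a long summation into slices and uses `setImmediate` to hand control back to the event loop between slices. The `sumInChunks` name and the default `chunkSize` are assumptions for this example, not tuned values.

```javascript
// Sum 0..limit-1 in slices, yielding to the event loop between slices.
// chunkSize is an illustrative default, not a tuned value.
function sumInChunks(limit, chunkSize = 1_000_000) {
  return new Promise((resolve) => {
    let sum = 0;
    let i = 0;

    function runChunk() {
      const end = Math.min(i + chunkSize, limit);
      for (; i < end; i++) sum += i;

      if (i < limit) {
        // Queue the next slice so pending I/O callbacks and timers run first.
        setImmediate(runChunk);
      } else {
        resolve(sum);
      }
    }

    runChunk();
  });
}

sumInChunks(10_000_000).then((sum) => console.log(`sum=${sum}`));
// sum=49999995000000
```

Chunking does not reduce the total CPU time. It only bounds how long any single turn of the loop is held, so other requests keep getting served while the computation proceeds.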
Rule of thumb: if a single function call takes longer than a few milliseconds, it does not belong on the event loop. Measure it before you assume it’s cheap.
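One simple way to apply that rule is to time a suspect call directly. This sketch assumes Node 16 or later, where `performance` is available as a global. The `timeSync` helper and its 5 ms default budget are hypothetical, standing in for "a few milliseconds".

```javascript
// Time one synchronous call and warn when it exceeds a budget.
// budgetMs = 5 is an assumed stand-in for "a few milliseconds".
function timeSync(label, fn, budgetMs = 5) {
  const start = performance.now();
  const result = fn();
  const elapsedMs = performance.now() - start;

  if (elapsedMs > budgetMs) {
    console.warn(`${label} took ${elapsedMs.toFixed(1)} ms (budget ${budgetMs} ms)`);
  }
  return { result, elapsedMs };
}

// Example: parsing a large JSON string is a common hidden cost.
const payload = JSON.stringify(
  Array.from({ length: 200_000 }, (_, i) => ({ id: i })),
);
const { elapsedMs } = timeSync("JSON.parse", () => JSON.parse(payload));
console.log(`JSON.parse: ${elapsedMs.toFixed(1)} ms`);
```

A single measurement is noisy, so treat the number as a hint about where to point a profiler rather than as a benchmark result.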
CPU-bound vs I/O-bound work
The single biggest predictor of whether Node is the right tool is the ratio of CPU work to I/O work.
| Workload | Example | Node fit |
|---|---|---|
| I/O-bound | API gateways, proxies, real-time chat, CRUD APIs | Excellent — concurrency is nearly free |
| Mixed | Templating, light transforms over DB results | Good — watch for hot loops |
| CPU-bound | Image processing, crypto hashing, ML inference | Poor on the main thread — use workers or native addons |
For CPU-bound tasks, the libuv thread pool (used by the asynchronous crypto, zlib, and fs APIs) moves some work off the main thread, and worker_threads lets you run your own JavaScript in parallel. Work that cannot be parallelized either way is a sign you may want a native module or a different runtime.
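The following is a minimal sketch of moving a hot loop into a worker thread. The `sumInWorker` helper is hypothetical. The worker body is passed as an inline string with `eval: true` purely to keep the example in one file, whereas a real project would more likely point `Worker` at a separate module.

```javascript
// Worker body, run on its own thread. With `eval: true` the string is
// executed as a CommonJS script, so it uses require().
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  let sum = 0;
  for (let i = 0; i < workerData.limit; i++) sum += i;
  parentPort.postMessage(sum);
`;

async function sumInWorker(limit) {
  // Dynamic import so the sketch runs from both CommonJS and ES modules.
  const { Worker } = await import("node:worker_threads");

  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { limit },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

// The main thread stays free to serve requests while the worker computes.
sumInWorker(10_000_000).then((sum) => console.log(`sum=${sum}`));
// sum=49999995000000
```

Starting a worker has a real cost, so for frequent small jobs a long-lived pool of workers is usually a better fit than one worker per call.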
Memory and garbage collection
Node uses V8’s garbage collector, which periodically pauses your code to reclaim memory. Small, short-lived allocations are cheap; large heaps and high allocation rates are not, because GC pauses also block the event loop. Common memory pitfalls:
- Leaks from unbounded caches, growing arrays, or listeners that are never removed.
- Heap pressure from buffering large payloads in memory instead of streaming them.
- Large object churn that forces frequent major GC cycles.
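The first pitfall, an unbounded cache, is usually avoided by capping the number of entries. Below is a small sketch of a least-recently-used cache built on `Map`, which iterates keys in insertion order. `BoundedCache` is a hypothetical class written for this page, not a Node API.

```javascript
// A small least-recently-used cache built on Map insertion order.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }

  get size() {
    return this.map.size;
  }
}

const cache = new BoundedCache(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a"); // "a" is now the most recently used
cache.set("c", 3); // over capacity, so "b" is evicted
console.log(cache.get("b"), cache.size); // undefined 2
```

Because the size never exceeds `maxEntries`, the cache's memory footprint stays bounded no matter how many distinct keys pass through it.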
// Inspect memory at runtime
const { rss, heapUsed, heapTotal } = process.memoryUsage();
console.log({
  rssMB: (rss / 1024 / 1024).toFixed(1),
  heapUsedMB: (heapUsed / 1024 / 1024).toFixed(1),
  heapTotalMB: (heapTotal / 1024 / 1024).toFixed(1),
});
Output:
{ rssMB: '78.3', heapUsedMB: '12.1', heapTotalMB: '20.5' }
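V8 caps the old-space heap by default. The exact limit depends on the Node version and the memory available to the process: older 64-bit releases defaulted to roughly 1.4 GB, while newer releases scale the limit with system memory, typically to a few gigabytes. Raise it with --max-old-space-size=4096 (the value is in MiB) only after confirming the growth is legitimate, not a leak.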
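Rather than relying on a remembered default, you can ask the running process what limit it is actually using. This sketch reads `heap_size_limit` from `v8.getHeapStatistics()`. That figure covers the whole V8 heap, of which old space is normally the largest part, and `heapLimitMB` is a hypothetical helper name.

```javascript
// Report the heap limit V8 is enforcing for this process, in MiB.
async function heapLimitMB() {
  // Dynamic import so the sketch runs from both CommonJS and ES modules.
  const v8 = await import("node:v8");
  return v8.getHeapStatistics().heap_size_limit / 1024 / 1024;
}

heapLimitMB().then((mb) => console.log(`heap limit: ${mb.toFixed(0)} MiB`));
```

Running it once with and once without `--max-old-space-size` is a quick way to confirm that the flag took effect in your deployment.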
I/O: where Node shines
Asynchronous I/O is Node’s superpower. Because reads and writes don’t block the thread, a single process can keep tens of thousands of connections open. To stay on the fast path:
- Use streams for large files and responses so you never hold the whole payload in memory.
- Pool database and HTTP connections instead of opening one per request.
- Cache expensive results to turn repeated I/O into cheap memory lookups.
- Run independent async operations concurrently with Promise.all rather than awaiting them in sequence.
// Sequential: total time = sum of both
const a = await fetch("https://api.example.com/a");
const b = await fetch("https://api.example.com/b");
// Concurrent: total time = the slower of the two
const [c, d] = await Promise.all([
  fetch("https://api.example.com/c"),
  fetch("https://api.example.com/d"),
]);
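Promise.all starts every operation at once, which is fine for two requests but can overwhelm a connection pool or a downstream API when the list is long. One possible approach is to cap the number of operations in flight. The `mapWithLimit` and `delay` helpers below are hypothetical, and the timers stand in for real I/O.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results come back in input order.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  async function runner() {
    while (next < items.length) {
      const index = next++; // claimed synchronously, so runners never share an index
      results[index] = await worker(items[index], index);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, runner);
  await Promise.all(runners);
  return results;
}

// Stand-in for an I/O call: resolves with `value` after `ms` milliseconds.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(resolve, ms, value));

mapWithLimit([1, 2, 3, 4, 5], 2, (n) => delay(10, n * 2)).then(console.log);
// [ 2, 4, 6, 8, 10 ]
```

As with Promise.all, the returned promise rejects on the first failure. Calls that are already in flight still run to completion, so any cleanup has to be handled inside `worker`.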
Measure before you optimize
Guessing at bottlenecks wastes time. Start with the built-in tooling:
- node --prof plus node --prof-process for a CPU profile.
- --inspect with Chrome DevTools for interactive CPU profiles and heap snapshots, or node --cpu-prof / --heap-prof to write CPU and heap profiles to disk for flame-graph viewing.
- perf_hooks (performance.now(), PerformanceObserver) for in-app timing.
- process.hrtime.bigint() for nanosecond-resolution microbenchmarks.
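For in-app timing, a useful number to track is event-loop lag: how late a timer fires compared with when it was scheduled. The sketch below measures it with `process.hrtime.bigint()`. `measureLoopLag` is a hypothetical helper, and the busy-wait exists only to provoke visible lag. perf_hooks also provides `monitorEventLoopDelay()` for a histogram-based version of the same idea.

```javascript
// Sample event-loop lag: schedule a timer and record how late it fires.
function measureLoopLag(intervalMs = 20) {
  return new Promise((resolve) => {
    const start = process.hrtime.bigint();
    setTimeout(() => {
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      resolve(Math.max(0, elapsedMs - intervalMs));
    }, intervalMs);
  });
}

// Demo: block the loop for about 100 ms right after scheduling the probe.
const probe = measureLoopLag();
const blockUntil = Date.now() + 100;
while (Date.now() < blockUntil) {
  // busy-wait, for illustration only
}
probe.then((lagMs) => console.log(`event-loop lag: ${lagMs.toFixed(0)} ms`));
```

On an idle process the reported lag should be close to zero. Sampling it on an interval and exporting the value as a metric is one way to catch blocking code in production.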
Best Practices
- Keep the event loop free: never run long synchronous work on the main thread.
- Classify each task as CPU-bound or I/O-bound and route CPU-heavy work to worker threads or native code.
- Stream large data instead of buffering it; pool and cache I/O wherever you can.
- Run independent async work concurrently with Promise.all, not one await at a time.
- Watch p99 latency and event-loop lag, not just average throughput; tail latency reveals blocking.
- Profile with real tooling before changing code; optimize the proven hot path, not your guess.