Avoiding Event Loop Blocking
Node.js runs your JavaScript on a single thread, so every callback, promise continuation, and timer takes its turn on one shared event loop. When a single piece of synchronous code runs for too long (parsing a giant JSON payload, hashing a password with too many rounds, or looping over a million records), nothing else can run. Incoming requests queue up, timers fire late, and health checks time out. A function that takes 200 ms doesn’t slow down just one request; it can add up to 200 ms of latency to every request waiting for its turn on the loop. Keeping the loop free is one of the most important performance skills in Node.
Why one slow function hurts everything
The event loop processes work in phases (timers, I/O callbacks, setImmediate, etc.) and only moves to the next item when the current callback returns. Asynchronous I/O — disk reads, network calls, database queries — is offloaded to the libuv thread pool or the OS and does not occupy the loop while it waits. Pure CPU work has nowhere to go: it runs inline and holds the thread hostage.
import { createServer } from "node:http";
createServer((req, res) => {
  if (req.url === "/block") {
    // Synchronous busy loop: blocks the event loop for ~2 seconds
    const end = Date.now() + 2000;
    while (Date.now() < end) {}
  }
  res.end("ok\n");
}).listen(3000);
Output:
$ curl localhost:3000/block & # start the 2-second busy loop in the background
$ curl localhost:3000/fast # normally instant, but now it queues behind /block
# /fast returns only after ~2s because the loop is busy
The /fast request has no CPU work of its own, yet it is delayed because the loop is stuck inside /block.
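The same effect can be reproduced without an HTTP server. The sketch below schedules a timer, blocks the loop synchronously, and reports how late the timer fired. The helper names `blockFor` and `measureLateness` are illustrative, not Node APIs, and the exact numbers will vary by machine.

```javascript
// Busy-wait for `ms` milliseconds; nothing else can run on this thread meanwhile.
function blockFor(ms) {
  const end = performance.now() + ms;
  while (performance.now() < end) {}
}

// Schedule a timer, then block synchronously. Resolves with how many
// milliseconds later than requested the timer callback actually ran.
function measureLateness(timerMs, blockMs) {
  return new Promise((resolve) => {
    const scheduled = performance.now();
    setTimeout(() => {
      // Lateness = actual wait minus the delay we asked for.
      resolve(performance.now() - scheduled - timerMs);
    }, timerMs);
    blockFor(blockMs); // runs before the timer gets a chance to fire
  });
}

measureLateness(10, 200).then((late) => {
  console.log(`10 ms timer fired about ${late.toFixed(0)} ms late`);
});
```

The timer has no work of its own, yet its lateness tracks the length of the blocking call, which is the same thing that happens to /fast above.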
Detecting blocking
You cannot fix what you cannot see. The clearest signal is event loop delay — how late timers fire relative to when they were scheduled. Node exposes this directly via perf_hooks.
import { monitorEventLoopDelay } from "node:perf_hooks";
const h = monitorEventLoopDelay({ resolution: 20 });
h.enable();
setInterval(() => {
  // Values are in nanoseconds
  console.log(`p99 loop delay: ${(h.percentile(99) / 1e6).toFixed(1)} ms`);
  h.reset();
}, 1000);
Output:
p99 loop delay: 1.4 ms
p99 loop delay: 2.1 ms
p99 loop delay: 2014.7 ms <-- a blocking task ran here
Other useful tools:
| Tool | What it shows | When to use |
|---|---|---|
| monitorEventLoopDelay | Live histogram of loop lag | Production metrics, alerting |
| --prof + node --prof-process | V8 sampling profiler output | Finding the hot function |
| node --cpu-prof | .cpuprofile for Chrome DevTools | Visual flame graphs |
| clinic flame / 0x | Flame graphs from real load | Pinpointing CPU hotspots |
A sustained p99 event loop delay above roughly 50-100 ms usually means synchronous work on the main thread: most often CPU-bound code, sometimes synchronous I/O or long garbage-collection pauses. Alert on it; do not wait for users to report slowness.
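One way to turn that threshold into an alert is a small periodic check, sketched below. The names `watchLoopDelay`, `budgetMs`, `intervalMs`, and `onBreach` are hypothetical. The function accepts any object that exposes `percentile(p)` in nanoseconds and `reset()`, which is the shape of the histogram returned by monitorEventLoopDelay; a stand-in object is used here so the example runs on its own.

```javascript
// Check the p99 delay once per interval and call onBreach when it exceeds
// the budget. Returns a function that stops the checks.
function watchLoopDelay(histogram, { budgetMs = 100, intervalMs = 1000, onBreach }) {
  const timer = setInterval(() => {
    const p99Ms = histogram.percentile(99) / 1e6; // nanoseconds to milliseconds
    histogram.reset(); // start a fresh window for the next check
    if (p99Ms > budgetMs) onBreach(p99Ms);
  }, intervalMs);
  return () => clearInterval(timer);
}

// Stand-in histogram (assumption for the demo). In a service, pass the
// enabled histogram from monitorEventLoopDelay() instead.
const fakeHistogram = { percentile: () => 250e6, reset() {} };

const stop = watchLoopDelay(fakeHistogram, {
  budgetMs: 100,
  intervalMs: 50,
  onBreach: (p99Ms) => {
    console.warn(`event loop p99 delay ${p99Ms.toFixed(0)} ms exceeds budget`);
    stop(); // the demo stops after the first alert
  },
});
```

In production the `onBreach` callback would more likely increment a metric or page someone than log, and the budget should come from your own latency targets rather than the defaults assumed here.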
Offloading to worker threads
When a task is genuinely CPU-heavy and unavoidable — image resizing, cryptography, compression, large-scale parsing — move it off the main thread with the worker_threads module. Each worker has its own V8 instance and event loop, so the work runs truly in parallel without touching your request-handling loop.
// pool.js — run CPU work in a worker, await the result
import { Worker } from "node:worker_threads";
export function runTask(workerData) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(new URL("./worker.js", import.meta.url), {
      workerData,
    });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`Worker exited with ${code}`));
    });
  });
}
// worker.js — the heavy computation lives here
import { workerData, parentPort } from "node:worker_threads";
import { createHash } from "node:crypto";
let hash = workerData;
for (let i = 0; i < 1_000_000; i++) {
  hash = createHash("sha256").update(hash).digest("hex");
}
parentPort.postMessage(hash);
The main thread stays responsive while worker.js grinds through a million hash rounds. In a real service, keep a fixed pool of workers (one per CPU core) rather than spawning a new one per request — worker startup costs tens of milliseconds. Libraries like piscina provide a battle-tested pool, or you can build one around the snippet above.
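A fixed pool along those lines might look like the following minimal sketch. It is not how piscina works internally; `createPool`, `run`, `close`, and `workerSource` are names invented for this example. The worker code is inlined as a string with `eval: true` only so that the sketch fits in one file, and a crashed worker is not replaced, which a real pool would need to handle.

```javascript
import { Worker } from "node:worker_threads";
import { cpus } from "node:os";

// Worker body, inlined for the demo. Eval'd workers run as CommonJS, hence require().
const workerSource = `
  const { parentPort } = require("node:worker_threads");
  const { createHash } = require("node:crypto");
  parentPort.on("message", ({ input, rounds }) => {
    let hash = input;
    for (let i = 0; i < rounds; i++) {
      hash = createHash("sha256").update(hash).digest("hex");
    }
    parentPort.postMessage(hash);
  });
`;

function createPool(size = Math.max(1, cpus().length)) {
  const workers = []; // every worker, for shutdown
  const idle = [];    // workers ready to take a task
  const queue = [];   // tasks waiting for a free worker

  for (let i = 0; i < size; i++) {
    const worker = new Worker(workerSource, { eval: true });
    workers.push(worker);
    idle.push(worker);
  }

  // Hand queued tasks to idle workers until one of the two runs out.
  function dispatch() {
    while (idle.length > 0 && queue.length > 0) {
      const worker = idle.pop();
      const { task, resolve, reject } = queue.shift();
      const cleanup = () => {
        worker.off("message", onMessage);
        worker.off("error", onError);
      };
      const onMessage = (result) => {
        cleanup();
        idle.push(worker); // worker is free again
        resolve(result);
        dispatch();
      };
      const onError = (err) => {
        cleanup();
        reject(err); // simplification: the failed worker is not replaced
      };
      worker.once("message", onMessage);
      worker.once("error", onError);
      worker.postMessage(task);
    }
  }

  return {
    run(task) {
      return new Promise((resolve, reject) => {
        queue.push({ task, resolve, reject });
        dispatch();
      });
    },
    close() {
      return Promise.all(workers.map((w) => w.terminate()));
    },
  };
}

const pool = createPool(2);
pool
  .run({ input: "seed", rounds: 1000 })
  .then((hash) => console.log(`digest length: ${hash.length}`)) // hex SHA-256: 64 chars
  .finally(() => pool.close());
```

Each worker handles one task at a time, so at most `size` CPU-heavy jobs run concurrently and extra requests wait in the queue instead of spawning more threads.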
Breaking up long tasks
Not every long task deserves a worker. If the work is mostly synchronous JavaScript over a large collection, you can yield the loop periodically so other callbacks get a turn. Slice the work into batches and hand control back with setImmediate between batches.
// Process a huge array without starving the loop
async function processInBatches(items, batchSize, handle) {
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    for (const item of batch) handle(item);
    // Yield: let queued I/O and timers run before the next batch
    await new Promise((resolve) => setImmediate(resolve));
  }
}
await processInBatches(records, 1000, (r) => transform(r));
setImmediate runs its callback in the check phase, right after the loop has polled for I/O, so pending requests are serviced between batches. Prefer it over setTimeout(fn, 0), which Node clamps to a minimum delay of 1 ms and which therefore adds avoidable latency to every batch.
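The cost difference between the two yield mechanisms is easy to measure. The sketch below times a series of yields with each one; `yieldWith`, `timeYields`, and `compareYields` are illustrative helper names, and the absolute timings depend on the machine and Node version.

```javascript
// Wrap a scheduling function (setImmediate, setTimeout, ...) in a promise.
function yieldWith(schedule) {
  return new Promise((resolve) => schedule(resolve));
}

// Time how long `count` consecutive yields take with the given scheduler.
async function timeYields(schedule, count) {
  const start = performance.now();
  for (let i = 0; i < count; i++) await yieldWith(schedule);
  return performance.now() - start;
}

async function compareYields(count = 100) {
  const immediateMs = await timeYields((fn) => setImmediate(fn), count);
  const timeoutMs = await timeYields((fn) => setTimeout(fn, 0), count);
  return { immediateMs, timeoutMs };
}

compareYields().then(({ immediateMs, timeoutMs }) => {
  console.log(`setImmediate:  ${immediateMs.toFixed(1)} ms for 100 yields`);
  console.log(`setTimeout(0): ${timeoutMs.toFixed(1)} ms for 100 yields`);
});
```

Because each setTimeout(fn, 0) waits for at least the clamped delay, its total grows with the number of batches, while setImmediate yields cost little more than one loop iteration each.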
A few common offenders and their fixes:
| Blocking pattern | Fix |
|---|---|
| fs.readFileSync in a request path | Use await fs.promises.readFile (async) |
| JSON.parse on multi-MB payloads | Stream-parse, or do it in a worker |
| crypto.pbkdf2Sync / bcrypt sync | Use the async variants |
| Tight loops over big arrays | Batch with setImmediate, or use a worker |
| Synchronous template/regex on huge input | Bound input size; offload heavy cases |
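As an example of the third row, the asynchronous crypto.pbkdf2 does its key derivation on the libuv thread pool rather than on the main thread. The sketch below wraps it with util.promisify; the function names `hashPassword` and `verifyPassword`, and the iteration count, key length, and digest, are illustrative assumptions rather than recommended security settings.

```javascript
import { pbkdf2, randomBytes, timingSafeEqual } from "node:crypto";
import { promisify } from "node:util";

const pbkdf2Async = promisify(pbkdf2);

// Illustrative parameters; choose values that match your own security policy.
const ITERATIONS = 100_000;
const KEY_LENGTH = 32;
const DIGEST = "sha256";

// Derive a key without blocking the event loop while the hashing runs.
async function hashPassword(password) {
  const salt = randomBytes(16);
  const key = await pbkdf2Async(password, salt, ITERATIONS, KEY_LENGTH, DIGEST);
  return { salt, key };
}

// Re-derive with the stored salt and compare in constant time.
async function verifyPassword(password, { salt, key }) {
  const candidate = await pbkdf2Async(password, salt, ITERATIONS, KEY_LENGTH, DIGEST);
  return timingSafeEqual(candidate, key);
}

hashPassword("correct horse").then(async (record) => {
  console.log(await verifyPassword("correct horse", record)); // true
  console.log(await verifyPassword("wrong guess", record)); // false
});
```

The loop stays free to serve other requests during each derivation, although heavy concurrent use can still saturate the thread pool itself.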
Best Practices
- Treat the event loop as a shared resource: any synchronous function over ~10 ms is a latency tax on every concurrent request.
- Monitor monitorEventLoopDelay p99 in production and alert when it crosses your latency budget.
- Always prefer the asynchronous form of core APIs (fs.readFile, crypto.pbkdf2, zlib.gzip) over their *Sync counterparts in hot paths.
- Offload genuinely CPU-bound work to a pooled set of worker_threads, sized to your core count; never spawn a worker per request.
- Break large synchronous loops into batches and yield with setImmediate so I/O and timers stay responsive.
- Profile before optimizing: use --cpu-prof or clinic flame to find the actual hot function instead of guessing.
- Cap the size of untrusted input (request bodies, uploads) so a single payload can’t monopolize the loop.
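
The last practice, capping untrusted input, can be sketched as a body reader that gives up once a byte limit is exceeded. The name `readBodyWithLimit` and the `statusCode` property on the error are assumptions made for this example. The function accepts any async iterable of Buffer chunks, which includes the request object in a node:http handler; a small generator stands in for a request here.

```javascript
// Collect a request body, rejecting as soon as it grows past maxBytes so an
// oversized payload never reaches JSON.parse or other synchronous processing.
async function readBodyWithLimit(stream, maxBytes) {
  const chunks = [];
  let total = 0;
  for await (const chunk of stream) {
    total += chunk.length;
    if (total > maxBytes) {
      const err = new Error(`Body exceeds ${maxBytes} bytes`);
      err.statusCode = 413; // Payload Too Large
      throw err;
    }
    chunks.push(chunk);
  }
  return Buffer.concat(chunks);
}

// Stand-in for an incoming request body (assumption for the demo).
async function* fakeBody() {
  yield Buffer.from("hello ");
  yield Buffer.from("world");
}

readBodyWithLimit(fakeBody(), 1024).then((body) => {
  console.log(body.toString()); // hello world
});
```

Throwing inside the loop stops the iteration early, so the remaining chunks are never buffered or parsed.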