Designing Resilient APIs: Timeouts, Retries, and Backpressure
The three patterns that separate APIs that survive production from the ones that fall over at the first traffic spike — with concrete defaults you can ship today.

Every API works on a developer’s laptop. The interesting question is what happens when a downstream dependency gets slow, a deploy doubles latency, or a batch job floods you with ten times the usual traffic. Resilience is not a library you install — it’s a set of deliberate decisions. Here are the three that matter most.
1. Every network call needs a timeout
One of the most common production incidents is a thread (or event-loop task) blocked forever on a call that will never return. Without a timeout, one slow dependency cascades: connections pile up, pools exhaust, and your healthy service starts failing too.
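    // Java: set timeouts explicitly. The defaults are almost always "infinite".
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.time.Duration;

    HttpClient client = HttpClient.newBuilder()
        .connectTimeout(Duration.ofSeconds(2)) // give up if the connection is not established in 2s
        .build();
    HttpRequest request = HttpRequest.newBuilder(URI.create(url))
        .timeout(Duration.ofSeconds(3)) // per-request timeout: HttpTimeoutException if no response arrives in time
        .build();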
A good rule of thumb: set a timeout to a small multiple of your p99 latency, not your average. If p99 is 200ms, a 1s timeout (5× p99) leaves generous headroom while still failing fast.
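To make the rule of thumb concrete, here is a minimal sketch that derives a timeout from recorded latency samples. The helper names (`percentile`, `timeoutFromP99`), the nearest-rank percentile method, the 5× multiplier, and the floor and ceiling values are all assumptions for illustration, not part of any library.

```typescript
// Nearest-rank percentile: the smallest sample that is greater than or
// equal to p percent of all samples.
function percentile(samplesMs: readonly number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no latency samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const clamped = Math.min(sorted.length, Math.max(1, rank));
  return sorted[clamped - 1] as number;
}

// Hypothetical helper: timeout = multiplier × p99, clamped to sane bounds.
// The multiplier, floor, and ceiling defaults are assumed values.
function timeoutFromP99(
  samplesMs: readonly number[],
  multiplier = 5,
  floorMs = 100,
  ceilingMs = 10000
): number {
  const p99 = percentile(samplesMs, 99);
  return Math.min(ceilingMs, Math.max(floorMs, Math.round(p99 * multiplier)));
}

// 98 fast calls, one slow call, one very slow call.
const latencySamplesMs: number[] = [...new Array<number>(98).fill(50), 200, 900];
const meanMs = latencySamplesMs.reduce((sum, x) => sum + x, 0) / latencySamplesMs.length;
console.log(meanMs, percentile(latencySamplesMs, 99), timeoutFromP99(latencySamplesMs));
```

In this sample the mean is 60ms while p99 is 200ms, so a timeout sized from the average would cut off requests that are slow but healthy.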
2. Retry — but only safe operations, with jitter
Retries turn transient blips into successes. They also turn a small outage into a self-inflicted DDoS if you do them wrong.
Retry idempotent operations only. Never blindly retry a `POST` that creates a resource unless you have an idempotency key.
Use exponential backoff with full jitter so retries don’t synchronize into a thundering herd:
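    // Calls fn up to `attempts` times in total (3 = one initial call plus two retries).
    async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
      for (let i = 0; ; i++) {
        try {
          return await fn();
        } catch (err) {
          // Out of attempts: surface the last error to the caller.
          if (i >= attempts - 1) throw err;
          const base = 100 * 2 ** i; // 100ms, 200ms, 400ms, ...
          const delay = Math.random() * base; // full jitter
          await new Promise((r) => setTimeout(r, delay));
        }
      }
    }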
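The retry helper above retries whatever it is given, so the "idempotent only" rule has to be enforced by the caller. One possible gate is sketched below. The function name `isSafeToRetry` is hypothetical, and the `Idempotency-Key` header is a widely used convention rather than a guarantee: it only makes a retry safe if the server actually deduplicates on it.

```typescript
// Methods that HTTP semantics (RFC 9110) define as idempotent.
const IDEMPOTENT_METHODS = new Set(["GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"]);

// Hypothetical gate: a request may be retried if its method is idempotent,
// or if it carries a non-empty Idempotency-Key header (assumed convention).
function isSafeToRetry(method: string, headers: Record<string, string> = {}): boolean {
  if (IDEMPOTENT_METHODS.has(method.toUpperCase())) return true;
  // Header names are case-insensitive, so compare in lowercase.
  return Object.keys(headers).some(
    (name) => name.toLowerCase() === "idempotency-key" && (headers[name] ?? "") !== ""
  );
}

console.log(isSafeToRetry("GET")); // true
console.log(isSafeToRetry("POST")); // false
console.log(isSafeToRetry("POST", { "Idempotency-Key": "order-123" })); // true
```

A caller would check this before wrapping a request in `withRetry`, and send non-retryable requests exactly once.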
3. Backpressure: shed load before it sheds you
When you cannot keep up, the worst thing to do is accept everything and queue it indefinitely. Bounded queues, concurrency limits, and load shedding let you degrade gracefully instead of collapsing.
- Bounded concurrency: cap in-flight requests to a dependency.
- Reject early: return `429` or `503` when a queue is full. A fast failure is recoverable; a slow timeout is not.
- Circuit breakers: stop calling a dependency that’s clearly down and give it room to recover.
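The first two items in the list can be sketched together as a small in-process limiter that rejects instead of queueing. The class name `ConcurrencyLimiter`, the `handle` wrapper, and the choice of `503` are assumptions for illustration; a real service would usually attach this to its framework's middleware.

```typescript
class ConcurrencyLimiter {
  private inFlight = 0;
  private readonly maxInFlight: number;

  constructor(maxInFlight: number) {
    this.maxInFlight = maxInFlight;
  }

  // Returns a release function if a slot is free, or null if the caller
  // should shed the request instead of waiting.
  tryAcquire(): (() => void) | null {
    if (this.inFlight >= this.maxInFlight) return null;
    this.inFlight++;
    let released = false;
    return () => {
      if (released) return; // releasing twice must not free two slots
      released = true;
      this.inFlight--;
    };
  }

  get active(): number {
    return this.inFlight;
  }
}

// Hypothetical wrapper: fail fast with 503 when the limiter is saturated.
async function handle<T>(
  limiter: ConcurrencyLimiter,
  work: () => Promise<T>
): Promise<{ status: number; body?: T }> {
  const release = limiter.tryAcquire();
  if (release === null) return { status: 503 };
  try {
    return { status: 200, body: await work() };
  } finally {
    release(); // free the slot whether the work succeeded or threw
  }
}
```

There is deliberately no wait queue here: when all slots are taken, the caller gets an immediate answer and can back off or try another replica.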
Sensible defaults to start with
| Concern | Default |
|---|---|
| Connect timeout | 1–2s |
| Read timeout | 2–5× p99 |
| Retries | 2 (idempotent only) |
| Backoff | Exponential + full jitter |
| Circuit breaker | Trip at 50% errors over 10s |
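As an illustration of the circuit-breaker row, the sketch below trips when at least half of the calls in a 10-second sliding window have failed. The class `CircuitBreaker`, the minimum call count, the 5-second cooldown, and the injectable clock are assumptions, and the recovery step is simplified: after the cooldown it closes fully rather than probing with a single trial call as many production breakers do.

```typescript
interface BreakerOptions {
  windowMs?: number;       // sliding window length
  errorThreshold?: number; // failure ratio that trips the breaker
  minCalls?: number;       // assumed guard: ignore windows with too few calls
  cooldownMs?: number;     // assumed time to stay open before allowing calls
  now?: () => number;      // injectable clock, useful in tests
}

class CircuitBreaker {
  private outcomes: { at: number; ok: boolean }[] = [];
  private openedAt: number | null = null;
  private readonly windowMs: number;
  private readonly errorThreshold: number;
  private readonly minCalls: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(options: BreakerOptions = {}) {
    this.windowMs = options.windowMs ?? 10000;
    this.errorThreshold = options.errorThreshold ?? 0.5;
    this.minCalls = options.minCalls ?? 10;
    this.cooldownMs = options.cooldownMs ?? 5000;
    this.now = options.now ?? (() => Date.now());
  }

  // Ask before each call to the dependency.
  allowRequest(): boolean {
    if (this.openedAt === null) return true;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      // Simplified recovery: close and start a fresh window.
      this.openedAt = null;
      this.outcomes = [];
      return true;
    }
    return false;
  }

  // Report the result of each call that was allowed through.
  record(ok: boolean): void {
    const t = this.now();
    this.outcomes.push({ at: t, ok });
    this.outcomes = this.outcomes.filter((o) => t - o.at < this.windowMs);
    const failures = this.outcomes.filter((o) => !o.ok).length;
    const total = this.outcomes.length;
    if (total >= this.minCalls && failures / total >= this.errorThreshold) {
      this.openedAt = t;
    }
  }
}
```

The minimum call count keeps a single failed request on a quiet service from opening the breaker.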
Resilience compounds. Add timeouts first, then retries, then backpressure — and you’ll have an API that bends under load instead of breaking.