Idempotence in Depth

Producer idempotence is the feature that turns risky retries into safe ones. Without it, a producer that retries after a network hiccup can write the same record twice, silently corrupting downstream aggregates and counts. With idempotence enabled, the broker recognizes and discards those duplicates, giving you exactly-once delivery into a partition within a single producer session. Understanding the mechanism — and its boundaries — is essential before you rely on it in production.

Why retries create duplicates

A producer sends a batch, the broker appends it to the log and replicates it, then sends back an acknowledgement. If that acknowledgement is lost on the way back (a timeout, a broken connection, a leader failover), the producer has no way to know whether the write actually succeeded. Its only safe option is to retry. If the original write did succeed, the retry produces a duplicate.

This is the classic “did my message land?” ambiguity. Idempotence resolves it by giving the broker enough state to recognize a retried batch and append it only once.

The mechanism: PID, epoch, and sequence numbers

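When idempotence is enabled, the producer registers with the broker and is assigned a Producer ID (PID), a unique numeric identity for that producer session. Alongside the PID is a producer epoch, a monotonically increasing number that lets the broker reject requests carrying a stale epoch. Fencing zombie instances this way matters mostly for transactional producers, where a restarted instance reuses the same transactional.id and bumps the epoch; a plain idempotent producer simply receives a new PID when it restarts.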

For every (PID, partition) pair, the producer attaches a monotonically increasing sequence number to each batch, starting at 0. The broker tracks the last sequence number it successfully appended for each (PID, partition). On each incoming batch it compares:

Incoming sequence  | Broker’s last seen | Broker action
last + 1           | last               | Append the batch, advance the counter
<= last (a retry)  | last               | Recognized as a duplicate; discard, return success
> last + 1 (a gap) | last               | Reject with OutOfOrderSequenceException

Because the sequence number travels with the batch and the broker remembers the high-water mark, a retried batch carries a sequence the broker has already seen — so it is silently dropped while still returning a successful ack. The producer never learns a duplicate occurred, and the log contains the record exactly once.

Producer (PID=42, epoch=0)
  -> partition 0: seq 0, 1, 2 ...
  -> partition 1: seq 0, 1, 2 ...

Broker memory per (PID, partition):
  (42, 0) -> last appended seq = 2
  (42, 1) -> last appended seq = 2

Retry of (42, 0, seq 2)  =>  seq 2 <= 2  =>  duplicate, dropped
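
The comparison in the table can be sketched as a small in-memory model. This is a toy illustration under the article's simplified per-batch view, not Kafka's broker code: the class name SequenceCheckSketch, the Outcome enum, and the string map key are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the broker-side sequence check; not Kafka's actual implementation.
class SequenceCheckSketch {
    enum Outcome { APPENDED, DUPLICATE, OUT_OF_ORDER }

    // Last appended sequence per "pid:partition"; an absent key means nothing appended yet.
    private final Map<String, Integer> lastSeq = new HashMap<>();

    Outcome onBatch(long pid, int partition, int seq) {
        String key = pid + ":" + partition;
        int last = lastSeq.getOrDefault(key, -1); // -1 so the first expected sequence is 0
        if (seq == last + 1) {
            lastSeq.put(key, seq);                // append and advance the counter
            return Outcome.APPENDED;
        }
        if (seq <= last) {
            return Outcome.DUPLICATE;             // acknowledged again, not appended again
        }
        return Outcome.OUT_OF_ORDER;              // a gap; a real broker rejects the batch
    }

    public static void main(String[] args) {
        SequenceCheckSketch broker = new SequenceCheckSketch();
        for (int seq = 0; seq <= 2; seq++) {
            System.out.println("seq " + seq + " -> " + broker.onBatch(42L, 0, seq));
        }
        System.out.println("retry seq 2 -> " + broker.onBatch(42L, 0, 2));
        System.out.println("seq 5 -> " + broker.onBatch(42L, 0, 5));
    }
}
```

Running main should report APPENDED for sequences 0 to 2, DUPLICATE for the retry of sequence 2, and OUT_OF_ORDER for sequence 5. Each partition keeps its own counter, so the same PID can be at different sequences on different partitions.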

The dedup window

The broker does not remember every sequence number forever. It keeps metadata for only the last five batches per (PID, partition) in memory. This is why max.in.flight.requests.per.connection must be 5 or less when idempotence is on: the broker can only recognize a retry, and keep batches in order, while the original batch is still inside that sliding window. With more than five requests in flight, a retry could fall outside the window, defeating dedup and risking reordering, so the client does not allow a higher value on an idempotent producer.
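
To make the sliding window concrete, here is a minimal sketch of a bounded window for a single (PID, partition) pair. It only models the eviction behaviour described above; the class DedupWindowSketch and its methods are hypothetical, and the real broker tracks more per batch than a bare sequence number.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a bounded dedup window for ONE (PID, partition) pair.
class DedupWindowSketch {
    static final int WINDOW = 5; // the five-batch limit described above

    private final Deque<Integer> recent = new ArrayDeque<>();

    // Record an appended batch; evict the oldest entry once the window is full.
    void remember(int seq) {
        if (recent.size() == WINDOW) {
            recent.removeFirst();
        }
        recent.addLast(seq);
    }

    // A retry can only be matched while its sequence is still inside the window.
    boolean recognizesRetry(int seq) {
        return recent.contains(seq);
    }

    public static void main(String[] args) {
        DedupWindowSketch window = new DedupWindowSketch();
        for (int seq = 0; seq <= 6; seq++) {
            window.remember(seq); // after the loop the window holds 2..6
        }
        System.out.println("retry of seq 6 recognized: " + window.recognizesRetry(6));
        System.out.println("retry of seq 1 recognized: " + window.recognizesRetry(1));
    }
}
```

In this model a retry of sequence 6 is still recognized, while a retry of sequence 1 has already been evicted. Capping in-flight requests at five keeps every outstanding batch inside the window.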

Enabling it

In Kafka clients 3.0 and later, idempotence is on by default. To be explicit:

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
// Turn on PID + sequence tracking.
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
// Implied/required when idempotence is true:
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);

props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

In Spring Boot, set it in application.yml:

spring:
  kafka:
    producer:
      acks: all
      retries: 2147483647
      properties:
        enable.idempotence: true
        max.in.flight.requests.per.connection: 5

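Gotcha: Idempotence requires acks=all. If you explicitly set acks=1 or acks=0 alongside enable.idempotence=true, the producer throws a ConfigException at startup rather than silently weakening the guarantee. If you set a conflicting acks but leave enable.idempotence at its default, recent 3.x clients instead turn idempotence off, so set the flag explicitly whenever you depend on it.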
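
If you would rather catch a conflicting setting before the producer is even constructed, a small pre-flight check along these lines is one option. It is a sketch using only java.util.Properties: the class IdempotenceConfigCheck and its conflicts method are hypothetical helpers, not part of the Kafka client, and the authoritative validation still happens inside the producer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical pre-flight check; the real validation lives in the Kafka client.
class IdempotenceConfigCheck {
    static List<String> conflicts(Properties props) {
        List<String> problems = new ArrayList<>();
        Object idem = props.get("enable.idempotence");
        if (idem == null || !Boolean.parseBoolean(idem.toString())) {
            return problems; // idempotence not explicitly requested; nothing to check
        }
        Object acks = props.get("acks");
        // "-1" is the numeric spelling of acks=all.
        if (acks != null && !acks.toString().equals("all") && !acks.toString().equals("-1")) {
            problems.add("acks must be all, found " + acks);
        }
        Object inFlight = props.get("max.in.flight.requests.per.connection");
        if (inFlight != null && Integer.parseInt(inFlight.toString()) > 5) {
            problems.add("max.in.flight.requests.per.connection must be <= 5, found " + inFlight);
        }
        Object retries = props.get("retries");
        if (retries != null && Integer.parseInt(retries.toString()) == 0) {
            problems.add("retries must be greater than 0");
        }
        return problems;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("enable.idempotence", "true");
        props.put("acks", "1");
        props.put("max.in.flight.requests.per.connection", "10");
        // Reports one problem for acks and one for the in-flight limit.
        conflicts(props).forEach(System.out::println);
    }
}
```

A check like this can run in a unit test over your configuration, so a bad override fails the build instead of failing at application startup.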

What it guarantees — and what it does not

Idempotence guarantees that, within a single producer session, records are written to each partition exactly once and in order, even across arbitrary retries. That is its entire scope.

It does not survive a producer restart. When a producer process dies and a new one starts, it gets a brand-new PID. The broker has no way to associate the new session with the old one, so a record the previous session sent — and which the application “thinks” failed — can be sent again by the new session and appended as a fresh record.

Session A (PID=42): sends order-123 -> appended at offset 100
Session A crashes before recording success
Session B (PID=99): re-sends order-123 -> appended at offset 250  (DUPLICATE)

It also does not deduplicate across different producers writing the same logical event, nor does it make a multi-partition or multi-topic write atomic. For cross-session and multi-partition atomicity you need transactions layered on top of idempotence.
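
The restart scenario can be reproduced with a toy single-partition log that deduplicates by (PID, sequence) only. The class RestartDuplicateSketch is a hypothetical model for illustration; it ignores sequence gaps and the dedup window so the effect of a new PID stands out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy single-partition log that dedups by (PID, sequence) only.
class RestartDuplicateSketch {
    private final List<String> log = new ArrayList<>();
    private final Map<Long, Integer> lastSeqByPid = new HashMap<>();

    // Returns true if the record was appended, false if it was dropped as a retry.
    boolean send(long pid, int seq, String record) {
        int last = lastSeqByPid.getOrDefault(pid, -1);
        if (seq <= last) {
            return false; // same PID, sequence already seen
        }
        lastSeqByPid.put(pid, seq);
        log.add(record);
        return true;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        RestartDuplicateSketch broker = new RestartDuplicateSketch();
        broker.send(42L, 0, "order-123"); // session A: appended
        broker.send(42L, 0, "order-123"); // retry inside session A: dropped
        broker.send(99L, 0, "order-123"); // session B after restart: new PID, appended again
        System.out.println(broker.log()); // the log now holds order-123 twice
    }
}
```

The retry inside session A is filtered, but session B starts a fresh sequence under a new PID, so the broker has nothing to compare it against.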

Requirements recap

Requirement                           | Value                   | Reason
enable.idempotence                    | true                    | Turns on PID + sequence tracking
acks                                  | all                     | Broker must confirm replication before dedup is meaningful
max.in.flight.requests.per.connection | <= 5                    | Bounded by the broker’s dedup window
retries                               | > 0 (default MAX_VALUE) | Idempotence is only useful when retries can happen

Best Practices

  • Leave enable.idempotence=true (the default) on every producer; it is cheap and removes a whole class of subtle data bugs.
  • Never set acks to anything but all, or max.in.flight.requests.per.connection above 5, on an idempotent producer. Depending on whether enable.idempotence was set explicitly, the producer either fails at startup or quietly runs without idempotence.
  • Treat idempotence as a single-session tool; if you need duplicate-free delivery across restarts, reach for transactions and a stable transactional.id.
  • Add a business-level idempotency key (order ID, event ID) for consumers, so cross-session duplicates from a producer restart can still be filtered downstream.
  • Monitor for OutOfOrderSequenceException in producer logs — it usually signals message loss upstream or a misconfigured in-flight limit, not a transient blip.
  • Don’t assume idempotence makes a multi-partition write atomic; that is the job of transactions.
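
The business-level idempotency key mentioned in the list above can be sketched on the consumer side as follows. This is a minimal illustration, assuming the event ID (for example an order ID) has already been extracted from each record; the class BusinessKeyDedupSketch is hypothetical, and the in-memory set stands in for whatever durable store a real consumer would need to survive its own restarts.

```java
import java.util.HashSet;
import java.util.Set;

// Consumer-side dedup by business key; the in-memory set is an assumption for brevity.
class BusinessKeyDedupSketch {
    private final Set<String> seen = new HashSet<>();

    // Returns true the first time an event ID is seen, false for any repeat.
    boolean firstTime(String eventId) {
        return seen.add(eventId);
    }

    public static void main(String[] args) {
        BusinessKeyDedupSketch dedup = new BusinessKeyDedupSketch();
        String[] eventIds = {"order-123", "order-124", "order-123"};
        for (String id : eventIds) {
            if (dedup.firstTime(id)) {
                System.out.println("processing " + id);
            } else {
                System.out.println("skipping duplicate " + id);
            }
        }
    }
}
```

Because the key comes from the event itself rather than from the producer session, this filter still works when the duplicate was written under a different PID.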
Last updated June 1, 2026