Log Compaction
Most Kafka topics throw data away after a time or size limit, but some data is better thought of as state rather than a stream of events — the current address for a user, the latest configuration for a service, the committed offset for a consumer group. Log compaction is the storage strategy that preserves the most recent value for every key indefinitely, while still reclaiming space from keys that have been overwritten. It turns a topic into a durable, replayable key-value snapshot, which is exactly what powers KTable changelogs, the internal __consumer_offsets topic, and event-sourced aggregates.
How compaction differs from retention
By default a topic uses cleanup.policy=delete, which prunes whole log segments once they exceed retention.ms or retention.bytes. Compaction is a different policy entirely. With cleanup.policy=compact, Kafka guarantees that for any key written to the topic, a consumer reading from the beginning will always see at least the last value written for that key. Older values for the same key are eligible to be garbage-collected, but the latest one is never removed by compaction.
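The two policies are not mutually exclusive. Setting cleanup.policy=compact,delete keeps the latest value per key and enforces a retention window, so even compacted records eventually age out. Kafka Streams uses this combination for the changelog topics of windowed state stores. By default, __consumer_offsets itself uses plain compact; expired group offsets are removed by tombstones that the group coordinator writes after offsets.retention.minutes.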
| Policy | Behaviour | Typical use |
|---|---|---|
| delete | Remove old segments by time/size | Event streams, logs, metrics |
| compact | Keep latest value per key forever | Changelogs, config, snapshots, __consumer_offsets |
| compact,delete | Latest per key, but also age out | Windowed-store changelogs, bounded state |
Tombstones: deleting a key
To delete a key from a compacted topic you produce a record with that key and a null value, known as a tombstone. The compactor keeps the tombstone for delete.retention.ms (default 24 hours) so that consumers have time to observe the deletion, then physically removes both the tombstone and any prior values for the key. A consumer that lags by more than that window can miss the deletion.
// A record with a key and a null value is a tombstone for that key.
ProducerRecord<String, String> tombstone =
    new ProducerRecord<>("user-profiles", "user-42", null);
producer.send(tombstone);
On a topic that uses cleanup.policy=compact alone, a tombstone is the only way for a producer to remove a key. Simply ceasing to produce a key never deletes its last value; that is the whole point of compaction.
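On the consuming side, a tombstone has to be interpreted as a delete rather than as a value. The following is a minimal sketch of that idea using only the Java standard library; `TombstoneAwareCache` and its methods are hypothetical names introduced for illustration, not part of any Kafka API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not a Kafka API): folds consumed records into current state per key. */
final class TombstoneAwareCache {
    private final Map<String, String> state = new LinkedHashMap<>();

    /** Applies one consumed record: a null value is a tombstone and removes the key. */
    void apply(String key, String value) {
        if (value == null) {
            state.remove(key);
        } else {
            state.put(key, value);
        }
    }

    /** Read-only view of the latest value per live key. */
    Map<String, String> snapshot() {
        return Collections.unmodifiableMap(state);
    }

    public static void main(String[] args) {
        TombstoneAwareCache cache = new TombstoneAwareCache();
        cache.apply("user-42", "12 Elm St");
        cache.apply("user-7", "9 Oak Ave");
        cache.apply("user-42", null); // tombstone: user-42 is removed
        System.out.println(cache.snapshot()); // prints {user-7=9 Oak Ave}
    }
}
```

In a real consumer loop the key and value would presumably come from `ConsumerRecord.key()` and `ConsumerRecord.value()`; the sample addresses here are made up.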
The dirty and clean log
A compacted partition is split into two regions. The clean region has already been compacted and contains at most one record per key. The dirty (or “head”) region holds records appended since the last compaction and may contain duplicate keys. The active segment is always dirty and is never compacted, so the most recent writes are off-limits to the cleaner.
A background pool of log cleaner threads selects partitions to clean. It builds an in-memory offset map of the most recent offset per key in the dirty region, then rewrites the segments, dropping any record whose offset is lower than the map’s entry for that key.
Before compaction (dirty log):
offset:  0   1   2   3   4   5   6
key:     A   B   A   C   B   A   C
value:   a1  b1  a2  c1  b2  a3  c2
After compaction (clean log):
offset:  4   5   6
key:     B   A   C
value:   b2  a3  c2
(a1, a2, b1 and c1 were superseded and removed)
Note that offsets are not renumbered — consumers may see gaps where superseded records used to be, which is normal and expected.
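The two-pass idea (build an offset map, then rewrite) can be sketched in plain Java. This is an illustrative simulation under simplifying assumptions, not the broker's implementation: it treats the whole log as cleanable, ignores tombstone retention and segment boundaries, and assumes entries arrive in ascending offset order. `CompactionSketch` and `Entry` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CompactionSketch {
    /** Simplified log entry; real records also carry timestamps and headers. */
    static final class Entry {
        final long offset;
        final String key;
        final String value;

        Entry(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return offset + ":" + key + "=" + value;
        }
    }

    /** Pass 1: record the latest offset seen for each key (log is in offset order). */
    static Map<String, Long> buildOffsetMap(List<Entry> log) {
        Map<String, Long> latest = new HashMap<>();
        for (Entry e : log) {
            latest.put(e.key, e.offset); // later entries overwrite earlier ones
        }
        return latest;
    }

    /** Pass 2: drop entries older than the map's offset for their key; offsets are kept as-is. */
    static List<Entry> compact(List<Entry> log) {
        Map<String, Long> latest = buildOffsetMap(log);
        List<Entry> cleaned = new ArrayList<>();
        for (Entry e : log) {
            if (e.offset >= latest.get(e.key)) {
                cleaned.add(e);
            }
        }
        return cleaned;
    }

    /** The seven-record sequence A, B, A, C, B, A, C at offsets 0 to 6. */
    static List<Entry> exampleLog() {
        return List.of(
                new Entry(0, "A", "a1"), new Entry(1, "B", "b1"),
                new Entry(2, "A", "a2"), new Entry(3, "C", "c1"),
                new Entry(4, "B", "b2"), new Entry(5, "A", "a3"),
                new Entry(6, "C", "c2"));
    }

    public static void main(String[] args) {
        System.out.println(compact(exampleLog())); // prints [4:B=b2, 5:A=a3, 6:C=c2]
    }
}
```

The surviving entries keep offsets 4, 5 and 6, which is why a consumer reading the compacted log sees gaps instead of renumbered offsets.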
Triggering compaction: the dirty ratio
The cleaner does not run continuously on every partition; it prioritises the partition with the highest dirty ratio (the fraction of the log, by bytes, that is uncompacted). A partition becomes eligible once its dirty ratio exceeds min.cleanable.dirty.ratio (default 0.5).
| Config | Default | Purpose |
|---|---|---|
| min.cleanable.dirty.ratio | 0.5 | Dirty fraction needed before cleaning |
| min.compaction.lag.ms | 0 | Minimum age before a record can be compacted |
| max.compaction.lag.ms | Long.MAX_VALUE (9223372036854775807) | Maximum time a record may stay uncompacted; forces cleaning even if the ratio is low |
| delete.retention.ms | 86400000 (24 hours) | How long tombstones survive |
| segment.ms / segment.bytes | 604800000 (7 days) / 1073741824 (1 GiB) | Roll segments so the head becomes cleanable |
Lowering min.cleanable.dirty.ratio makes compaction more aggressive (less wasted space, more I/O); raising it batches more work per pass. Use min.compaction.lag.ms when consumers need a guaranteed window to read every intermediate update before it can be collapsed.
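To make the threshold concrete, here is a small sketch of the dirty-ratio arithmetic and of picking the dirtiest eligible partition. It is an approximation for illustration only: it ignores min.compaction.lag.ms and max.compaction.lag.ms, and `DirtyRatioSketch`, its methods, and the partition names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

final class DirtyRatioSketch {
    /** Dirty ratio = dirty bytes / (clean bytes + dirty bytes). */
    static double dirtyRatio(long cleanBytes, long dirtyBytes) {
        long total = cleanBytes + dirtyBytes;
        return total == 0 ? 0.0 : (double) dirtyBytes / total;
    }

    /** A partition is eligible once its dirty ratio exceeds the configured minimum. */
    static boolean isCleanable(long cleanBytes, long dirtyBytes, double minCleanableDirtyRatio) {
        return dirtyRatio(cleanBytes, dirtyBytes) > minCleanableDirtyRatio;
    }

    /** Among eligible partitions, choose the one with the highest dirty ratio. */
    static Optional<String> pickNext(Map<String, Double> dirtyRatios, double minCleanableDirtyRatio) {
        return dirtyRatios.entrySet().stream()
                .filter(e -> e.getValue() > minCleanableDirtyRatio)
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // 400 dirty bytes out of 1000 total gives a ratio of 0.4.
        System.out.println(isCleanable(600, 400, 0.5)); // prints false
        System.out.println(isCleanable(600, 400, 0.1)); // prints true
    }
}
```

With the default of 0.5 the 40%-dirty partition is skipped, while the more aggressive 0.1 setting makes it eligible at the cost of more frequent cleaning I/O.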
Creating and configuring a compacted topic
kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic user-profiles \
  --partitions 6 --replication-factor 3 \
  --config cleanup.policy=compact \
  --config min.cleanable.dirty.ratio=0.1 \
  --config delete.retention.ms=3600000
You can verify and tune an existing topic with kafka-configs.sh:
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name user-profiles --describe
Output:
Dynamic configs for topic user-profiles are:
cleanup.policy=compact sensitive=false synonyms={...}
min.cleanable.dirty.ratio=0.1 sensitive=false synonyms={...}
delete.retention.ms=3600000 sensitive=false synonyms={...}
In Spring Boot you can declare the same topic as a bean so it is created automatically:
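import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;

// Named KafkaTopicConfiguration so it does not shadow Kafka's TopicConfig constants class.
@Configuration
public class KafkaTopicConfiguration {
    @Bean
    NewTopic userProfiles() {
        return TopicBuilder.name("user-profiles")
                .partitions(6)
                .replicas(3)
                .config(TopicConfig.CLEANUP_POLICY_CONFIG,
                        TopicConfig.CLEANUP_POLICY_COMPACT)
                .config(TopicConfig.MIN_CLEANABLE_DIRTY_RATIO_CONFIG, "0.1")
                .config(TopicConfig.DELETE_RETENTION_MS_CONFIG, "3600000")
                .build();
    }
}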
Use cases
- Kafka Streams changelogs: every KTable and state store is backed by a compacted changelog topic so it can be rebuilt on restart or rebalance from the latest value per key.
- __consumer_offsets: the internal topic is compacted and keyed by (group, topic, partition) so the broker always knows each group's latest committed offset.
- Configuration / feature-flag topics: services bootstrap by reading the topic to the end of the log and caching the current value per key.
- Event sourcing snapshots: store the materialised latest state per aggregate id while keeping the immutable event stream on a separate delete topic.
Compaction only works when records are keyed. A record with a null key cannot be compacted, and the broker rejects any attempt to produce a keyless record to a compacted topic.
Best Practices
- Always produce records with a meaningful, stable key; compaction works per key, and keyless records are rejected on a compacted topic.
- Use compact,delete with a sensible retention.ms when state can grow unbounded over time; pure compact keeps every live key forever.
- Set delete.retention.ms longer than your slowest consumer's worst-case lag so no consumer misses a tombstone.
- Lower min.cleanable.dirty.ratio for hot key-value topics to bound disk usage; leave the default for low-churn config topics.
- Size segments (segment.ms, segment.bytes) deliberately: the active segment is never compacted, so an overly large head delays cleaning.
- Monitor the LogCleaner metrics (max-dirty-percent, cleaner thread health); a dead cleaner silently lets disk usage grow.
- Treat null-valued records as deletes everywhere in your consumers, including stream processors and sink connectors.