Topics, Partitions & Offsets

Topics, partitions, and offsets are the three concepts that everything else in Kafka is built on. A topic is the logical name you produce to and consume from; a partition is the physical, append-only log that actually stores the records and gives Kafka its parallelism; and an offset is the position of a record within a partition. Get the mental model of these three right and the rest of Kafka — scaling, ordering, replication, consumer groups — follows naturally.

Topics as named streams

A topic is a named, durable stream of records. Producers write records to a topic and consumers read from it, with no direct coupling between the two — a topic can have many producers and many independent consumers, and records are retained for a configurable time regardless of whether anyone has read them yet. This decoupling is what makes Kafka an event log rather than a queue: reading a record does not remove it.

A topic is purely a logical grouping. The actual storage and scalability come from the partitions underneath it. You create a topic with an explicit partition count and replication factor:

kafka-topics.sh --bootstrap-server broker1:9092 \
  --create --topic orders \
  --partitions 3 --replication-factor 3

Partitions: the unit of parallelism

Each topic is split into one or more partitions, and a partition is an ordered, immutable, append-only log. New records are appended only at the end; existing records are never modified in place. Partitions are what Kafka actually distributes across brokers and what allows a single topic to scale far beyond one machine’s throughput.

Partitions matter because they are the unit of parallelism on every axis:

Storage: a topic’s data is spread across its partitions, which can be placed on different brokers, so a topic’s capacity is not limited to a single machine’s disk.
Write throughput: producers can write to all partitions of a topic concurrently.
Consumption: within a consumer group, each partition is consumed by exactly one consumer, so the maximum useful consumer parallelism equals the partition count.

Because a partition lives entirely on its leader broker (with followers replicating it), all ordering and offset guarantees are scoped to a single partition — never to the topic as a whole.
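The consumption limit can be illustrated with a toy assignment. This is only a sketch: ToyAssignor is a hypothetical class, and its simple modulo rule stands in for Kafka’s real assignment strategies, which are more involved. It shows that with three partitions a fourth consumer in the same group receives nothing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: Kafka's real assignors are more involved.
// This class just spreads partitions evenly over the consumers of one group.
final class ToyAssignor {

    // Assigns partition p to the consumer at index (p mod number of consumers).
    // Assumes the consumer list is not empty.
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Three partitions, four consumers: the fourth consumer stays idle.
        System.out.println(assign(List.of("c1", "c2", "c3", "c4"), 3));
        // prints {c1=[0], c2=[1], c3=[2], c4=[]}

        // Three partitions, two consumers: one consumer owns two partitions.
        System.out.println(assign(List.of("c1", "c2"), 3));
        // prints {c1=[0, 2], c2=[1]}
    }
}
```

Whatever the strategy, no partition is ever shared by two consumers of the same group, which is why extra consumers beyond the partition count add no throughput.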

Offsets: position within a partition

Every record appended to a partition is assigned a monotonically increasing 64-bit integer called its offset. The offset is unique within that partition and reflects the order of arrival: the first record is at offset 0, the next at 1, and so on. Offsets are never reused or reassigned, even after older records are deleted by retention, so the earliest offset still available in a partition can be greater than 0.

A record’s full identity is therefore the triple (topic, partition, offset). The same offset value, say 5, exists independently in every partition and refers to completely different records. Consumers track their progress by committing offsets: by convention the committed offset is the offset of the next record to read (the last processed offset plus one), so a consumer can resume from exactly where it left off after a restart or rebalance.
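The offset rules can be made concrete with a small in-memory model of one partition. This is a sketch under stated assumptions, not how a broker stores data: ToyPartitionLog and its methods are hypothetical names, and dropOldest merely imitates retention.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy in-memory model of a single partition; not how a broker stores data.
final class ToyPartitionLog {
    private final Deque<String> records = new ArrayDeque<>();
    private long logStartOffset = 0; // offset of the oldest retained record
    private long nextOffset = 0;     // offset the next append will receive

    // Appends at the tail and returns the offset assigned to the record.
    long append(String value) {
        records.addLast(value);
        return nextOffset++;
    }

    // Imitates retention: drops the oldest records without renumbering the rest.
    void dropOldest(int count) {
        for (int i = 0; i < count && !records.isEmpty(); i++) {
            records.removeFirst();
            logStartOffset++;
        }
    }

    long logStartOffset() { return logStartOffset; }

    long endOffset() { return nextOffset; }

    public static void main(String[] args) {
        ToyPartitionLog log = new ToyPartitionLog();
        log.append("A");                 // offset 0
        log.append("B");                 // offset 1
        long last = log.append("C");     // offset 2
        log.dropOldest(2);               // retention removes A and B
        long next = log.append("D");     // offset 3, not 0: offsets are never reused
        // A consumer that has finished processing offset `last` commits last + 1.
        long committed = last + 1;
        System.out.println(log.logStartOffset() + " " + next + " " + committed);
        // prints 2 3 3
    }
}
```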

A topic with partitions and offsets

The diagram below shows the topic orders with three partitions. Each partition is an independent log with its own offset sequence, and producers append at the tail (just past the highest existing offset) of whichever partition a record is routed to.

 Topic: orders
 ┌──────────────────────────────────────────────┐
 │ Partition 0   offset: 0   1   2   3   4 ──► append
 │               recs:  [A] [B] [C] [D] [E]
 ├──────────────────────────────────────────────┤
 │ Partition 1   offset: 0   1   2 ──► append
 │               recs:  [F] [G] [H]
 ├──────────────────────────────────────────────┤
 │ Partition 2   offset: 0   1   2   3 ──► append
 │               recs:  [I] [J] [K] [L]
 └──────────────────────────────────────────────┘
 offsets are unique PER PARTITION — offset 2 means
 a different record in each of the three partitions

Ordering: per-partition, not global

Kafka guarantees ordering within a single partition only. If two records are written to partition 0, the one with the lower offset is always read first by any consumer. There is no global ordering across partitions — once orders has three partitions, Kafka makes no promise about the relative order of a record in partition 0 versus one in partition 1.

This is the single most important trade-off in Kafka’s design. More partitions buy you parallelism and throughput, but they cost you total ordering. The practical consequence: if a set of records must be processed in order relative to each other (for example, all events for one customer or one order), they must land in the same partition. You control that by choosing a partition key: with the default partitioner, records with the same key are routed to the same partition for as long as the topic’s partition count stays the same.

// Same key ("order-42") => same partition => ordered relative to each other.
// props is assumed to hold bootstrap.servers plus String key/value serializers.
try (var producer = new KafkaProducer<String, String>(props)) {
    producer.send(new ProducerRecord<>("orders", "order-42", "CREATED"));
    producer.send(new ProducerRecord<>("orders", "order-42", "PAID"));
    producer.send(new ProducerRecord<>("orders", "order-42", "SHIPPED"));
    producer.flush(); // block until all three sends have completed
}
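Why the same key lands in the same partition can be sketched as a hash of the key taken modulo the partition count. The class below is a toy with hypothetical names: Kafka’s default partitioner hashes the serialized key bytes with murmur2, and Arrays.hashCode is used here only as a stand-in, so the partition numbers it prints will not match what a real cluster chooses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Toy partitioner. Kafka's default partitioner hashes the serialized key
// bytes with murmur2; Arrays.hashCode is only a stand-in here.
final class ToyPartitioner {

    static int partitionFor(String key, int partitionCount) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result in [0, partitionCount) even for negative hashes.
        return Math.floorMod(Arrays.hashCode(keyBytes), partitionCount);
    }

    public static void main(String[] args) {
        // Equal keys always produce the same partition for a fixed count.
        for (String key : new String[] {"order-42", "order-42", "order-7"}) {
            System.out.printf("%s -> partition %d of 3%n", key, partitionFor(key, 3));
        }
        // The mapping depends on the partition count, so the same key may
        // move to another partition when the count changes.
        System.out.printf("order-42 -> partition %d of 4%n", partitionFor("order-42", 4));
    }
}
```

Because the partition count is part of the calculation, a key’s partition is only stable while that count is unchanged.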

Tip: If you genuinely need strict global ordering for an entire topic, you must use a single partition — which caps you at one consumer per group and one broker’s throughput. Prefer per-key ordering with a sensible key instead; it is almost always the right granularity.

You can inspect partition boundaries and offsets directly with the CLI:

kafka-get-offsets.sh --bootstrap-server broker1:9092 --topic orders

Output:

orders:0:5
orders:1:3
orders:2:4

Each line is topic:partition:end-offset, where the end offset is the offset that will be assigned to the next record appended to that partition.
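If that output needs to be processed in code, the three-field format is easy to parse. The helper below is a hypothetical sketch that assumes every non-blank line has exactly the topic:partition:end-offset shape shown above.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: parses lines shaped like "orders:0:5".
final class EndOffsets {

    // Returns partition -> end offset for the given output lines.
    static Map<Integer, Long> parse(String output) {
        Map<Integer, Long> endOffsets = new TreeMap<>();
        for (String line : output.split("\\R")) {
            if (line.isBlank()) {
                continue; // skip empty lines
            }
            // parts[0] = topic, parts[1] = partition, parts[2] = end offset
            String[] parts = line.trim().split(":");
            endOffsets.put(Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
        }
        return endOffsets;
    }

    public static void main(String[] args) {
        Map<Integer, Long> ends = parse("orders:0:5\norders:1:3\norders:2:4");
        long total = ends.values().stream().mapToLong(Long::longValue).sum();
        System.out.println(ends + " total=" + total);
        // prints {0=5, 1=3, 2=4} total=12
    }
}
```

The total matches the number of records still stored only if nothing has been removed by retention, since end offsets keep growing after old records are deleted.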

Concept	Scope	Guarantee
Topic	Logical name	Groups partitions; no ordering of its own
Partition	One broker (leader)	Append-only, strict order by offset
Offset	Within one partition	Unique, monotonically increasing, never reused
Ordering	Per partition	Total order inside a partition, none across partitions

Reading offsets in Spring Boot

Spring for Apache Kafka surfaces the partition and offset of every record on the ConsumerRecord, which is invaluable for logging, idempotency, and debugging ordering issues.

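import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class OrderListener {

    @KafkaListener(topics = "orders", groupId = "orders-service")
    public void onOrder(ConsumerRecord<String, String> record) {
        // Each record carries its exact position: (topic, partition, offset)
        System.out.printf("topic=%s p=%d offset=%d key=%s value=%s%n",
                record.topic(), record.partition(), record.offset(),
                record.key(), record.value());
    }
}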
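One way to use that position for idempotency is to remember the highest offset already handled per partition and skip anything at or below it. The class below is a minimal sketch with hypothetical names; it keeps state in memory only, whereas a real service would persist it together with the processing result, and it relies on records from one partition arriving in offset order.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory guard against replayed offsets.
final class OffsetDeduplicator {
    // Highest offset already processed, keyed by "topic:partition".
    private final Map<String, Long> lastProcessed = new HashMap<>();

    // Returns true if the record is new and should be processed,
    // false if it is a replay of an offset that was already handled.
    boolean markIfNew(String topic, int partition, long offset) {
        String key = topic + ":" + partition;
        Long last = lastProcessed.get(key);
        if (last != null && offset <= last) {
            return false;
        }
        lastProcessed.put(key, offset);
        return true;
    }

    public static void main(String[] args) {
        OffsetDeduplicator dedup = new OffsetDeduplicator();
        System.out.println(dedup.markIfNew("orders", 0, 5)); // true
        System.out.println(dedup.markIfNew("orders", 0, 5)); // false: same offset replayed
        System.out.println(dedup.markIfNew("orders", 1, 5)); // true: different partition
    }
}
```

Inside a listener, the arguments would come from record.topic(), record.partition(), and record.offset().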

Best Practices

Choose a partition count that gives you headroom for consumer parallelism — you can never have more active consumers in a group than partitions.
Use a meaningful partition key (customer id, order id, account id) so records that must stay ordered share a partition.
Treat ordering as per-key, not per-topic; design around the fact that there is no global order across partitions.
Track progress with (topic, partition, offset) and make consumers idempotent, since at-least-once delivery can replay an offset.
Over-provision partitions modestly at creation time; adding partitions later changes key-to-partition routing and breaks existing key ordering.
Reserve single-partition topics for the rare cases that truly require strict total ordering, and accept the throughput ceiling that comes with them.