Message Keys & Partitioning

Every record a producer sends lands in exactly one partition of a topic, and the partition it lands in is the single most important decision in your data model. The key you attach to a record controls that placement: it determines ordering guarantees, how evenly load spreads across brokers and consumers, and whether related events stay together. Get keys right and you get clean per-entity ordering for free; get them wrong and you get hot partitions, lagging consumers, and broken ordering that is painful to fix after the fact.

How the default partitioner works

When you call send(), the producer decides the target partition before the record is batched and transmitted. Kafka’s built-in DefaultPartitioner (and its modern successor, the built-in partitioning logic in KafkaProducer) follows three rules:

Explicit partition: if you pass a partition number to the ProducerRecord, it is used verbatim.
Keyed record: if the key is non-null, the partition is toPositive(murmur2(serializedKey)) % numPartitions, where toPositive clears the sign bit so the result is never negative. The same key therefore always maps to the same partition (as long as the partition count is unchanged).
Null-key record: if the key is null, the producer uses sticky partitioning. It keeps sending to one partition until the current batch is complete, then switches to another partition for the next batch, spreading load while keeping batches large.

The hash is computed over the serialized key bytes, not the Java object, so your key serializer matters. Two keys that are .equals() but serialize to different bytes can land on different partitions.

key = "user-42"  ->  murmur2(bytes) = 0x9F3A...  ->  % 6  ->  partition 3
key = "user-42"  ->  (same bytes)               ->  % 6  ->  partition 3   (always)
key = null       ->  sticky batch               ->  partition 0, then 4, ... (changes per batch, not per record)
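
The keyed-record rule can be reproduced outside the producer. The following is a minimal standalone sketch modeled on the murmur2 hash in Kafka's Java client; the class and method names (KeyPartitionSketch, partitionFor) are hypothetical, and the hash should be checked against your client version before you rely on the partition numbers it prints. It also illustrates why the serializer matters: the string "42" and the long 42 serialize to different bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: a standalone sketch of keyed partitioning, not Kafka's own class.
final class KeyPartitionSketch {

    // 32-bit murmur2, modeled on the hash used by Kafka's Java client
    // (assumption: verify against the client version you run).
    static int murmur2(byte[] data) {
        int length = data.length;
        final int seed = 0x9747b28c;
        final int m = 0x5bd1e995;
        final int r = 24;
        int h = seed ^ length;

        // Mix four bytes at a time, read as a little-endian int.
        for (int i = 0; i + 4 <= length; i += 4) {
            int k = (data[i] & 0xff)
                  | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16)
                  | ((data[i + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }

        // Fold in the trailing 1-3 bytes (the cases fall through on purpose).
        int tail = length & ~3;
        switch (length % 4) {
            case 3: h ^= (data[tail + 2] & 0xff) << 16;
            case 2: h ^= (data[tail + 1] & 0xff) << 8;
            case 1: h ^= (data[tail] & 0xff);
                    h *= m;
        }

        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    // Clear the sign bit, then take the remainder: always in [0, numPartitions).
    static int partitionFor(byte[] serializedKey, int numPartitions) {
        return (murmur2(serializedKey) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The "same" logical key, serialized two different ways.
        byte[] asString = "42".getBytes(StandardCharsets.UTF_8);               // 2 bytes
        byte[] asLong = ByteBuffer.allocate(Long.BYTES).putLong(42L).array();  // 8 bytes, big-endian

        System.out.println("same bytes? " + Arrays.equals(asString, asLong));
        System.out.println("\"42\" as String -> partition " + partitionFor(asString, 6));
        System.out.println("42 as Long     -> partition " + partitionFor(asLong, 6));
    }
}
```

Because the input is the byte array, switching a topic's key serializer (for example from a string to a long encoding) can move existing keys to different partitions even though the logical key has not changed.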

Same key, same partition, same order

Kafka guarantees ordering only within a partition. Because a given key always hashes to one partition, all records that a single producer sends with that key are appended in send order and consumed in that same order. This is how you get per-entity ordering without global coordination: key by the entity whose timeline must stay consistent.

For an order-management system, keying by orderId guarantees that OrderCreated, OrderPaid, and OrderShipped for the same order are processed in sequence, even while millions of unrelated orders are processed in parallel across other partitions.

Ordering is per partition, never per topic. If you need a strict global order across all records, you need a single-partition topic, which limits consumption to one active consumer per consumer group. Almost always, key-scoped ordering is what you actually want.
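
To make the per-key guarantee concrete, here is a small in-memory sketch. It does not talk to Kafka: the partitioned topic is simulated with lists, and the hash is a stand-in (String.hashCode) rather than murmur2. All names (PerKeyOrderingSketch, route, timeline) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: an in-memory stand-in for a partitioned topic, not a Kafka client.
final class PerKeyOrderingSketch {

    record Event(String key, String type) {}

    // Stand-in for the producer's hash; Kafka hashes the serialized key with murmur2 instead.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Append each event to the log of the partition its key maps to.
    static List<List<Event>> route(List<Event> sent, int numPartitions) {
        List<List<Event>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Event e : sent) {
            partitions.get(partitionFor(e.key(), numPartitions)).add(e);
        }
        return partitions;
    }

    // Event types for one key, in the order its partition log holds them.
    static List<String> timeline(List<List<Event>> partitions, String key) {
        List<String> types = new ArrayList<>();
        for (Event e : partitions.get(partitionFor(key, partitions.size()))) {
            if (e.key().equals(key)) {
                types.add(e.type());
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // Events for two orders, interleaved in send order.
        List<Event> sent = List.of(
            new Event("order-1001", "OrderCreated"),
            new Event("order-2002", "OrderCreated"),
            new Event("order-1001", "OrderPaid"),
            new Event("order-2002", "OrderPaid"),
            new Event("order-1001", "OrderShipped"));

        List<List<Event>> partitions = route(sent, 6);
        System.out.println(timeline(partitions, "order-1001")); // [OrderCreated, OrderPaid, OrderShipped]
        System.out.println(timeline(partitions, "order-2002")); // [OrderCreated, OrderPaid]
    }
}
```

Each order's timeline comes back in send order whichever partition it lands on, while nothing is promised about the relative order of the two orders.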

Producing keyed records

With Spring for Apache Kafka, the key is the second argument to send(topic, key, value). Use a record as the value DTO and let the serializer handle the bytes.

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

public record OrderEvent(String orderId, String type, long amountCents) {}

@Service
public class OrderProducer {

    private final KafkaTemplate<String, OrderEvent> kafkaTemplate;

    public OrderProducer(KafkaTemplate<String, OrderEvent> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publish(OrderEvent event) {
        // The orderId is the key -> all events for one order share a partition.
        // Assumes Spring for Apache Kafka 3.x, where send() returns a CompletableFuture.
        kafkaTemplate.send("orders", event.orderId(), event)
            .whenComplete((result, ex) -> {
                if (ex != null) {
                    // Surface failures instead of dropping them silently
                    System.err.printf("send failed for key=%s: %s%n", event.orderId(), ex);
                    return;
                }
                var meta = result.getRecordMetadata();
                System.out.printf("key=%s -> partition=%d offset=%d%n",
                    event.orderId(), meta.partition(), meta.offset());
            });
    }
}

Sending a few events for two orders on a 6-partition topic produces a stable assignment (the partition numbers and offsets shown are illustrative):

**Output:**

key=order-1001 -> partition=4 offset=12
key=order-1001 -> partition=4 offset=13
key=order-2002 -> partition=1 offset=58
key=order-1001 -> partition=4 offset=14
key=order-2002 -> partition=1 offset=59

Notice every order-1001 event lands on partition 4 and every order-2002 event lands on partition 1, preserving each order’s internal ordering independently.

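You can produce keyed records from the CLI without writing code, then read them back with the partition printed to confirm placement:

kafka-console-producer.sh \
  --bootstrap-server localhost:9092 \
  --topic orders \
  --property "parse.key=true" \
  --property "key.separator=:"
# then type:  order-1001:{"type":"OrderPaid"}

# second terminal, Kafka 2.7+: print each record's key and partition
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic orders --from-beginning \
  --property "print.key=true" --property "print.partition=true"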

Choosing a good key

The key should match the granularity of the ordering and aggregation you need. A few guidelines:

Goal	Good key choice	Why
Per-user ordering	`userId`	All of a user’s events stay ordered together
Per-order workflow	`orderId`	Lifecycle events processed in sequence
Even load, no ordering need	`null` (sticky)	Lets the producer spread batches across partitions automatically
Co-locate related data	shared business key	Enables local joins/state in stream processors

The cardinality of the key space matters: you want many more distinct keys than partitions, and roughly uniform traffic per key. Low-cardinality keys (e.g., a country code when most traffic comes from one country) concentrate load on a few partitions.
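
The effect of cardinality can be checked before a topic goes live by counting how a sample of keys would spread over partitions. The sketch below is an approximation under stated assumptions: the hash is a stand-in (String.hashCode, not Kafka's murmur2), the traffic mix is made up, and the names (KeySkewSketch, recordsPerPartition, partitionsUsed) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: estimates how a sample of keys spreads over partitions.
final class KeySkewSketch {

    // Stand-in for the producer's hash; Kafka hashes the serialized key with murmur2 instead.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // One counter per partition, incremented for every record whose key maps there.
    static int[] recordsPerPartition(List<String> keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    // Number of partitions that received at least one record.
    static long partitionsUsed(int[] counts) {
        return Arrays.stream(counts).filter(c -> c > 0).count();
    }

    public static void main(String[] args) {
        // Hypothetical traffic: three country codes, one of them dominant.
        List<String> byCountry = new ArrayList<>();
        byCountry.addAll(Collections.nCopies(9_000, "US"));
        byCountry.addAll(Collections.nCopies(500, "DE"));
        byCountry.addAll(Collections.nCopies(500, "FR"));

        // Hypothetical traffic: 10,000 distinct user IDs, one record each.
        List<String> byUser = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            byUser.add("user-" + i);
        }

        int[] country = recordsPerPartition(byCountry, 6);
        int[] user = recordsPerPartition(byUser, 6);
        System.out.println("by country: " + Arrays.toString(country)
            + " (partitions used: " + partitionsUsed(country) + ")");
        System.out.println("by user:    " + Arrays.toString(user)
            + " (partitions used: " + partitionsUsed(user) + ")");
    }
}
```

With three distinct keys, at most three of the six partitions can ever receive data, and the partition holding the dominant key takes at least 90% of the records regardless of which hash is used.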

Hot keys and partition skew

The biggest failure mode is skew: one key (or a handful) receives a disproportionate share of traffic, overloading a single partition. Because that partition is owned by one consumer in a group, that consumer becomes a bottleneck while others sit idle, and the problem is easy to miss if you only watch topic-level metrics.
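
A per-partition view makes this kind of skew visible. The sketch below computes lag per partition from two offset maps and reports how much of the total lag sits on the worst partition. The offsets are hypothetical sample values; in practice they would come from your monitoring tooling or from kafka-consumer-groups.sh --describe. The names (PartitionLagSketch, lagPerPartition, worstShare) are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: per-partition lag arithmetic on hypothetical offsets.
final class PartitionLagSketch {

    // Lag per partition = log end offset minus the group's committed offset.
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> endOffsets,
                                              Map<Integer, Long> committed) {
        Map<Integer, Long> lag = new TreeMap<>();
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            long done = committed.getOrDefault(e.getKey(), 0L);
            lag.put(e.getKey(), Math.max(0L, e.getValue() - done));
        }
        return lag;
    }

    // Share of the total lag held by the worst partition (0.0 when there is no lag).
    static double worstShare(Map<Integer, Long> lag) {
        long total = 0;
        long max = 0;
        for (long v : lag.values()) {
            total += v;
            max = Math.max(max, v);
        }
        return total == 0 ? 0.0 : (double) max / total;
    }

    public static void main(String[] args) {
        // Hypothetical snapshot: partition 2 carries a hot key.
        Map<Integer, Long> endOffsets = Map.of(0, 1_000L, 1, 1_000L, 2, 9_000L);
        Map<Integer, Long> committed = Map.of(0, 990L, 1, 995L, 2, 4_000L);

        Map<Integer, Long> lag = lagPerPartition(endOffsets, committed);
        System.out.println("lag per partition: " + lag);
        System.out.printf("worst partition holds %.1f%% of total lag%n", 100 * worstShare(lag));
    }
}
```

A topic-level total of about 5,000 records of lag looks like a mild backlog; the per-partition breakdown shows that almost all of it belongs to one partition and therefore to one consumer.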

Common causes and mitigations:

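A celebrity/whale entity (a viral user, a mega-merchant). Mitigate by salting the key: append a small suffix drawn from a fixed range, for example userId + "-" + (n % 4) where n is a random number or a per-record counter, to fan a hot entity across up to 4 partitions. You lose strict ordering for that one entity, and consumers must strip the suffix to recover the original ID.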
Adding partitions later. Increasing partition count changes hash % numPartitions, so existing keys re-map and historical ordering across the boundary is lost. Plan partition counts up front.
Skewed natural keys. If a business key is inherently lopsided, switch to a composite key or a custom partitioner.
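
The salting mitigation above can be sketched as a pair of helpers: one that builds the record key on the producer side and one that recovers the entity ID on the consumer side. This is one possible shape under stated assumptions, not a standard API: the bucket count, the '#' separator (which assumes entity IDs never contain '#'), the hot-entity set, and the names (SaltedKeySketch, keyFor, entityIdOf) are all hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative only: salts keys for known-hot entities and leaves all others untouched.
final class SaltedKeySketch {

    static final int SALT_BUCKETS = 4;   // a hot entity fans out over at most this many partitions
    static final char SEPARATOR = '#';   // assumption: never appears inside an entity ID

    // Producer side: hot entities get a random bucket suffix, everything else keeps its plain key.
    static String keyFor(String entityId, Set<String> hotEntities) {
        if (!hotEntities.contains(entityId)) {
            return entityId; // normal keys keep strict per-entity ordering
        }
        int bucket = ThreadLocalRandom.current().nextInt(SALT_BUCKETS);
        return entityId + SEPARATOR + bucket;
    }

    // Consumer side: strip the suffix (if any) to recover the entity ID.
    static String entityIdOf(String recordKey) {
        int idx = recordKey.lastIndexOf(SEPARATOR);
        return idx < 0 ? recordKey : recordKey.substring(0, idx);
    }

    public static void main(String[] args) {
        Set<String> hot = Set.of("merchant-1");
        for (int i = 0; i < 5; i++) {
            String key = keyFor("merchant-1", hot);
            System.out.println(key + " -> entity " + entityIdOf(key));
        }
        System.out.println(keyFor("merchant-77", hot)); // not hot: key is unchanged
    }
}
```

Salting only the entities known to be hot keeps strict ordering for everyone else. Two salted keys can still hash to the same partition, so the fan-out is at most SALT_BUCKETS partitions, not exactly that many.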

Never decrease partitions (Kafka forbids it) and treat increasing them as a one-way migration. Over-provision partitions modestly at creation time rather than reshaping a live topic.
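
The re-mapping caused by adding partitions follows from plain modulo arithmetic. The sketch below uses small integers as stand-ins for the non-negative hash of a key (real hashes are 31-bit values) and counts how many of them change partition when a topic grows from 6 to 8 partitions. The names (RepartitionSketch, countMoved) are hypothetical.

```java
// Illustrative only: shows how hash % numPartitions shifts when the partition count changes.
final class RepartitionSketch {

    static int partitionFor(int nonNegativeHash, int numPartitions) {
        return nonNegativeHash % numPartitions;
    }

    // Number of hashes whose partition differs between the old and new partition counts.
    static int countMoved(int[] hashes, int oldCount, int newCount) {
        int moved = 0;
        for (int h : hashes) {
            if (partitionFor(h, oldCount) != partitionFor(h, newCount)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Stand-in hash values 0..23, one per hypothetical key.
        int[] hashes = new int[24];
        for (int i = 0; i < hashes.length; i++) {
            hashes[i] = i;
        }

        System.out.println("hash 10: partition " + partitionFor(10, 6)
            + " with 6 partitions, partition " + partitionFor(10, 8) + " with 8");
        System.out.println(countMoved(hashes, 6, 8) + " of " + hashes.length
            + " keys change partition");
    }
}
```

With these 24 stand-in hashes, 18 change partition (only hashes 0 through 5 stay put). Records already written are not moved, so a re-mapped key has old events in one partition and new events in another.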

Best Practices

Key by the entity whose ordering you must preserve: usually an ID, never a low-cardinality field such as a country or status code.
Keep key cardinality well above partition count and aim for uniform traffic per key.
Leave the key null only when you genuinely do not need ordering; sticky partitioning then balances load efficiently.
Monitor per-partition produce rate and consumer lag, not just topic totals, to catch skew early.
Decide partition count deliberately at topic creation; changing it later re-hashes keys and breaks cross-boundary ordering.
Salt or shard hot keys when a single entity dominates, trading per-entity ordering for balanced throughput.
Ensure your key serializer is deterministic so identical keys always produce identical bytes.