
Partition Sizing & Count

Partition count is the single most consequential sizing decision you make for a Kafka topic. It sets the ceiling on consumer parallelism, drives per-broker resource consumption, and is awkward to change after the fact — you can add partitions but never remove them, and adding them breaks key-based ordering. Getting it right means balancing two competing pressures: enough partitions to hit your throughput and parallelism targets, but not so many that brokers drown in open files, replication overhead, and slow leader elections.

Why partition count matters

A partition is the unit of parallelism in Kafka. Within a consumer group, each partition is consumed by at most one consumer, so a topic with 6 partitions can be drained by at most 6 active consumers — the 7th sits idle. Partitions are also the unit of ordering (Kafka guarantees order only within a partition) and the unit of replication and leadership on the broker side.
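The idle-consumer effect can be illustrated with a small sketch. It assumes partitions are dealt out evenly across group members, which approximates what the common assignors do but is not Kafka's actual assignment code; the class and method names are hypothetical.

```java
import java.util.Arrays;

final class ConsumerSpread {

    /** Returns how many partitions each group member owns under an even spread. */
    static int[] partitionsPerConsumer(int partitions, int consumers) {
        int[] owned = new int[consumers];
        for (int p = 0; p < partitions; p++) {
            owned[p % consumers]++; // deal partitions out one at a time
        }
        return owned;
    }

    public static void main(String[] args) {
        // 6 partitions, 7 consumers: the last consumer owns nothing.
        System.out.println(Arrays.toString(partitionsPerConsumer(6, 7))); // [1, 1, 1, 1, 1, 1, 0]
        // 6 partitions, 4 consumers: everyone works, some own two partitions.
        System.out.println(Arrays.toString(partitionsPerConsumer(6, 4))); // [2, 2, 1, 1]
    }
}
```

A zero in the result marks a consumer that joined the group but received no partitions.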

Because consumer parallelism is capped by partition count, under-provisioning forces you to either accept lag or perform a disruptive repartition. Over-provisioning has costs that are just as real but less visible: every partition replica holds open file handles, takes a share of page cache, and is one more entry the controller must track and, on failover, re-elect a leader for.

Sizing from throughput

The throughput approach estimates how many partitions you need to move your target volume, given what a single partition can sustain. Measure (or benchmark) the per-partition throughput your producers and consumers actually achieve — it depends on message size, batching, compression, and downstream processing cost.

partitions_throughput = max( target_throughput / producer_per_partition ,
                             target_throughput / consumer_per_partition )

Always size against the slower side. A producer that writes 50 MB/s per partition is irrelevant if each consumer thread only processes 5 MB/s — the consumer dictates the count.

Sizing from parallelism

The parallelism approach starts from how many consumer instances you intend to run concurrently. If you plan to scale a consumer group out to 12 pods at peak, you need at least 12 partitions or the extra pods do no work.

partitions_parallelism = max_concurrent_consumers

Take the maximum of the two estimates, then add headroom so you can scale consumers later without repartitioning:

partitions = ceil( max(partitions_throughput, partitions_parallelism) * headroom )

A headroom factor of 1.5–2x is typical for topics whose load is expected to grow.
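The three formulas above can be combined into one helper. This is a minimal sketch, not a library API: the class name, method name, and parameter names are made up for illustration, and the inputs are assumed to come from your own benchmarks.

```java
final class PartitionSizing {

    /**
     * partitions = ceil( max(partitions_throughput, partitions_parallelism) * headroom )
     * Throughput values are in MB/s; the per-partition figures are measured, not assumed.
     */
    static int requiredPartitions(double targetMbPerSec,
                                  double producerMbPerSecPerPartition,
                                  double consumerMbPerSecPerPartition,
                                  int maxConcurrentConsumers,
                                  double headroom) {
        // Size against the slower side: the smaller per-partition rate yields the larger count.
        double byThroughput = Math.max(targetMbPerSec / producerMbPerSecPerPartition,
                                       targetMbPerSec / consumerMbPerSecPerPartition);
        double base = Math.max(byThroughput, maxConcurrentConsumers);
        return (int) Math.ceil(base * headroom);
    }

    public static void main(String[] args) {
        // 120 MB/s target, 50 MB/s per partition on the producer side,
        // 5 MB/s on the consumer side, 12 consumers at peak, 1.5x headroom.
        System.out.println(requiredPartitions(120, 50, 5, 12, 1.5)); // 36
    }
}
```

In this run the consumer side dominates (120 / 5 = 24 partitions), the 12 planned consumers fit within that, and headroom lifts the result to 36.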

Worked example

Suppose a topic must sustain a target of 600 MB/s. Benchmarks show producers manage 60 MB/s per partition and consumers process 30 MB/s per partition. You also expect to run up to 16 consumer instances at peak, and you want 1.5x headroom.

partitions_throughput = max(600 / 60, 600 / 30) = max(10, 20) = 20
partitions_parallelism = 16
base = max(20, 16) = 20
partitions = ceil(20 * 1.5) = 30

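So provision 30 partitions. Create the topic with kafka-topics.sh; because the tool connects through --bootstrap-server, the command is the same whether or not the cluster runs in KRaft mode: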

kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic orders \
  --partitions 30 \
  --replication-factor 3

Output:

Created topic orders.

Adding partitions later is online and non-blocking, but it changes the hash(key) % partitions mapping, so existing keys may move to a new partition and per-key ordering across the change is not preserved. Size with headroom rather than relying on frequent expansion.
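To see why expansion disturbs key placement, the sketch below counts how many keys land on a different partition when a topic grows from 30 to 45 partitions. It is an illustration only: String.hashCode() stands in for the hash, whereas Kafka's default partitioner hashes the serialized key bytes with murmur2, so the exact keys that move will differ on a real cluster. The key format and class name are hypothetical.

```java
final class KeyRemapDemo {

    /** hash(key) % partitions, using a stand-in hash (NOT Kafka's murmur2). */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Counts keys "order-0".."order-(keys-1)" whose partition changes after resizing. */
    static int countMovedKeys(int keys, int partitionsBefore, int partitionsAfter) {
        int moved = 0;
        for (int i = 0; i < keys; i++) {
            String key = "order-" + i;
            if (partitionFor(key, partitionsBefore) != partitionFor(key, partitionsAfter)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        int moved = countMovedKeys(1000, 30, 45);
        System.out.println(moved + " of 1000 keys changed partition");
    }
}
```

Any key that moves has its older records on one partition and its newer records on another, which is where per-key ordering is lost.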

The cost of too many partitions

More partitions is not free. Each partition replica consumes resources on every broker that hosts it, and the controller must track and recover all of them.

Cost | Effect of high partition count
Open file handles | Each segment is a file; thousands of partitions can exhaust OS limits
Memory / page cache | More partitions fragment page cache and producer/consumer buffers
Leader elections | On broker failure the controller re-elects every affected leader; latency scales with partition count
End-to-end latency | Replication of more partitions adds tail latency under load
Producer batching | Per-partition batches shrink, hurting compression and throughput

A practical planning ceiling is to keep partitions per broker in the low thousands and to estimate cluster-wide partitions as partitions_per_topic * replication_factor summed across topics.
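The cluster-wide estimate can be sketched as a short calculation. The topic list, broker count, and all names below are hypothetical, and the per-broker figure assumes replicas are spread evenly, which a real cluster only approximates.

```java
import java.util.List;

final class ClusterPartitionBudget {

    /** One topic's contribution: partitions * replication_factor replicas. */
    record Topic(String name, int partitions, int replicationFactor) {}

    /** Sums partitions_per_topic * replication_factor across topics. */
    static int totalReplicas(List<Topic> topics) {
        return topics.stream()
                .mapToInt(t -> t.partitions() * t.replicationFactor())
                .sum();
    }

    /** Average replicas hosted per broker, assuming an even spread. */
    static double replicasPerBroker(List<Topic> topics, int brokers) {
        return (double) totalReplicas(topics) / brokers;
    }

    public static void main(String[] args) {
        List<Topic> topics = List.of(
                new Topic("orders", 30, 3),
                new Topic("payments", 12, 3),
                new Topic("audit", 6, 2));
        System.out.println(totalReplicas(topics));        // 138
        System.out.println(replicasPerBroker(topics, 6)); // 23.0
    }
}
```

Compare the per-broker figure against your planning ceiling before approving a new topic or a partition increase.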

Producing with awareness of partitions

Throughput per partition improves when producers batch effectively. Tune batching so each partition’s batch fills before linger.ms expires:

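import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

Map<String, Object> props = new HashMap<>();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);   // 64 KB per partition batch
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);           // wait up to 10 ms for a batch to fill
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4"); // compression is applied per batch

// orderId and payload are String values supplied by the caller.
try (Producer<String, String> producer = new KafkaProducer<>(props)) {
    producer.send(new ProducerRecord<>("orders", orderId, payload)); // key = orderId keeps one order on one partition
}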

On the consuming side, with Spring for Apache Kafka, set the listener concurrency at or below the partition count; each listener thread is then assigned one or more partitions:

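import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;

@Configuration
public class KafkaListenerConfig {

    @Bean
    ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        var factory = new ConcurrentKafkaListenerContainerFactory<String, String>();
        factory.setConsumerFactory(consumerFactory);
        factory.setConcurrency(10); // listener threads per instance; keep <= partitions on the topic
        return factory;
    }
}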

Setting concurrency higher than the partition count just creates idle consumer threads. Cap it at the partition count, and remember that the cap applies to the whole consumer group: across multiple instances, total concurrency (instances x concurrency) should not exceed the partition count either, or the surplus threads receive no partitions.
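The group-wide cap can be checked with a small helper before deploying. This is a sketch with hypothetical names; it only does the arithmetic described above and does not query Kafka or Spring.

```java
final class ConcurrencyPlan {

    /**
     * Largest per-instance concurrency that keeps instances * concurrency <= partitions.
     * Returns at least 1; if instances exceed partitions, some instances will still idle.
     */
    static int maxConcurrencyPerInstance(int partitions, int instances) {
        return Math.max(1, partitions / instances);
    }

    /** Number of listener threads across the group that would receive no partition. */
    static int idleThreads(int partitions, int instances, int concurrencyPerInstance) {
        return Math.max(0, instances * concurrencyPerInstance - partitions);
    }

    public static void main(String[] args) {
        // 30 partitions shared by 4 instances: at most 7 threads each (28 total).
        System.out.println(maxConcurrencyPerInstance(30, 4)); // 7
        // Running 10 threads on each of the 4 instances leaves 10 threads idle.
        System.out.println(idleThreads(30, 4, 10));           // 10
    }
}
```

When the division is not exact, a few threads simply own one extra partition, which is usually preferable to running surplus threads.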

Best Practices

  • Size from both throughput and parallelism, take the maximum, and apply a 1.5–2x headroom factor for growth.
  • Benchmark per-partition throughput on both the producer and the consumer side rather than assuming a number, and size against the slower one (usually the consumer).
  • Prefer one well-sized topic to splitting load across many tiny topics — partitions, not topics, give you parallelism.
  • Keep partitions-per-broker in the low thousands; account for replication_factor when totalling cluster-wide partitions.
  • Avoid routine repartitioning: it disrupts key-based ordering and changes partition assignment, so plan headroom instead.
  • Keep total consumer-group concurrency (instances x threads) at or below the partition count so no consumer thread sits idle; spare partitions are simply shared among the running threads.
Last updated June 1, 2026