Static Group Membership

Every time a consumer leaves a group, Kafka triggers a rebalance to redistribute partitions among the survivors. During normal operations like rolling restarts or brief network blips this is wasteful: the consumer comes back almost immediately, yet the group has already paid the cost of stopping all members, reassigning partitions, and resuming. Static group membership, introduced in Kafka 2.3 (KIP-345), lets a consumer keep a stable identity across restarts so that short absences no longer cause a rebalance. In production this dramatically reduces consumption pauses and stabilises latency-sensitive pipelines.

How dynamic membership behaves

In the default dynamic model, the group coordinator assigns each consumer an ephemeral member ID when it joins. When that consumer shuts down cleanly, it sends a LeaveGroup request and the coordinator removes it and rebalances immediately. If it crashes or loses connectivity instead, the coordinator removes it once session.timeout.ms passes without a heartbeat. On restart the consumer is treated as a brand-new member with a fresh ID, triggering a second rebalance when it rejoins.

A rolling restart of a 10-instance deployment can therefore cause up to 20 rebalances: one when each instance stops and one when it rejoins. Under the eager rebalance protocol, each rebalance is a stop-the-world event for the group: every consumer revokes its partitions and waits until assignment completes before processing resumes.
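The arithmetic behind that figure can be sketched as below. This is a back-of-the-envelope model rather than a measurement: it assumes no two rebalances coalesce, and that a static member which overstays the session timeout costs one rebalance on expiry and one on rejoin. The class and method names are illustrative only.

```java
/** Back-of-the-envelope model of rebalance counts during a rolling restart. */
final class RollingRestartEstimate {

    /** Dynamic membership: one rebalance when an instance leaves, one when it rejoins. */
    static int worstCaseDynamicRebalances(int instances) {
        if (instances < 0) {
            throw new IllegalArgumentException("instances must be >= 0");
        }
        return 2 * instances;
    }

    /**
     * Static membership: only instances that stay away longer than the session
     * timeout cost anything (assumed here: one rebalance on expiry, one on rejoin).
     */
    static int worstCaseStaticRebalances(int instances, int instancesExceedingTimeout) {
        if (instancesExceedingTimeout < 0 || instancesExceedingTimeout > instances) {
            throw new IllegalArgumentException("instancesExceedingTimeout out of range");
        }
        return 2 * instancesExceedingTimeout;
    }
}
```

For the 10-instance deployment above, worstCaseDynamicRebalances(10) gives 20, while worstCaseStaticRebalances(10, 0) gives 0 when every instance returns in time.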

How static membership works

Static membership gives each consumer a durable identity through the group.instance.id configuration. The coordinator maps this stable ID to the member’s partition assignment and keeps that mapping for up to session.timeout.ms after the member disconnects.

Two behaviours change as a result:

Graceful shutdown does not send LeaveGroup. When a static member stops, the coordinator keeps its slot reserved instead of rebalancing.
Rejoining reuses the previous assignment. As long as the member returns before session.timeout.ms expires, it reclaims its exact previous partitions with no rebalance.

If the member stays away longer than the session timeout, the coordinator finally expires it and rebalances normally — so static membership defers, but does not eliminate, rebalancing for genuine failures.
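The reserve-then-expire behaviour described above can be illustrated with a toy model. This is not broker code: the class, its methods, and the rule of one rebalance per expired member are simplifying assumptions, and the model does not redistribute partitions among survivors. It only shows when a rebalance would be counted.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy model of slot reservation for static members; not the real coordinator. */
final class ToyStaticCoordinator {
    private final long sessionTimeoutMs;
    private final Map<String, List<Integer>> assignments = new HashMap<>();
    private final Map<String, Long> lastSeenMs = new HashMap<>();
    private int rebalanceCount = 0;

    ToyStaticCoordinator(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    /** Records a heartbeat for a member whose slot is still reserved. */
    void heartbeat(String instanceId, long nowMs) {
        if (assignments.containsKey(instanceId)) {
            lastSeenMs.put(instanceId, nowMs);
        }
    }

    /** Expires members silent for longer than the session timeout. */
    void expireSilentMembers(long nowMs) {
        Iterator<Map.Entry<String, Long>> it = lastSeenMs.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() > sessionTimeoutMs) {
                assignments.remove(entry.getKey());
                it.remove();
                rebalanceCount++; // a genuine absence still costs a rebalance
            }
        }
    }

    /**
     * Joins (or rejoins) under a group.instance.id. A reserved slot is handed
     * back unchanged; an unknown ID is treated as a new member and rebalances.
     */
    List<Integer> join(String instanceId, long nowMs, List<Integer> assignmentIfNew) {
        expireSilentMembers(nowMs);
        List<Integer> reserved = assignments.get(instanceId);
        if (reserved == null) {
            reserved = List.copyOf(assignmentIfNew);
            assignments.put(instanceId, reserved);
            rebalanceCount++;
        }
        lastSeenMs.put(instanceId, nowMs);
        return reserved;
    }

    int rebalanceCount() {
        return rebalanceCount;
    }
}
```

With a 45-second timeout, a member that joins at time 0 and rejoins 20 seconds later gets its original partitions back and the count stays at 1. If it then stays silent for more than 45 seconds before rejoining, the model counts one rebalance for the expiry and another for the fresh join.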

Configuration

Each consumer instance must have a unique, stable group.instance.id. If two live members use the same value, the coordinator hands the identity to the most recent joiner and the earlier member fails with a fatal FencedInstanceIdException. Pair the ID with a session.timeout.ms large enough to cover your restart window.

group.id=order-processors
group.instance.id=order-processor-3
session.timeout.ms=45000
heartbeat.interval.ms=10000
max.poll.interval.ms=300000

With the plain Java client:

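import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");
// Stable identity: unique per instance, identical across restarts.
props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "order-processor-3");
// How long the coordinator keeps this member's slot reserved while it is away.
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 45000);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
        StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
        StringDeserializer.class.getName());

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(List.of("orders"));
    while (true) {
        ConsumerRecords<String, String> records =
                consumer.poll(Duration.ofMillis(500));
        // process(...) stands in for your own record handler.
        records.forEach(r -> process(r.value()));
    }
}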

Spring Boot configuration

In Spring for Apache Kafka the property maps directly. The critical requirement is that each pod gets a distinct value — derive it from a stable ordinal such as a Kubernetes StatefulSet pod name.

spring:
  kafka:
    consumer:
      group-id: order-processors
      properties:
        group.instance.id: ${HOSTNAME}      # e.g. order-processor-3 in a StatefulSet
        session.timeout.ms: 45000
        heartbeat.interval.ms: 10000

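import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class OrderListener {

    // Assumes a value deserializer that can produce OrderEvent
    // (for example Spring's JsonDeserializer) is configured.
    @KafkaListener(topics = "orders", groupId = "order-processors")
    public void onMessage(OrderEvent event) {
        // process the event; assignment survives a quick pod restart
    }
}

public record OrderEvent(String orderId, String status, long amountCents) {}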
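Because the whole scheme depends on HOSTNAME really being a stable StatefulSet pod name, it can help to fail fast when it is not. The helper below is a minimal sketch of such a check. The class name, the exception type, and the regular expression are assumptions made for illustration; the pattern is a heuristic based on the name-dash-ordinal form that StatefulSet pods use, and a Deployment pod whose random suffix happens to be all digits would slip through.

```java
import java.util.regex.Pattern;

/** Illustrative guard for deriving group.instance.id from a pod name. */
final class InstanceIds {

    // StatefulSet pods are named <statefulset-name>-<ordinal>, e.g. order-processor-3.
    private static final Pattern STATEFULSET_POD =
            Pattern.compile("^[a-z0-9]([a-z0-9-]*[a-z0-9])?-\\d+$");

    /** Returns the pod name as the instance ID, or throws if it looks unstable. */
    static String fromPodName(String podName) {
        if (podName == null || podName.isBlank()) {
            throw new IllegalArgumentException("pod name is missing");
        }
        String trimmed = podName.trim();
        if (!STATEFULSET_POD.matcher(trimmed).matches()) {
            throw new IllegalArgumentException(
                    "pod name does not end in a StatefulSet ordinal: " + trimmed);
        }
        return trimmed;
    }
}
```

A typical call would be InstanceIds.fromPodName(System.getenv("HOSTNAME")) during startup, so that a pod launched from a Deployment, whose name ends in a random suffix, is rejected before it joins the group with an ID that changes on every restart.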

Tip: Set session.timeout.ms comfortably above your slowest restart. For container deployments, allow for image pull, JVM warm-up, and readiness checks. A common range is 30s–60s. The broker rejects values outside the range defined by group.min.session.timeout.ms and group.max.session.timeout.ms.
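One way to turn that tip into a number is to multiply the slowest observed restart by a headroom factor and keep the result inside the broker's allowed range. The sketch below does exactly that. The class name, the headroom factor, and the one-third heartbeat rule of thumb are illustrative choices, and the two broker bounds are the defaults in recent Kafka releases as far as I know, so check your own broker configuration.

```java
import java.time.Duration;

/** Illustrative sizing rule for session.timeout.ms; not an official formula. */
final class SessionTimeoutSizing {

    // Assumed broker defaults for group.min.session.timeout.ms and
    // group.max.session.timeout.ms; verify against your brokers.
    static final Duration DEFAULT_BROKER_MIN = Duration.ofSeconds(6);
    static final Duration DEFAULT_BROKER_MAX = Duration.ofMinutes(30);

    /** Slowest restart times headroom, clamped into the broker's allowed range. */
    static Duration recommend(Duration slowestRestart, double headroom,
                              Duration brokerMin, Duration brokerMax) {
        if (headroom < 1.0) {
            throw new IllegalArgumentException("headroom must be >= 1.0");
        }
        long targetMs = Math.round(slowestRestart.toMillis() * headroom);
        long clampedMs = Math.max(brokerMin.toMillis(),
                Math.min(brokerMax.toMillis(), targetMs));
        return Duration.ofMillis(clampedMs);
    }

    /** Rule of thumb: heartbeat interval no higher than a third of the timeout. */
    static Duration maxHeartbeatInterval(Duration sessionTimeout) {
        return sessionTimeout.dividedBy(3);
    }
}
```

For a slowest restart of 30 seconds and a headroom of 1.5, recommend returns 45 seconds, which matches the value used in the configuration above.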

Static vs dynamic membership

Aspect	Dynamic membership	Static membership
Identity	Ephemeral member ID per join	Stable `group.instance.id`
Graceful shutdown	Sends `LeaveGroup`, rebalances	No `LeaveGroup`, slot reserved
Rolling restart	Rebalance per stop and per start	No rebalance if back within session timeout
Transient disconnect	Rebalance after session timeout	Same assignment reclaimed
Duplicate ID	N/A	`FencedInstanceIdException`
Best for	Elastic, frequently scaling groups	Stable, long-lived consumers

Operational benefit

The payoff is fewer, shorter consumption stalls. Verify membership type using the consumer-groups CLI: a static member's CONSUMER-ID (its member ID) begins with its group.instance.id, and recent versions of the tool also print a GROUP-INSTANCE-ID column.

kafka-consumer-groups.sh --bootstrap-server broker:9092 \
  --describe --group order-processors --members --verbose

Example output (exact columns vary by Kafka version):

GROUP            CONSUMER-ID                        HOST         CLIENT-ID  #PARTITIONS  ASSIGNMENT
order-processors order-processor-3-a1b2c3...        /10.0.0.13   ...        4            orders(0,1,2,3)

Restart that pod and re-run the command: the assignment column stays identical, and group lag barely moves instead of spiking during a rebalance.

Best practices

Derive group.instance.id from a stable, unique source (StatefulSet pod name, ordinal index) — never a random UUID, which would defeat the purpose.
Size session.timeout.ms to exceed your realistic restart duration, but not so high that real failures go undetected for too long.
Keep heartbeat.interval.ms at no more than one third of the session timeout, so a healthy member can miss a heartbeat or two without being expired.
Combine static membership with cooperative rebalancing for the smoothest behaviour when a genuine rebalance is unavoidable.
Watch for FencedInstanceIdException in logs — it means two live instances share an ID; fix your ID assignment immediately.
Remember that scaling the group up or down still rebalances; static membership only suppresses rebalances for transient absences.
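Duplicate IDs are easier to catch before deployment than from a FencedInstanceIdException in production logs. The check below is a minimal sketch that could run in a deployment script or CI step against the list of IDs you plan to roll out. The class and method names are hypothetical, and how you collect the planned IDs is left open.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative pre-deployment audit of planned group.instance.id values. */
final class InstanceIdAudit {

    /** Returns every ID that appears more than once, in first-duplicate order. */
    static Set<String> duplicates(List<String> plannedIds) {
        Set<String> seen = new HashSet<>();
        Set<String> dupes = new LinkedHashSet<>();
        for (String id : plannedIds) {
            if (!seen.add(id)) {
                dupes.add(id); // second or later occurrence of this ID
            }
        }
        return dupes;
    }
}
```

An empty result means every planned instance has its own identity; anything else names the IDs that would fence each other at runtime.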