
Core Concepts & Glossary

Kafka has a small but dense vocabulary, and every term carries operational weight: misunderstanding what an offset or an ISR actually means is how teams end up with data loss, stuck consumers, or under-replicated partitions in production. This page is a quick-reference glossary of the core concepts you will meet on every other page of these docs. Each term gets one or two precise sentences; skim it now, then bookmark it for the moment a config key or a log line stops making sense.

Cluster and node terms

These describe the physical and logical layout of a Kafka deployment.

| Term | Definition |
| --- | --- |
| Cluster | A group of one or more cooperating brokers that share the same metadata and together store all topics. |
| Broker | A single Kafka server process that stores partition data on disk and serves produce/fetch requests. Each broker has a unique numeric node.id. |
| Controller | The broker (or dedicated node) that owns cluster metadata: topic creation, partition assignment, leader election. In modern Kafka, controllers form a Raft quorum (KRaft). |
| KRaft | Kafka Raft, the built-in consensus protocol that stores metadata in an internal __cluster_metadata log, replacing ZooKeeper. Production-ready for new clusters since Kafka 3.3 and the only mode from 4.0 onward. |
| ZooKeeper | The legacy external coordination service Kafka used to store metadata before KRaft. Removed entirely in Kafka 4.0; you should not deploy it for new clusters. |

The move from ZooKeeper to KRaft is the single biggest architectural shift in Kafka’s history. If you are starting fresh, run KRaft. Only touch ZooKeeper concepts when migrating or maintaining a pre-4.0 cluster.

Topic and storage terms

This group covers how records are organized and durably stored.

| Term | Definition |
| --- | --- |
| Topic | A named, append-only category of records (e.g. orders). Topics are logical; the physical unit is the partition. |
| Partition | An ordered, immutable, append-only log that is the unit of parallelism and ordering. A topic is split into one or more partitions, each identified as topic-N. |
| Offset | A monotonically increasing 64-bit integer that uniquely identifies a record’s position within a partition. Offsets are per-partition, never global. |
| Replica | A copy of a partition stored on a broker. The replication.factor controls how many copies exist; replicas are how Kafka survives broker failure. |
| Leader | The single replica of a partition that handles all writes and, by default, all reads at a given time. Producers always talk to the leader; consumers do too unless follower fetching is configured. |
| Follower | A replica that passively fetches records from the leader to stay in sync. An in-sync follower is promoted to leader if the current leader fails. |
| ISR | In-Sync Replicas: the set of replicas (the leader plus any followers) that have caught up with the leader within replica.lag.time.max.ms. Only ISR members are eligible to become leader unless unclean leader election is enabled. |
| High watermark | The highest offset that has been replicated to all ISR members. Consumers can only read up to the high watermark, which guarantees they never see unreplicated (potentially lost) data. |
| Retention | The policy that decides when old records are deleted, by time (retention.ms) or size (retention.bytes). |
| Log compaction | A retention mode (cleanup.policy=compact) that keeps only the latest record per key, ideal for changelog/state topics rather than time-bounded event streams. |
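The link between the ISR and the high watermark can be sketched in a few lines of plain Java. This is an illustrative model rather than broker code: the class name `HighWatermarkSketch`, the method name, and the sample offsets are assumptions, and it follows the definition in the table above (the highest offset that every ISR member has stored).

```java
import java.util.Map;

final class HighWatermarkSketch {

    /**
     * Returns the highest offset present on every ISR member,
     * or -1 if the ISR is empty (nothing is safely replicated).
     */
    static long highWatermark(Map<Integer, Long> lastOffsetByIsrBroker) {
        return lastOffsetByIsrBroker.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(-1L);
    }

    public static void main(String[] args) {
        // Hypothetical state: broker id -> highest offset that replica has stored.
        Map<Integer, Long> isr = Map.of(1, 120L, 2, 118L, 3, 115L);
        System.out.println("high watermark = " + highWatermark(isr)); // prints "high watermark = 115"
    }
}
```

In this model, records 116 to 120 exist on the leader but stay invisible to consumers until the slowest ISR member has fetched them.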

You can inspect a partition’s leader and ISR directly:

kafka-topics.sh --bootstrap-server localhost:9092 \
  --describe --topic orders

Output:

Topic: orders   PartitionCount: 3   ReplicationFactor: 3
  Topic: orders   Partition: 0   Leader: 1   Replicas: 1,2,3   Isr: 1,2,3
  Topic: orders   Partition: 1   Leader: 2   Replicas: 2,3,1   Isr: 2,3,1
  Topic: orders   Partition: 2   Leader: 3   Replicas: 3,1,2   Isr: 3,1

A partition whose Isr list is shorter than its Replicas list is under-replicated: a follower has fallen behind or its broker is down. In the output above, partition 2 is missing broker 2 from its ISR. Watch the UnderReplicatedPartitions broker metric as a top-tier alert.
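The under-replication rule is simple enough to express directly. The sketch below uses a hypothetical `PartitionState` record filled in by hand with the values from the sample output; it does not call any Kafka API, and the class and method names are assumptions made for illustration.

```java
import java.util.List;

final class UnderReplicatedSketch {

    /** Hypothetical holder for one partition row of the describe output. */
    record PartitionState(int partition, List<Integer> replicas, List<Integer> isr) {
        boolean isUnderReplicated() {
            // Fewer in-sync replicas than assigned replicas means at least one copy is behind or offline.
            return isr.size() < replicas.size();
        }
    }

    /** Returns the ids of all partitions whose ISR is smaller than their replica set. */
    static List<Integer> underReplicated(List<PartitionState> states) {
        return states.stream()
                .filter(PartitionState::isUnderReplicated)
                .map(PartitionState::partition)
                .toList();
    }

    public static void main(String[] args) {
        // Values copied from the sample describe output for the orders topic.
        List<PartitionState> orders = List.of(
                new PartitionState(0, List.of(1, 2, 3), List.of(1, 2, 3)),
                new PartitionState(1, List.of(2, 3, 1), List.of(2, 3, 1)),
                new PartitionState(2, List.of(3, 1, 2), List.of(3, 1)));
        System.out.println("under-replicated partitions: " + underReplicated(orders)); // prints [2]
    }
}
```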

Client terms

These describe the applications that read and write data, and how they coordinate.

| Term | Definition |
| --- | --- |
| Producer | A client that publishes records to topic partitions. Partition choice is driven by the record key (hash) or a custom partitioner. |
| Consumer | A client that subscribes to topics and reads records in offset order, committing its progress so it can resume after a restart. |
| Consumer group | A set of consumers sharing a group.id that cooperatively divide a topic’s partitions, with each partition consumed by exactly one member at a time. This is how you scale consumption horizontally. |
| Rebalance | The process of reassigning partitions across a consumer group’s members when a consumer joins, leaves, or fails. During a rebalance, consumption briefly pauses. |
| Lag | The difference between the latest offset in a partition and a consumer group’s committed offset, i.e. how far behind a consumer is. The key health metric for any consumer. |
| acks | The producer durability setting: acks=0 (fire-and-forget), acks=1 (leader only), acks=all (all ISR confirmed). Use acks=all whenever you cannot afford to lose records. |

A minimal producer configuration that sets acks=all from the table above and enables idempotence, so that retries do not write duplicate records:

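import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

var props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.ACKS_CONFIG, "all");          // wait for full ISR
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);  // retries do not create duplicates
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

// close() at the end of the try block flushes any records still buffered by send()
try (var producer = new KafkaProducer<String, String>(props)) {
    producer.send(new ProducerRecord<>("orders", "order-42", "{\"id\":42}"));
}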
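To make the key-to-partition mapping from the Producer entry concrete, here is a rough sketch of hash-based partition selection. It is deliberately simplified: the Java client's default partitioner hashes the serialized key bytes with murmur2, whereas this example substitutes `Arrays.hashCode` as a stand-in, so the partition numbers it produces will not match a real cluster. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class KeyPartitionSketch {

    /** Maps a record key to a partition index in [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // Stand-in hash (assumption): the real client uses murmur2 over the key bytes.
        int hash = Arrays.hashCode(keyBytes);
        // floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 3; // matches the PartitionCount of the sample orders topic
        for (String key : new String[] {"order-42", "order-43", "order-42"}) {
            System.out.println(key + " -> partition " + partitionFor(key, partitions));
        }
    }
}
```

The property that matters is determinism: the same key always lands on the same partition, which is what gives per-key ordering. Changing the partition count changes the mapping for existing keys.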

To check a consumer group’s lag from the CLI:

kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group order-processor
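The tool reports a CURRENT-OFFSET, LOG-END-OFFSET, and LAG column per partition. The arithmetic behind the LAG column can be sketched as follows; the class name, method name, and sample offsets are made up for illustration, and treating a partition with no committed offset as committed at 0 is an assumption of this sketch rather than the tool's behavior.

```java
import java.util.Map;
import java.util.TreeMap;

final class LagSketch {

    /** Computes lag per partition as log end offset minus committed offset, never below zero. */
    static Map<Integer, Long> lagByPartition(Map<Integer, Long> logEndOffsets,
                                             Map<Integer, Long> committedOffsets) {
        Map<Integer, Long> lag = new TreeMap<>(); // sorted by partition id for stable output
        for (var entry : logEndOffsets.entrySet()) {
            // Assumption: a partition with no committed offset is treated as committed at 0.
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            lag.put(entry.getKey(), Math.max(0L, entry.getValue() - committed));
        }
        return lag;
    }

    public static void main(String[] args) {
        // Hypothetical offsets for a three-partition topic.
        Map<Integer, Long> logEnd = Map.of(0, 1500L, 1, 1480L, 2, 1510L);
        Map<Integer, Long> committed = Map.of(0, 1500L, 1, 1400L, 2, 1010L);

        Map<Integer, Long> lag = lagByPartition(logEnd, committed);
        long total = lag.values().stream().mapToLong(Long::longValue).sum();

        System.out.println("lag per partition: " + lag); // prints {0=0, 1=80, 2=500}
        System.out.println("total lag: " + total);       // prints 580
    }
}
```

Per-partition lag is usually more informative than the total: a single stuck partition, like partition 2 here, can hide behind a healthy-looking average.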

Best practices

  • Treat offsets as per-partition: never assume ordering or numbering carries across partitions of a topic.
  • Run acks=all plus min.insync.replicas=2 with replication.factor=3 for any topic you cannot afford to lose.
  • Monitor consumer lag and under-replicated partitions as primary SLO signals; both surface problems before users notice.
  • Keep rebalances rare and fast by using cooperative-sticky assignment and tuning session.timeout.ms / max.poll.interval.ms to your real workload.
  • Choose retention vs. compaction deliberately: time/size retention for event streams, compaction for keyed state and changelogs.
  • Deploy new clusters on KRaft; do not introduce ZooKeeper into any greenfield system.
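
The acks=all plus min.insync.replicas=2 recommendation above is easier to reason about with a small model of the broker's accept/reject decision. The sketch below is a simplification under stated assumptions: `DurabilitySketch` and `acceptsWrite` are hypothetical names, and the only rule modeled is that a write with acks=all is rejected when the ISR is smaller than min.insync.replicas.

```java
final class DurabilitySketch {

    /**
     * Models whether a partition leader accepts a produce request.
     * With acks=all the ISR must contain at least minInsyncReplicas members;
     * with acks=0 or acks=1 the min.insync.replicas check does not apply.
     */
    static boolean acceptsWrite(String acks, int isrSize, int minInsyncReplicas) {
        if (isrSize < 1) {
            return false; // no in-sync replica left, so there is no leader to write to
        }
        if (!"all".equals(acks)) {
            return true;
        }
        return isrSize >= minInsyncReplicas;
    }

    public static void main(String[] args) {
        int minInsyncReplicas = 2; // with replication.factor=3
        for (int isrSize = 3; isrSize >= 1; isrSize--) {
            System.out.println("acks=all, ISR=" + isrSize + " -> accepted="
                    + acceptsWrite("all", isrSize, minInsyncReplicas));
        }
        // prints accepted=true for ISR=3 and ISR=2, accepted=false for ISR=1
    }
}
```

With these settings the topic tolerates one broker failure without losing availability, and a second failure blocks writes instead of silently reducing durability to a single copy.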
Last updated June 1, 2026