Skip to content
Apache Kafka kf producers 5 min read

Producer Overview

The Kafka producer is the client responsible for publishing records to topics, and understanding its internal pipeline is the difference between a system that quietly drops data and one that delivers millions of messages per second reliably. A send() call is not a synchronous network round-trip; it is an enqueue into a background machine that serializes, partitions, batches, and ships records on your behalf. Knowing how that machine works lets you reason about latency, throughput, ordering, and durability instead of guessing. This page walks the full path a record takes from your code to a broker acknowledgement.

The send pipeline

When you call producer.send(record), the record flows through a fixed sequence of stages before it ever touches the network:

ProducerRecord
   |
   v
[ Serializer ]      key + value -> byte[]
   |
   v
[ Partitioner ]    choose target partition
   |
   v
[ Record Accumulator ]   append to per-partition batch (in memory)
   |
   v
[ Sender (I/O) thread ]  drain ready batches, group by broker
   |
   v
[ Broker ]         append to log, replicate, return ack
   |
   v
[ Callback / Future ]    RecordMetadata or exception

The key insight is that two threads are involved. Your application thread performs serialization and partitioning, then appends the record to an in-memory buffer and returns immediately. A separate background sender thread (the I/O thread) drains that buffer and talks to the brokers. This decoupling is what makes the producer fast: your code is never blocked on broker latency unless the buffer fills up.

ProducerRecord

A ProducerRecord is the unit you publish. At minimum it carries a topic and a value, but it can also carry a key, an explicit partition, a timestamp, and headers.

FieldRequiredPurpose
topicYesDestination topic name
valueYes (may be null for tombstones)The payload
keyNoDrives default partitioning and log compaction
partitionNoForces a specific partition, bypassing the partitioner
timestampNoEvent time; defaults to send time
headersNoArbitrary metadata key/value pairs

The key matters more than it first appears: records with the same key are routed to the same partition by the default partitioner, which is how Kafka preserves per-key ordering.

Serialization and partitioning

Before a record can be batched it must become bytes. The producer applies the configured key.serializer and value.serializer, then hands the serialized record to the partitioner. If you set a partition explicitly it is honored; otherwise the default partitioner hashes the key (or, for null keys, uses a sticky round-robin strategy that fills one batch at a time for better batching efficiency).

The record accumulator and sender thread

The record accumulator is a pool of memory (sized by buffer.memory, default 32 MB) organized as a queue of batches per topic-partition. New records append to the current open batch for their partition. A batch becomes eligible to send when it is full (batch.size) or when its linger.ms timer expires. The sender thread continuously drains ready batches, groups them by destination broker into a single request, and writes them to the network. This batching is the single biggest throughput lever in Kafka: fewer, larger requests amortize network and broker overhead across many records.

The producer trades a small amount of latency for throughput. Raising linger.ms lets more records accumulate per batch, increasing throughput at the cost of a few extra milliseconds of delay.

A basic producer

The following plain-client example sends three records asynchronously with a callback and then closes cleanly.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class OrderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 1; i <= 3; i++) {
                var record = new ProducerRecord<>("orders", "order-" + i, "amount=" + (i * 100));
                producer.send(record, (metadata, exception) -> {
                    if (exception != null) {
                        System.err.println("Send failed: " + exception.getMessage());
                    } else {
                        System.out.printf("Sent to %s-%d @ offset %d%n",
                                metadata.topic(), metadata.partition(), metadata.offset());
                    }
                });
            }
            // try-with-resources calls close(), which flushes pending batches
        }
    }
}

Output:

Sent to orders-0 @ offset 14
Sent to orders-2 @ offset 9
Sent to orders-1 @ offset 22

Notice the callbacks may fire out of order across partitions, since each partition is acknowledged independently. The RecordMetadata returned in the callback tells you exactly where the record landed, which is invaluable for auditing and debugging.

Acknowledgements

The acks setting controls when a send is considered successful. It is the primary knob for the durability-versus-latency trade-off.

acksWaits forDurabilityLatency
0Nothing (fire and forget)Lowest — may lose dataLowest
1Leader write onlyModerate — loses data if leader fails before replicationLow
allLeader + all in-sync replicasHighestHigher

For most production systems acks=all combined with a replication factor of at least 3 and min.insync.replicas=2 is the durable baseline. Only the acknowledgement of the broker — not the send() return — confirms that a record is safely stored.

Calling send() and ignoring the returned Future or callback is the most common cause of silent data loss. Always handle the result so failed sends surface instead of vanishing.

Best Practices

  • Treat send() as asynchronous: always supply a callback (or check the Future) so failures are observed rather than swallowed.
  • Use a meaningful key when per-key ordering matters; records with the same key stay on the same partition.
  • Tune batch.size and linger.ms together to balance throughput against latency for your workload.
  • Use acks=all with replication.factor>=3 and min.insync.replicas=2 for durable, production-grade delivery.
  • Reuse a single KafkaProducer instance across threads — it is thread-safe and sharing it maximizes batching.
  • Always close the producer (or use try-with-resources) so buffered batches are flushed before the process exits.
Last updated June 1, 2026
Was this helpful?