String & Byte Serdes
Every record that travels through Kafka is ultimately a pair of byte arrays — one for the key and one for the value. The serializers and deserializers (collectively serdes) you configure decide how your Java objects become those bytes and back again. Kafka ships a handful of built-in serdes for the most common primitive types, and reaching for them first — before pulling in JSON, Avro, or Protobuf — keeps your pipelines lean, fast, and free of extra dependencies when the payload is genuinely simple.
The built-in serdes
The org.apache.kafka:kafka-clients library ships ready-made serializer and deserializer classes in the org.apache.kafka.common.serialization package. They are the same classes whether you use the raw producer/consumer API or Spring for Apache Kafka, since Spring simply delegates to them.
| Type | Serializer | Deserializer | Wire format |
|---|---|---|---|
| String | StringSerializer | StringDeserializer | UTF-8 bytes (encoding configurable) |
| byte[] | ByteArraySerializer | ByteArrayDeserializer | raw bytes (pass-through) |
| ByteBuffer | ByteBufferSerializer | ByteBufferDeserializer | raw bytes |
| Integer | IntegerSerializer | IntegerDeserializer | 4-byte big-endian |
| Long | LongSerializer | LongDeserializer | 8-byte big-endian |
| Short | ShortSerializer | ShortDeserializer | 2-byte big-endian |
| Float | FloatSerializer | FloatDeserializer | 4-byte IEEE 754 |
| Double | DoubleSerializer | DoubleDeserializer | 8-byte IEEE 754 |
| UUID | UUIDSerializer | UUIDDeserializer | UTF-8 text of toString() |
| Void | VoidSerializer | VoidDeserializer | always null |
For Kafka Streams the equivalents live in org.apache.kafka.common.serialization.Serdes, which bundles a matching serializer/deserializer pair: Serdes.String(), Serdes.Long(), Serdes.ByteArray(), Serdes.UUID(), and so on.
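To make the wire-format column concrete, the sketch below reproduces the fixed-width big-endian layouts using only the JDK's ByteBuffer. It does not call the Kafka classes: WireFormatSketch and its methods are hypothetical names, and the assumption that these bytes match what LongSerializer and IntegerSerializer emit rests on the formats listed in the table.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of kafka-clients: imitates the table's
// fixed-width wire formats with plain JDK calls.
final class WireFormatSketch {

    // 8-byte big-endian layout, as the table lists for Long.
    static byte[] encodeLong(long value) {
        // ByteBuffer writes big-endian unless told otherwise.
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // Reverse of encodeLong; rejects anything that is not exactly 8 bytes.
    static long decodeLong(byte[] bytes) {
        if (bytes.length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes, got " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).getLong();
    }

    // 4-byte big-endian layout, as the table lists for Integer.
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encodeLong(42L))); // [0, 0, 0, 0, 0, 0, 0, 42]
        System.out.println(Arrays.toString(encodeInt(258)));  // [0, 0, 1, 2]
        System.out.println(decodeLong(encodeLong(42L)));      // 42
    }
}
```

Because the width is fixed, a Long always costs 8 bytes on the wire regardless of its magnitude, whereas the same number sent as a String costs one byte per digit.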
Note that UUIDSerializer writes the textual form (36 characters), not the 16 raw bytes. If you need the compact binary representation you must serialize the two long halves yourself.
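A minimal sketch of that compact encoding follows, assuming the most significant half is written first and both halves are big-endian. CompactUuidCodec is a hypothetical name, not a class shipped with Kafka, and the byte order is a convention you would have to keep identical on both sides.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical codec, not part of kafka-clients: packs a UUID into 16 bytes
// instead of the 36-character text form.
final class CompactUuidCodec {

    // Most significant 8 bytes first, then the least significant 8 bytes.
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // Reverse of toBytes; rejects input that is not exactly 16 bytes.
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.println(id.toString().length());           // 36
        System.out.println(toBytes(id).length);               // 16
        System.out.println(fromBytes(toBytes(id)).equals(id)); // true
    }
}
```

One way to use this without writing a custom serde is to send the resulting byte[] through ByteArraySerializer and decode it on the consumer side after ByteArrayDeserializer.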
Configuring the raw producer and consumer
With the plain client API you set the serdes through the key.serializer / value.serializer (producer) and key.deserializer / value.deserializer (consumer) properties. The example below keys records by a String and sends a Long event count.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// The serializer classes must match the producer's generic types: String keys, Long values.
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
// try-with-resources closes the producer on exit.
try (KafkaProducer<String, Long> producer = new KafkaProducer<>(props)) {
    // Key "home", value 42L: the event count for that page.
    producer.send(new ProducerRecord<>("page-views", "home", 42L));
    // Block until the buffered record has actually been sent.
    producer.flush();
}
The matching consumer just swaps in the deserializers:
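import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.LongDeserializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.util.Properties;
// Separate consumer properties; the deserializers mirror the producer's serializers.
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class.getName());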
Configuring serdes in Spring Boot
In a Spring Boot 3.x application the same classes are wired declaratively through application.yml. Spring’s auto-configured KafkaTemplate and listener containers pick these up automatically.
spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: analytics
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
Producing a plain String value is then a one-liner:
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
@Service
public class EventPublisher {
    private final KafkaTemplate<String, String> kafkaTemplate;
    public EventPublisher(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }
    public void publish(String userId, String rawPayload) {
        kafkaTemplate.send("events", userId, rawPayload);
    }
}
Choosing the String encoding
StringSerializer and StringDeserializer default to UTF-8, but the encoding is configurable. On the producer, set key.serializer.encoding, value.serializer.encoding, or the generic serializer.encoding property; on the consumer, the counterparts are key.deserializer.encoding, value.deserializer.encoding, and deserializer.encoding. Both ends must agree: a mismatch silently corrupts non-ASCII characters rather than throwing.
value.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer.encoding=UTF-8
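The silent corruption is easy to reproduce without a broker. The sketch below assumes the String serdes boil down to String.getBytes(charset) on the way out and new String(bytes, charset) on the way in; EncodingMismatchSketch and roundTrip are hypothetical names used only for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical JDK-only sketch: imitates a String serializer and deserializer
// that were configured with different encodings.
final class EncodingMismatchSketch {

    // Encode with the writer's charset, decode with the reader's charset.
    static String roundTrip(String text, Charset writer, Charset reader) {
        byte[] wire = text.getBytes(writer);
        return new String(wire, reader);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e

        String matched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mismatched = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(matched.equals(original));    // true
        System.out.println(mismatched.equals(original)); // false, and no exception was thrown
        System.out.println(mismatched.length());         // 5: the accented letter became two characters
    }
}
```

Pure ASCII text survives this particular mismatch unchanged, which is why the problem often goes unnoticed until the first accented or non-Latin character arrives.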
You can confirm what is actually on a topic with the console consumer, which deserializes as String by default:
kafka-console-consumer.sh \
--bootstrap-server localhost:9092 \
--topic events \
--from-beginning
Output:
{"loginAttempt":true}
home
order-123
When raw String or bytes is enough
Built-in serdes shine when the payload is naturally a primitive or when you are deliberately treating the body as opaque:
- Keys. Partitioning keys are almost always a String, Long, or UUID. Using StringSerializer for the key while a richer serde handles the value is a very common and recommended pattern.
- Metrics and counters. A topic carrying Long timestamps or counts needs nothing more than LongSerializer.
- Pass-through pipelines. When a service routes, mirrors, or buffers messages without inspecting them, ByteArraySerializer avoids any deserialize/re-serialize round trip and preserves the bytes exactly.
- Pre-encoded payloads. If an upstream system already produced compressed or encrypted bytes, treat them as byte[] and stay out of the way.
You should reach for a structured format instead once the value has multiple fields, needs to evolve over time, or must be validated by independent producers and consumers. A free-form String of JSON works for prototypes, but it carries no schema, no compatibility guarantees, and no compile-time type safety — exactly the problems that JSON, Avro, and Protobuf serdes plus a Schema Registry exist to solve.
A String containing hand-built JSON is a frequent source of production incidents: one team renames a field, deserialization on the other side keeps “working,” and data is silently dropped. If your payload is structured, use a real schema rather than StringSerializer.
Best Practices
- Keep keys simple: prefer String, Long, or UUID serdes so partition assignment stays predictable and debuggable.
- Pin the String encoding (UTF-8) explicitly on both producer and consumer so the two sides cannot drift apart; a mismatch corrupts non-ASCII text without raising an error.
- Use ByteArraySerializer for true pass-through routing; do not deserialize and re-serialize bytes you never inspect.
- Remember UUIDSerializer emits 36-character text, not 16 bytes; measure the size impact before using it on high-volume keys.
- Reserve raw String/JSON values for prototypes; graduate structured payloads to Avro or Protobuf with a Schema Registry before they reach production.
- Always configure matching serializer and deserializer types end to end; a Long written with LongSerializer is unreadable by StringDeserializer.
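The last point can be illustrated without a broker. The sketch below is JDK-only and imitates the wire formats rather than calling kafka-clients, so SerdeMismatchSketch and its methods are hypothetical. It assumes the 8-byte big-endian layout for Long and UTF-8 for String, as described earlier.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical JDK-only sketch of a Long value being read back as a String.
final class SerdeMismatchSketch {

    // Bytes laid out as 8-byte big-endian, the format listed for Long.
    static byte[] longWire(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    // What a UTF-8 String reader makes of arbitrary bytes: no error, just text.
    static String readAsString(byte[] wire) {
        return new String(wire, StandardCharsets.UTF_8);
    }

    // A fixed-width reader can detect the reverse mistake only by length.
    static boolean hasLongWidth(byte[] wire) {
        return wire.length == Long.BYTES;
    }

    public static void main(String[] args) {
        String misread = readAsString(longWire(42L));
        System.out.println(misread.length());     // 8: seven NUL characters followed by '*'
        System.out.println(misread.equals("42")); // false

        byte[] textWire = "42".getBytes(StandardCharsets.UTF_8);
        System.out.println(hasLongWidth(textWire)); // false: 2 bytes, not 8
    }
}
```

Reading Long bytes as a String produces garbage without any exception, so the mistake surfaces only downstream. In the opposite direction a length check catches most cases, but a String that happens to be exactly 8 bytes long would pass it and decode to a meaningless number.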