
Protobuf Serialization

Protocol Buffers (Protobuf) is Google’s compact, strongly-typed binary serialization format, and it is a first-class citizen in the Confluent Kafka ecosystem alongside Avro. You define your message contract in a .proto file, generate fast Java classes from it, and let KafkaProtobufSerializer/KafkaProtobufDeserializer register and resolve schemas through a Schema Registry. For teams already using Protobuf for gRPC or internal RPC, reusing those definitions on Kafka means one contract spanning both the synchronous and event-driven sides of a system.

Defining a .proto schema

A Protobuf schema is a .proto file, conventionally kept under src/main/proto/. It declares a syntax, a package, and one or more message types with numbered fields. The field numbers — not the names — are what gets written on the wire, which is the foundation of Protobuf’s evolution model.

syntax = "proto3";

package com.devcraftly.events;

option java_package = "com.devcraftly.events";
option java_outer_classname = "OrderProto";
option java_multiple_files = true;

message OrderPlaced {
  string order_id = 1;
  string customer_id = 2;
  int64 amount_cents = 3;
  string currency = 4;
  int64 placed_at = 5;
}
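
To make the role of field numbers concrete, the sketch below hand-encodes a single varint field the way the Protobuf wire format lays it out: a key computed as (field_number << 3) | wire_type, followed by the value as a base-128 varint. The class and method names (FieldTagSketch, encodeVarintField, toHex) are invented for illustration; real applications should rely on the generated classes rather than encoding bytes by hand.

```java
import java.io.ByteArrayOutputStream;

public class FieldTagSketch {

    // Wire type 0 is the varint encoding used for scalars such as int64.
    static final int WIRE_TYPE_VARINT = 0;

    // Writes an unsigned base-128 varint: 7 bits per byte,
    // with the high bit set on every byte except the last.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes one varint field: the key carries the field number, never the field name.
    static byte[] encodeVarintField(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, ((long) fieldNumber << 3) | WIRE_TYPE_VARINT);
        writeVarint(out, value);
        return out.toByteArray();
    }

    // Renders bytes as space-separated lowercase hex for inspection.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // amount_cents is field 3 in OrderPlaced; 4999 is an example value.
        System.out.println(toHex(encodeVarintField(3, 4999L))); // prints 18 87 27
    }
}
```

The name amount_cents appears nowhere in those three bytes, which is why renaming a field is wire-compatible while renumbering it is not.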

Generating Java classes

Protobuf records are not hand-written; the protoc compiler turns each .proto into an immutable, builder-based Java class. With Maven the protobuf-maven-plugin binds to the generate-sources phase; with Gradle the com.google.protobuf plugin does the same.

<plugin>
  <groupId>org.xolstice.maven.plugins</groupId>
  <artifactId>protobuf-maven-plugin</artifactId>
  <version>0.6.1</version>
  <configuration>
    <protocArtifact>com.google.protobuf:protoc:3.25.5:exe:${os.detected.classifier}</protocArtifact>
  </configuration>
  <executions>
    <execution>
      <goals><goal>compile</goal></goals>
    </execution>
  </executions>
</plugin>

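This produces com.devcraftly.events.OrderPlaced, a class with typed getters and a fluent newBuilder(). The ${os.detected.classifier} property in the protocArtifact coordinate is not built into Maven; it is typically supplied by the os-maven-plugin build extension (kr.motd.maven:os-maven-plugin), so declare that extension in the build as well. Add the Confluent serializer dependency (io.confluent:kafka-protobuf-serializer) and the Confluent Maven repository so KafkaProtobufSerializer is on the classpath.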

The wire format

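Like Avro, Protobuf records on Kafka are not fully self-describing. The serializer resolves the schema against the registry (registering it first if auto-registration is enabled), receives an integer ID, and prefixes the payload with a 5-byte header: one magic byte followed by the 4-byte schema ID. Protobuf adds one extra detail: a list of message indexes, encoded as varints, that identifies which message type within the .proto was used, since a single file can declare several. When the message is the first one declared in the file, which is the common case, that list collapses to a single 0 byte.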

[ magic byte: 0x00 ][ 4-byte schema ID ][ message-index varint(s) ][ Protobuf payload ]
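
As a rough illustration of that layout, the sketch below reads the magic byte and the big-endian schema ID from the start of a raw record value. The class name WireHeaderSketch, the method readSchemaId, and the sample bytes are assumptions made for this example; it deliberately stops before the message indexes and the Protobuf payload, which KafkaProtobufDeserializer handles in practice.

```java
import java.nio.ByteBuffer;

public class WireHeaderSketch {

    static final byte MAGIC_BYTE = 0x0;

    // Returns the schema ID stored in the first five bytes of a framed record value.
    static int readSchemaId(byte[] recordValue) {
        if (recordValue == null || recordValue.length < 5) {
            throw new IllegalArgumentException("Record too short for a 5-byte header");
        }
        ByteBuffer buffer = ByteBuffer.wrap(recordValue); // ByteBuffer is big-endian by default
        byte magic = buffer.get();
        if (magic != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unexpected magic byte: " + magic);
        }
        return buffer.getInt();
    }

    public static void main(String[] args) {
        // Hypothetical record: magic byte, schema ID 42, one message-index byte, empty payload.
        byte[] sample = {0x00, 0x00, 0x00, 0x00, 0x2A, 0x00};
        System.out.println(readSchemaId(sample)); // prints 42
    }
}
```

A check like this can be handy when debugging a topic by hand: a value that does not start with 0x00 was most likely not written by a registry-aware serializer.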

Configuring producer and consumer

Wire the Confluent serializers and point them at the registry. On the consumer, you must tell the deserializer which generated class to materialize via specific.protobuf.value.type; otherwise it returns a DynamicMessage.

spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: io.confluent.kafka.serializers.protobuf.KafkaProtobufSerializer
      properties:
        schema.registry.url: http://localhost:8081
        auto.register.schemas: false
    consumer:
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: io.confluent.kafka.serializers.protobuf.KafkaProtobufDeserializer
      properties:
        schema.registry.url: http://localhost:8081
        specific.protobuf.value.type: com.devcraftly.events.OrderPlaced
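
Outside Spring Boot, the same consumer settings can be passed as a plain property map. The sketch below is one possible shape, not a definitive setup: the class and method names are hypothetical, the keys are the standard Kafka and Confluent property names, and the class names are given as strings so the snippet compiles without the Kafka libraries on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

public class ConsumerConfigSketch {

    // Builds the consumer properties that mirror the YAML above.
    static Map<String, Object> protobufConsumerProps(String bootstrapServers,
                                                     String registryUrl,
                                                     String valueType) {
        Map<String, Object> props = new HashMap<>();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "io.confluent.kafka.serializers.protobuf.KafkaProtobufDeserializer");
        props.put("schema.registry.url", registryUrl);
        // Without this entry the deserializer hands back a DynamicMessage.
        props.put("specific.protobuf.value.type", valueType);
        return props;
    }

    public static void main(String[] args) {
        Map<String, Object> props = protobufConsumerProps(
                "localhost:9092", "http://localhost:8081",
                "com.devcraftly.events.OrderPlaced");
        System.out.println(props.get("specific.protobuf.value.type"));
    }
}
```

A map like this could then be handed to a KafkaConsumer constructor, along with a group.id, once the Kafka client and Confluent serializer dependencies are available.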

Producing and consuming

With the generated class in hand, the application code is straightforward. Build the message, send it, and Spring’s KafkaTemplate handles serialization through the configured serializer.

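import com.devcraftly.events.OrderPlaced;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service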
public class OrderProducer {

    private final KafkaTemplate<String, OrderPlaced> kafkaTemplate;

    public OrderProducer(KafkaTemplate<String, OrderPlaced> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publish(String orderId, String customerId, long amountCents) {
        OrderPlaced event = OrderPlaced.newBuilder()
                .setOrderId(orderId)
                .setCustomerId(customerId)
                .setAmountCents(amountCents)
                .setCurrency("USD")
                .setPlacedAt(System.currentTimeMillis())
                .build();
        kafkaTemplate.send("orders", orderId, event);
    }
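}

import com.devcraftly.events.OrderPlaced;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component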
public class OrderConsumer {

    @KafkaListener(topics = "orders", groupId = "billing")
    public void onOrder(OrderPlaced event) {
        System.out.printf("Order %s for customer %s: %d %s%n",
                event.getOrderId(), event.getCustomerId(),
                event.getAmountCents(), event.getCurrency());
    }
}

Output:

Order ord-9341 for customer cust-77: 4999 USD

Protobuf vs Avro

Both formats are compact, schema-backed, and registry-integrated. The choice usually comes down to your existing toolchain and how you think about evolution.

Aspect | Protobuf | Avro
Schema language | .proto IDL | JSON (.avsc)
Identity of fields | Field numbers | Field names
Code generation | Always (protoc) | Generated or GenericRecord
Cross-protocol reuse | Shared with gRPC | Kafka-centric
Default handling | Implicit defaults (no null) | Explicit defaults + unions
Evolution model | Add fields with new numbers | Add fields with defaults

In proto3 there is no true null for scalar fields: an unset int64 reads back as 0 and an unset string as "". If you must distinguish “absent” from “zero”, use a wrapper type such as google.protobuf.Int64Value or the optional keyword (available without an experimental flag since Protobuf 3.15) rather than relying on the zero value.

Inspecting the registered schema

The default subject for a topic’s value is <topic>-value. Query the registry to confirm that the schema was registered with type PROTOBUF.

curl -s http://localhost:8081/subjects/orders-value/versions/latest | jq '.schemaType, .version'

Output:

"PROTOBUF"
1

Best Practices

  • Keep .proto files in version control as the contract source of truth and generate classes at build time; never edit generated code.
  • Never reuse or renumber an existing field number: assign a new number for every added field and mark retired ones as reserved to keep the wire format compatible.
  • Set auto.register.schemas: false in production and register schemas through CI so incompatible changes fail before deployment.
  • Set specific.protobuf.value.type on consumers to get typed messages instead of an untyped DynamicMessage.
  • Treat proto3 zero values deliberately: use wrapper types or optional when “unset” and “zero” must differ.
  • Reuse the same .proto definitions across gRPC and Kafka where it makes sense, but pin a per-subject compatibility mode (usually BACKWARD) so the registry enforces the rules you expect.
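
As one way to pin a per-subject compatibility mode from a CI job, the sketch below builds a PUT request against the Schema Registry config endpoint for a subject, using the JDK's java.net.http client (Java 11 or later). The class and method names are hypothetical, the URL and subject are taken from the examples above, and the request is only constructed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class CompatibilityRequestSketch {

    // Builds a request that sets the compatibility level for a single subject.
    static HttpRequest setCompatibility(String registryUrl, String subject, String level) {
        String body = "{\"compatibility\": \"" + level + "\"}";
        return HttpRequest.newBuilder(URI.create(registryUrl + "/config/" + subject))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request =
                setCompatibility("http://localhost:8081", "orders-value", "BACKWARD");
        // Prints: PUT http://localhost:8081/config/orders-value
        System.out.println(request.method() + " " + request.uri());
        // To apply it, send the request with HttpClient.newHttpClient().send(...)
        // and check for a 200 response before relying on the new mode.
    }
}
```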
Last updated June 1, 2026