Transactional Outbox Pattern

When a service must both persist state and publish an event about it, you face a subtle but dangerous trap: a database commit and a Kafka publish are two independent operations that cannot share a single atomic transaction. If one succeeds and the other fails, your system drifts into an inconsistent state — an order saved with no OrderPlaced event, or an event broadcast for a row that was rolled back. The transactional outbox pattern eliminates this by writing the event into the same database transaction as the business change, then relaying it to Kafka asynchronously. It is the canonical fix for the dual-write problem and a foundation of reliable event-driven systems.

The dual-write problem

Consider the naive approach: inside a service method you save an entity and then call kafkaTemplate.send(...). There is no transaction spanning both the relational database and the broker, so any of these failure modes corrupts your data:

The DB commits, then the app crashes before the Kafka send — the event is lost.
The Kafka send succeeds, then the DB transaction rolls back — you published a phantom event for state that never persisted.
Kafka is briefly unavailable, so you retry the send inside the transaction, holding DB locks and bloating latency.

You cannot solve this with ordering tricks. Sending before commit risks phantom events; sending after commit risks lost events. Two-phase commit (XA) across a database and Kafka is poorly supported, slow, and operationally fragile. The outbox pattern sidesteps the impossibility entirely by making the event part of the same local transaction as the data.
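The two orderings can be illustrated with a toy in-memory simulation. This is only a sketch: the class, its fields, and the boolean failure switches are hypothetical stand-ins for a real database and broker, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the dual-write problem; both "stores" are plain lists. */
final class DualWriteDemo {

    final List<String> database = new ArrayList<>(); // committed rows
    final List<String> broker = new ArrayList<>();   // published events

    /** Commit first, then publish: a crash in between loses the event. */
    void commitThenSend(String orderId, boolean crashAfterCommit) {
        database.add(orderId);                        // DB commit succeeds
        if (crashAfterCommit) {
            return;                                   // process dies before the send
        }
        broker.add("OrderPlaced:" + orderId);
    }

    /** Publish first, then commit: a rollback leaves a phantom event. */
    void sendThenCommit(String orderId, boolean rollback) {
        broker.add("OrderPlaced:" + orderId);         // send succeeds
        if (rollback) {
            return;                                   // transaction rolls back
        }
        database.add(orderId);
    }
}
```

Calling commitThenSend("o1", true) leaves a row with no event, and sendThenCommit("o2", true) leaves an event with no row. Neither ordering closes both gaps.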

How the outbox works

You add an outbox table to the service’s own database. Within the single business transaction, you insert one row per event alongside your domain writes. Because both writes hit the same database, they commit or roll back together — atomicity is guaranteed by the database, not by a distributed protocol. A separate relay process then reads new outbox rows and publishes them to Kafka, marking them as sent (or relying on the log position) once delivered.

┌─────────────────────────────────────────────┐
│  Single DB transaction                       │
│  ┌──────────────┐      ┌──────────────────┐  │
│  │ INSERT order │      │ INSERT outbox row│  │
│  └──────────────┘      └──────────────────┘  │
└─────────────────────────────────────────────┘
                  │ commit
                  ▼
        ┌───────────────────┐
        │   outbox table    │
        └─────────┬─────────┘
                  │  CDC (Debezium) or poller
                  ▼
        ┌───────────────────┐
        │       Kafka       │ ──► consumers
        └───────────────────┘

The relay delivers at-least-once. Consumers must therefore be idempotent (dedupe on the event id), which is standard practice in Kafka systems anyway.
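A minimal sketch of consumer-side deduplication follows. It assumes the event id reaches the consumer (for example in a message header or inside the payload). The IdempotentHandler class and its in-memory set are hypothetical; a production consumer would more likely record processed ids in its own database, in the same transaction as the handler's side effects.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.function.Consumer;

/** Wraps an event handler so that redelivered events are applied only once. */
final class IdempotentHandler {

    // Hypothetical in-memory store of processed event ids (lost on restart).
    private final Set<UUID> processed = new HashSet<>();
    private final Consumer<String> delegate;

    IdempotentHandler(Consumer<String> delegate) {
        this.delegate = delegate;
    }

    /** Returns true if the payload was handled, false if it was a duplicate. */
    synchronized boolean handle(UUID eventId, String payload) {
        if (processed.contains(eventId)) {
            return false;            // already handled: skip the redelivery
        }
        delegate.accept(payload);    // if this throws, the id stays unrecorded
        processed.add(eventId);
        return true;
    }
}
```

Recording the id only after the delegate returns means a failed handler run is retried on redelivery rather than silently skipped.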

The outbox schema

Keep the table generic so any aggregate can use it. The aggregate_type typically maps to a topic and aggregate_id to the partition key.

CREATE TABLE outbox (
    id             UUID         PRIMARY KEY,
    aggregate_type VARCHAR(255) NOT NULL,
    aggregate_id   VARCHAR(255) NOT NULL,
    event_type     VARCHAR(255) NOT NULL,
    payload        JSONB        NOT NULL,
    created_at     TIMESTAMPTZ  NOT NULL DEFAULT now()
);

CREATE INDEX idx_outbox_created_at ON outbox (created_at);

Writing to the outbox in Spring Boot

The key discipline is that the outbox insert lives inside the @Transactional boundary of the business operation — never in a separate transaction.

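import java.math.BigDecimal;
import java.util.UUID;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

public record OrderPlaced(String orderId, String customerId, BigDecimal total) {}

@Service
public class OrderService {

    private final OrderRepository orders;
    private final OutboxRepository outbox;
    private final ObjectMapper mapper;

    public OrderService(OrderRepository orders, OutboxRepository outbox, ObjectMapper mapper) {
        this.orders = orders;
        this.outbox = outbox;
        this.mapper = mapper;
    }

    // JsonProcessingException is checked, and Spring rolls back only on unchecked
    // exceptions by default. Without rollbackFor, a serialization failure would
    // commit the order without its outbox row.
    @Transactional(rollbackFor = Exception.class)
    public Order placeOrder(NewOrderRequest req) throws JsonProcessingException {
        Order order = orders.save(Order.from(req));

        OrderPlaced event = new OrderPlaced(
                order.getId(), order.getCustomerId(), order.getTotal());

        outbox.save(new OutboxEntry(
                UUID.randomUUID(),       // event id, used by consumers to dedupe
                "order",                 // aggregate_type -> topic
                order.getId(),           // aggregate_id   -> partition key
                "OrderPlaced",
                mapper.writeValueAsString(event)));

        return order; // both rows commit atomically when this method returns
    }
}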

Relaying outbox rows to Kafka

Option A: CDC with Debezium (recommended)

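Debezium tails the database’s write-ahead log and emits a Kafka event for every committed outbox insert. There is no polling, latency is typically sub-second, and the relay issues no queries against the outbox table. Its purpose-built outbox event router SMT unwraps the row into a clean event keyed by aggregate_id and routed by aggregate_type. The router’s default column names are aggregatetype and aggregateid, so the underscore names used in this schema must be mapped through the transform’s options.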

{
  "name": "order-outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres",
    "database.dbname": "orders",
    "table.include.list": "public.outbox",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
    "transforms.outbox.route.by.field": "aggregate_type",
    "transforms.outbox.table.field.event.key": "aggregate_id",
    "transforms.outbox.route.topic.replacement": "${routedByValue}.events"
  }
}

Because CDC reads the log, you never need an UPDATE to mark rows as sent — the log offset is the progress marker. A periodic job can prune old rows.
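The pruning job mentioned above could look roughly like this with plain JDBC. It is a sketch under assumptions: the OutboxPruner class, the retention parameter, and the idea of pruning by created_at are illustrative choices, and the Connection is expected to come from the application's own data source.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;

/** Deletes outbox rows older than a retention window (hypothetical helper). */
final class OutboxPruner {

    static final String DELETE_SQL = "DELETE FROM outbox WHERE created_at < ?";

    /** Oldest created_at value that is kept for the given retention window. */
    static Instant cutoff(Instant now, Duration retention) {
        return now.minus(retention);
    }

    /** Runs the delete and returns the number of rows removed. */
    static int prune(Connection conn, Duration retention) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(DELETE_SQL)) {
            ps.setTimestamp(1, Timestamp.from(cutoff(Instant.now(), retention)));
            return ps.executeUpdate();
        }
    }
}
```

The delete uses the created_at index from the schema, so it does not need to scan the whole table.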

Option B: Polling publisher

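If CDC is not an option, a scheduled poller reads unsent rows, publishes them, and deletes (or flags) them. Use FOR UPDATE SKIP LOCKED so multiple instances can poll concurrently without claiming the same rows. Delivery is still at-least-once: if an instance crashes after the send but before its transaction commits, those rows are published again on the next poll. Concurrent pollers can also publish their batches out of order, so run a single poller if strict per-aggregate ordering matters.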

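import java.util.concurrent.TimeUnit;

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class OutboxRelay {

    private final OutboxRepository outbox;
    private final KafkaTemplate<String, String> kafka;

    public OutboxRelay(OutboxRepository outbox, KafkaTemplate<String, String> kafka) {
        this.outbox = outbox;
        this.kafka = kafka;
    }

    // rollbackFor: get() throws checked exceptions, which do not trigger a
    // rollback by default.
    @Scheduled(fixedDelay = 500)
    @Transactional(rollbackFor = Exception.class)
    public void relay() throws Exception {
        for (OutboxEntry e : outbox.findBatchForUpdate(100)) {
            // send() is asynchronous, so wait for the broker acknowledgement
            // before deleting the row; otherwise a failed send loses the event.
            // The 10-second timeout is an arbitrary example value.
            kafka.send(e.getAggregateType() + ".events",
                       e.getAggregateId(), e.getPayload())
                 .get(10, TimeUnit.SECONDS);
            outbox.delete(e);
        }
    }
}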

Assuming the relay also logs each publish (the class above does not), the output would look like:

Published OrderPlaced id=7f3a... key=order-1042 topic=order.events
Published OrderShipped id=9b21... key=order-1042 topic=order.events

Polling trades latency and DB load for simplicity. For high throughput or low-latency requirements, prefer CDC — the log-based relay scales far better and never competes with your application’s queries.
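The relay above calls outbox.findBatchForUpdate(100) without showing the query behind it. One possible shape, sketched here with plain JDBC and PostgreSQL syntax, is below. The OutboxBatchQuery class and its Row record are hypothetical and stand in for whatever repository and entity the application actually uses.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

final class OutboxBatchQuery {

    /** Minimal row holder for this sketch; not the article's OutboxEntry class. */
    record Row(UUID id, String aggregateType, String aggregateId, String payload) {}

    // SKIP LOCKED makes concurrent pollers pass over rows that another
    // transaction has already locked instead of waiting for them.
    static final String SELECT_BATCH = """
            SELECT id, aggregate_type, aggregate_id, payload
            FROM outbox
            ORDER BY created_at
            LIMIT ?
            FOR UPDATE SKIP LOCKED
            """;

    /** Must run inside an open transaction so the row locks last until commit. */
    static List<Row> findBatchForUpdate(Connection conn, int limit) throws SQLException {
        List<Row> rows = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(SELECT_BATCH)) {
            ps.setInt(1, limit);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(new Row(
                            rs.getObject("id", UUID.class),
                            rs.getString("aggregate_type"),
                            rs.getString("aggregate_id"),
                            rs.getString("payload")));
                }
            }
        }
        return rows;
    }
}
```

Ordering by created_at keeps each batch in insertion order, and the index on that column supports the sort.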

Why it beats chained transactions

A tempting alternative is to send to Kafka in an afterCommit callback (a “chained” or best-effort publish). This still loses events if the app crashes between commit and send — exactly the failure the outbox prevents. Another anti-pattern, XA/two-phase commit, makes Kafka and the DB joint participants in a global transaction; it serializes throughput, requires a transaction coordinator, and recovers poorly. The outbox needs only ordinary local transactions plus an idempotent relay.

| Approach | Atomic? | Latency | Operational cost |
|---|---|---|---|
| Send inside `@Transactional` | No (phantom events) | Low | Low |
| Send in `afterCommit` | No (lost on crash) | Low | Low |
| XA / two-phase commit | Yes | High | High |
| Outbox + polling | Yes | Medium | Low |
| Outbox + CDC | Yes | Low | Medium |

Best Practices

Always insert the outbox row inside the same transaction as the business write — a separate transaction reintroduces the dual-write problem.
Make consumers idempotent by deduplicating on the event id; the relay is at-least-once.
Prefer CDC/Debezium over polling for latency and to avoid loading the operational database with relay queries.
Store a stable event id and use aggregate_id as the Kafka key to preserve per-aggregate ordering within a partition.
For pollers, use SELECT ... FOR UPDATE SKIP LOCKED so multiple instances scale horizontally without claiming the same rows; a crash between send and commit can still cause a duplicate publish.
Prune delivered rows on a schedule (or via a TTL job) to keep the outbox table small and indexes fast.
Version your event payloads from day one so schema evolution does not break downstream consumers.