MirrorMaker 2

MirrorMaker 2 (MM2) is Kafka’s built-in tool for replicating data between clusters, whether across data centers, regions, or cloud accounts. Unlike the original MirrorMaker, MM2 is built on the Kafka Connect framework, so it inherits Connect’s scalability, fault tolerance, and offset tracking. Beyond copying messages, MM2 synchronizes topic configurations and ACLs and, critically, translates consumer offsets so applications can fail over to a remote cluster and resume roughly where they left off. This makes it the foundation for disaster recovery and geo-distributed architectures.

How MirrorMaker 2 works

MM2 runs as a set of Connect connectors that move data and metadata between a source cluster and a target cluster:

Connector	Responsibility
`MirrorSourceConnector`	Copies topic records and topic configs from source to target
`MirrorCheckpointConnector`	Emits consumer-group offset checkpoints and translates them for the target
`MirrorHeartbeatConnector`	Produces periodic heartbeats to measure replication health and lag

Each connector instance belongs to a replication flow, written as source->target. You can run many flows at once (primary->backup, backup->primary, us-east->eu-west), and MM2 manages them independently within the same cluster of worker processes.

Remote topic naming

To avoid loops and name collisions, MM2 prefixes replicated topics with the source cluster alias by default. A topic orders on cluster primary, replicated to backup, appears on backup as primary.orders. This DefaultReplicationPolicy makes the data lineage obvious and lets bidirectional replication coexist without one flow re-copying the other’s output.

primary cluster                 backup cluster
---------------                 --------------
orders            --MM2-->      primary.orders
payments          --MM2-->      primary.payments

If you need flat topic names (so orders stays orders on the target), use the IdentityReplicationPolicy. It is convenient for one-way active-passive setups but is unsafe for active-active, because two flows can form an infinite replication loop.
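The naming rule and the loop check it enables can be sketched in a few lines of Python. This is a simplified illustration, not MM2's implementation: the function names are hypothetical, and the `known_aliases` check is an added assumption so that a local topic whose name happens to contain a dot is not mistaken for a replicated one.

```python
from typing import Optional

SEPARATOR = "."  # separator between the source alias and the topic name


def remote_topic(source_alias: str, topic: str) -> str:
    """Name a topic receives on the target under prefix-style naming."""
    return f"{source_alias}{SEPARATOR}{topic}"


def topic_source(topic: str, known_aliases: set) -> Optional[str]:
    """Return the alias a topic was replicated from, or None if it looks local."""
    prefix, sep, _ = topic.partition(SEPARATOR)
    return prefix if sep and prefix in known_aliases else None


def should_replicate(topic: str, target_alias: str, known_aliases: set) -> bool:
    """Skip topics that originated on the target; copying them back closes a loop."""
    return topic_source(topic, known_aliases) != target_alias


aliases = {"primary", "backup"}
print(remote_topic("primary", "orders"))                      # primary.orders
print(should_replicate("orders", "backup", aliases))          # True
print(should_replicate("backup.orders", "backup", aliases))   # False
```

With flat names there is no prefix to inspect, so a check like `should_replicate` has nothing to go on. That is why bidirectional flows with identity naming can keep re-copying each other's output.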

Configuration example

MM2 is configured with a single properties file passed to connect-mirror-maker.sh. Define the clusters, the bootstrap servers for each, and which flows are enabled.

# mm2.properties — replicate primary -> backup

clusters = primary, backup

primary.bootstrap.servers = primary-kafka:9092
backup.bootstrap.servers  = backup-kafka:9092

# enable the replication flow primary -> backup
primary->backup.enabled = true
primary->backup.topics  = orders|payments|inventory

# keep the reverse flow off for active-passive
backup->primary.enabled = false

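# sync topic configs and consumer offsets
sync.topic.configs.enabled        = true
# write translated offsets into the target's __consumer_offsets (needs checkpoints)
sync.group.offsets.enabled        = true
emit.checkpoints.enabled          = true
# how often to check the source for new topics and partitions
refresh.topics.interval.seconds   = 30
# replication factor for remote topics created on the target
replication.factor                = 3
# replication factor for MM2's internal topics
checkpoints.topic.replication.factor = 3
offset-syncs.topic.replication.factor = 3
heartbeats.topic.replication.factor  = 3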

Run it as a dedicated MM2 cluster (it self-manages Connect internally):

connect-mirror-maker.sh mm2.properties

Illustrative output (exact log lines vary by Kafka version):

INFO Starting with 1 enabled replication flows: [primary->backup]
INFO [MirrorSourceConnector|task-0] Starting with 3 topic-partitions
INFO [MirrorCheckpointConnector] Syncing offsets for groups: [order-service]
INFO Mirroring topics: primary.orders, primary.payments, primary.inventory
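As a sanity check before starting MM2, it can help to list which flows a properties file actually enables. The following is a minimal sketch under stated assumptions: the parser handles only simple `key = value` lines and `#` comments (not the full Java properties syntax), a flow with no `enabled` key is treated as disabled, and both function names are hypothetical.

```python
from itertools import permutations


def parse_properties(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def enabled_flows(props: dict) -> list:
    """Return every source->target pair whose '.enabled' key is true."""
    clusters = [c.strip() for c in props.get("clusters", "").split(",") if c.strip()]
    return [
        f"{source}->{target}"
        for source, target in permutations(clusters, 2)
        if props.get(f"{source}->{target}.enabled", "false").lower() == "true"
    ]


SAMPLE = """
clusters = primary, backup
primary->backup.enabled = true
backup->primary.enabled = false
"""

print(enabled_flows(parse_properties(SAMPLE)))  # ['primary->backup']
```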

Config and ACL sync

MM2 keeps target topics consistent with their source automatically. When you add partitions to orders on primary, the MirrorSourceConnector propagates the change to primary.orders on backup within refresh.topics.interval.seconds. Topic-level configs (retention, cleanup policy, compression) are mirrored when sync.topic.configs.enabled is true. ACL replication is controlled by sync.topic.acls.enabled, which is on by default. MM2 carries read access over to the remote topic but does not grant write access, so consumers keep their permissions after failover while MM2 remains the only writer to replicated topics. Note that internal MM2 topics and config properties matching config.properties.exclude are never copied.
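The effect of an exclude list on config sync can be sketched as a simple filter. This assumes each entry is a regular expression matched against the whole property name. The patterns and the function name below are illustrative only, not MM2's actual defaults.

```python
import re

# Hypothetical exclude list; check your Kafka version's documentation for the real defaults.
EXCLUDE_PATTERNS = [
    r"min\.insync\.replicas",
    r".*\.replication\.throttled\.replicas",
]


def configs_to_sync(source_configs: dict, exclude_patterns: list) -> dict:
    """Keep only the topic configs whose names match no exclude pattern."""
    compiled = [re.compile(p) for p in exclude_patterns]
    return {
        name: value
        for name, value in source_configs.items()
        if not any(p.fullmatch(name) for p in compiled)
    }


source = {
    "retention.ms": "604800000",
    "cleanup.policy": "compact",
    "min.insync.replicas": "2",
    "leader.replication.throttled.replicas": "*",
}
print(sorted(configs_to_sync(source, EXCLUDE_PATTERNS)))
# ['cleanup.policy', 'retention.ms']
```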

Consumer offset translation

Raw offsets are meaningless across clusters because partition offsets diverge during replication. MM2 solves this with two connectors working together: the MirrorSourceConnector records the mapping between source offsets and the corresponding target offsets in an offset-syncs topic, and the MirrorCheckpointConnector uses that mapping to emit checkpoints that translate committed consumer-group offsets into target-cluster offsets.

With sync.group.offsets.enabled = true, MM2 also writes translated offsets directly into the target’s __consumer_offsets, but only for groups that have no active members on the target. After failover, a consumer group simply connects to backup, subscribes to primary.orders, and resumes near where it stopped on the source, avoiding both reprocessing the entire topic and silently skipping records.

# inspect translated offsets for a group on the target cluster
kafka-consumer-groups.sh --bootstrap-server backup-kafka:9092 \
  --describe --group order-service

Offset translation is approximate. Applications must still be idempotent or tolerant of small amounts of duplicate processing, because the translated offset can land slightly before the true position.
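To see why the translated offset errs on the early side, here is a simplified sketch of translating one partition's committed offset from a sparse list of offset syncs. It is a conservative illustration only, loosely inspired by how recent Kafka versions behave, not MM2's actual algorithm. The function name and the `(source_offset, target_offset)` tuple layout are assumptions.

```python
from bisect import bisect_right
from typing import Optional


def translate_offset(syncs: list, committed: int) -> Optional[int]:
    """Translate a committed source offset into a target offset.

    syncs is a list of (source_offset, target_offset) pairs sorted by source_offset.
    Returns None when no sync exists at or before the committed offset.
    """
    source_offsets = [s for s, _ in syncs]
    i = bisect_right(source_offsets, committed) - 1
    if i < 0:
        return None  # nothing to anchor the translation on
    source_offset, target_offset = syncs[i]
    if committed == source_offset:
        return target_offset  # exact sync point: translation is precise
    # Between sync points we only know the consumer is past the synced record,
    # so resume just after it. This may replay records but never skips any.
    return target_offset + 1


syncs = [(0, 0), (100, 90), (200, 185)]
print(translate_offset(syncs, 100))  # 90
print(translate_offset(syncs, 150))  # 91
print(translate_offset(syncs, 250))  # 186
```

A consumer that had committed offset 150 on the source resumes at 91 on the target in this sketch, so it may replay records between the last sync point and its true position. That replay is the duplicate processing the application has to tolerate.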

Active-passive vs active-active

These are the two canonical topologies, and the choice drives your replication policy and flow configuration.

Aspect	Active-passive	Active-active
Traffic	All writes to one cluster; other is standby	Writes to both clusters
Flows	One direction (`primary->backup`)	Both directions (`primary->backup` and `backup->primary`)
Replication policy	`Identity` or `Default`	Must use `DefaultReplicationPolicy`
Failover	Promote standby, repoint clients	Clients already use the nearest cluster
Use case	Disaster recovery	Geo-locality, regional read/write

In active-active, each cluster holds both its local topics and the remote-prefixed copies from the other cluster. A consumer that wants all events subscribes with a pattern such as orders|.*\.orders to read both local and replicated streams.
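A quick way to check which topic names such a pattern selects is to test it against a sample list. The sketch below uses Python's `re.fullmatch`, on the assumption that the consumer applies the pattern to the whole topic name. The topic names and the helper function are hypothetical.

```python
import re

# Matches the local topic plus any cluster-prefixed copy of it.
ALL_ORDERS = re.compile(r"orders|.*\.orders")


def matching_topics(topics: list, pattern=ALL_ORDERS) -> list:
    """Return the topics whose full name matches the subscription pattern."""
    return [t for t in topics if pattern.fullmatch(t)]


topics = ["orders", "backup.orders", "orders-dlq", "payments", "backup.payments", "preorders"]
print(matching_topics(topics))  # ['orders', 'backup.orders']
```

Note that `.*\.orders` also matches multi-hop names such as `a.b.orders`, which may or may not be what you want in topologies with more than two clusters.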

Best Practices

Always run MM2 as its own dedicated cluster of workers, sized independently from your brokers, so replication load never competes with production traffic.
Co-locate MM2 with the target cluster’s region; a remote consume + local produce pattern is more resilient to WAN hiccups than the reverse.
Use DefaultReplicationPolicy (cluster-prefixed names) for any bidirectional setup to prevent replication loops.
Keep emit.heartbeats.enabled set to true (the default) and scrape the heartbeat/checkpoint topics or JMX metrics to alert on replication lag before it becomes a recovery-point problem.
Pre-create internal topics (offset-syncs, checkpoints, heartbeats) with RF ≥ 3 and protect them with ACLs just like business topics.
Keep consumer applications idempotent — offset translation is best-effort, so design for at-least-once semantics after failover.
Test failover regularly: promote the standby, repoint a real consumer group, and confirm it resumes from the translated offset rather than the topic head.