Enterprise Integration Patterns

Implementing Request-Reply with Apache Kafka: A Case Study in Digital Transformation

Anuraag Verma | Engineering Leadership in Wealth Management Technology

In the world of distributed systems, integrating services effectively is as critical as the services themselves. While standard design patterns get most of the attention, Enterprise Integration Patterns (EIP)—popularized by Gregor Hohpe and Bobby Woolf—provide a timeless framework for building robust, scalable messaging architectures.

This post is a deep dive into a real-world implementation of the Request-Reply pattern using Apache Kafka. The problem: a high-scale data migration and document search challenge in the wealth management sector. The solution required us to think carefully about message correlation, error resilience, and system-level throughput—all while maintaining sub-second search performance for advisors serving high-net-worth clients.

The Challenge: Document Search at Scale

The goal was simple but ambitious: deliver a sub-one-second search experience across more than 30 million documents distributed across multiple non-performant legacy data stores. Advisors needed instant access to client documents—statements, trade confirmations, correspondence—and the existing infrastructure couldn't keep up.

The Strategy

We chose a digital transformation path that involved building a centralized, search-optimized data store using Spring Boot, MongoDB, and Apache Kafka. The architectural philosophy was intentionally asymmetric: we were willing to accept a "write penalty" during ingestion to ensure that the "read" path—the experience advisors actually feel—remained lightning-fast.

Design Principle

In systems where read latency directly impacts user experience but writes can be eventually consistent, optimizing for reads at the cost of write complexity is a deliberate and worthwhile trade-off. This asymmetry is the foundation of our architecture.

Mastering the Request-Reply Pattern

The Request-Reply pattern is essential when a process needs to observe the result of an asynchronous action. In a synchronous HTTP world, request-reply is implicit. But in an event-driven architecture built on Kafka, you have to design it explicitly—and that's where things get interesting.

1. Solving the "Return Address" Problem

In systems with multiple requesters, the replier needs to know where to send the response. Hardcoding reply destinations couples your services and limits multi-tenancy. We borrowed from a familiar concept—the email header—and implemented a "Reply-To" placeholder in our Kafka message headers.

Each requester specifies its own return channel. The replier reads this header, sends the response to the designated topic, and never needs to know anything about the caller's identity or topology. This keeps the replier generic, stateless, and decoupled from the growing number of consuming tenants.
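The replier-side logic can be sketched in a few lines. This is a minimal, illustrative sketch: plain Java maps stand in for Kafka record headers and topics, and all names (class, header key, topic) are assumptions for demonstration, not code from the actual system.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the "return address" idea: the replier reads the Reply-To header
// and publishes the response there, knowing nothing else about the caller.
public class GenericReplier {

    // In-memory stand-in for the broker: topic name -> published messages.
    private final Map<String, List<String>> topics = new HashMap<>();

    public void handleRequest(Map<String, String> headers, String result) {
        String replyTo = headers.get("Reply-To");
        if (replyTo == null) {
            throw new IllegalArgumentException("request is missing a Reply-To header");
        }
        // The replier stays generic: the destination comes from the message itself.
        topics.computeIfAbsent(replyTo, t -> new ArrayList<>()).add(result);
    }

    public List<String> topicContents(String topic) {
        return topics.getOrDefault(topic, List.of());
    }
}
```

Because the destination is carried in the message, onboarding a new tenant means only that the tenant declares its own reply topic; the replier code never changes.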

2. The Correlation Identifier

Because Kafka is asynchronous, replies arrive out of order. If a requester fires off 50 document ingestion requests in parallel, it needs a way to match each reply to its original request.

We solved this with Correlation IDs: each request carries a unique Message-ID, and the reply echoes it back as the Correlation-ID. This simple mechanism unlocked full end-to-end tracing, latency metrics across distributed spans, and made debugging production issues dramatically easier.

REQUEST MESSAGE                      REPLY MESSAGE
  Message-ID: "req-4a7b-x29f"          Correlation-ID: "req-4a7b-x29f"
  Reply-To:   "tenant-A-replies"       Topic:  "tenant-A-replies"
  Payload:    { document data... }     Status: INGESTED_OK

Fig 1 — Request-Reply message headers: the Reply-To and Correlation-ID work together to route and match responses.
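On the requester side, matching out-of-order replies can be sketched with a map of pending futures keyed by Message-ID. This is an illustrative assumption about the requester's internals, not code from the system described:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of requester-side correlation: each outbound request registers a
// future under its Message-ID; the reply-topic consumer completes the
// matching future via the reply's Correlation-ID, in any arrival order.
public class CorrelationTracker {

    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Called when sending a request; returns a future for the eventual reply.
    public CompletableFuture<String> register(String messageId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(messageId, future);
        return future;
    }

    // Called by the reply-topic consumer; returns false for unknown or
    // already-handled correlation IDs.
    public boolean complete(String correlationId, String payload) {
        CompletableFuture<String> future = pending.remove(correlationId);
        return future != null && future.complete(payload);
    }
}
```

Because each reply removes its entry from the map, duplicate or late replies are detected for free, which also feeds the tracing and latency metrics mentioned above.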

Advanced Error Handling in Kafka

If you've worked with RabbitMQ, you know that dead-letter exchanges and built-in retry mechanisms come out of the box. Kafka's broker offers nothing comparable natively. Building production-grade error handling on Kafka means designing it from scratch—and we invested significant effort in getting this right.

Poison Pills vs. Transient Failures

The first and most important distinction in any error strategy is separating poison pills—malformed or structurally invalid messages that will never succeed no matter how many times you retry—from transient failures like temporary service outages, network blips, or downstream throttling. Treating these the same way is a recipe for either data loss or infinite retry loops.
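In code, this distinction becomes the very first routing decision a failure handler makes. The sketch below is illustrative: which exception types count as "poison" and the topic names are assumptions, not the system's actual taxonomy.

```java
// Sketch of the first routing decision in a failure handler: poison pills go
// straight to the dead-letter topic, transient failures enter the retry
// pipeline.
public class FailureClassifier {

    public static boolean isPoisonPill(Exception e) {
        // Malformed or contract-violating messages fail identically on every
        // retry, so retrying them only burns resources.
        return e instanceof IllegalArgumentException;
    }

    public static String routeOnFailure(Exception e) {
        // Transient failures (timeouts, throttling, outages) get a retry tier;
        // structural failures skip retries entirely.
        return isPoisonPill(e) ? "dead-letter-topic" : "error-topic-1m";
    }
}
```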

Variable Back-Off with Dedicated Error Topics

For transient failures, a naive fixed-interval retry can block consumer threads and cascade failures. Instead, we implemented a variable back-off strategy using multiple Kafka error topics with increasing delays:

// Error topic progression with increasing back-off intervals
error-topic-1m   → retry after 1 minute
error-topic-5m   → retry after 5 minutes
error-topic-10m  → retry after 10 minutes
dead-letter-topic → exhausted retries, manual review

Messages flow through these topics progressively. Each topic has a consumer with a corresponding delay before reprocessing. If a message exhausts all retry tiers, it lands in a dead-letter topic for manual investigation. This approach keeps the main consumer group healthy and unblocked while giving transient issues time to resolve.
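The progression logic reduces to choosing the next topic from a retry count, which in practice would be read from a message header before re-publishing. A minimal sketch, assuming a simple integer retry-count header (the header mechanism is an assumption; the topic names follow the progression above):

```java
import java.util.List;

// Sketch of tiered retry routing: a failed message advances one error topic
// per attempt; after the last tier it lands in the dead-letter topic.
public class RetryRouter {

    private static final List<String> TIERS =
            List.of("error-topic-1m", "error-topic-5m", "error-topic-10m");
    private static final String DEAD_LETTER = "dead-letter-topic";

    // retriesSoFar would be carried in a message header and incremented on
    // each re-publish.
    public static String nextTopic(int retriesSoFar) {
        return retriesSoFar < TIERS.size() ? TIERS.get(retriesSoFar) : DEAD_LETTER;
    }
}
```

Keeping the tier list in one place also makes it easy to tune the back-off schedule without touching consumer logic.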

The "Early Ack" Pattern

One challenge with async retry flows is upstream visibility: the requester submits a document and hears… nothing, for minutes. We introduced an Early Acknowledgement pattern to solve this.

When a message enters the retry pipeline, the system sends an optional, transient status notification back to the requester: "I have your request. It encountered an error, but I am still retrying." This lets the calling system manage user expectations, display meaningful status in the UI, or hold downstream processing—rather than timing out in the dark.
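The early-ack message itself can be as small as a header map echoing the correlation ID. The field names and status value in this sketch are illustrative assumptions, not taken from the actual system:

```java
import java.util.Map;

// Sketch of the transient status notification sent when a request enters the
// retry pipeline: the requester learns the message is alive, not lost.
public class EarlyAck {

    public static Map<String, String> build(String correlationId, int attempt) {
        return Map.of(
                "Correlation-ID", correlationId,   // lets the requester match it
                "Status", "RETRYING",              // transient, not a final result
                "Attempt", Integer.toString(attempt));
    }
}
```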

Lesson Learned

In high-volume async pipelines, silence is the worst failure mode. Even if you can't deliver a final result yet, an intermediate acknowledgement builds trust with upstream systems and prevents unnecessary retry storms from impatient callers.

Real-World System Design: The Ingestion Pipeline

Putting these patterns together, our document ingestion pipeline followed a structured four-stage flow:

Stage 1: Transform → Stage 2: Validate → Stage 3: Throttle → Stage 4: Persist

Transformation: Data producers enrich legacy data into a unified, canonical model. This is where we normalize disparate formats from multiple source systems into a single document schema.

Validation: The request consumer performs immediate contract validation. Messages that don't conform to the expected schema are rejected upfront with a descriptive error reply—no point in wasting downstream resources on structurally invalid data.

Throttling: Kafka's throughput can easily overwhelm a search-optimized data store that prioritizes read performance. We implemented rate throttling at the consumer level to protect MongoDB's write path and maintain consistent read latency under load.
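One simple shape for such a throttle is a fixed-window counter that the consumer polls before each write. This is a deliberately simplified sketch; a real deployment might instead pause and resume the Kafka consumer, and all names here are illustrative:

```java
// Sketch of consumer-side write throttling to protect a read-optimized store:
// allow at most N writes per time window, tell callers to back off otherwise.
public class WriteThrottle {

    private final int maxWritesPerWindow;
    private final long windowMillis;
    private long windowStart;
    private int count;

    public WriteThrottle(int maxWritesPerWindow, long windowMillis) {
        this.maxWritesPerWindow = maxWritesPerWindow;
        this.windowMillis = windowMillis;
        this.windowStart = System.currentTimeMillis();
    }

    // Returns true if the write may proceed now; false means wait or pause.
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            // New window: reset the counter.
            windowStart = now;
            count = 0;
        }
        if (count < maxWritesPerWindow) {
            count++;
            return true;
        }
        return false;
    }
}
```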

Idempotent Persistence: Rather than relying on Kafka's exactly-once delivery semantics—which introduce significant configuration complexity and performance overhead—we favored Idempotent Consumers. Each message carries a unique identifier; the consumer checks for duplicates before persisting. This approach delivers high reliability with simpler operational overhead.
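The duplicate check can be sketched as follows. An in-memory set stands in for what would realistically be a MongoDB unique index or processed-IDs collection; that substitution and all names are assumptions for illustration:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of an idempotent consumer: persist only when the message ID has not
// been seen before, so redelivered messages are harmless no-ops.
public class IdempotentConsumer {

    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

    // Returns true when the message was persisted, false for a duplicate.
    public boolean processOnce(String messageId, Runnable persist) {
        if (!processedIds.add(messageId)) {
            return false;   // duplicate delivery: skip, stay idempotent
        }
        persist.run();
        return true;
    }
}
```

With this in place, "at-least-once" delivery from Kafka is sufficient: duplicates reach the consumer but never reach the data store twice.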

Key Takeaways

By applying classic Enterprise Integration Patterns to modern infrastructure like Apache Kafka, we built a system capable of migrating and indexing millions of records while maintaining strict sub-second search SLAs. Here's what I'd emphasize for anyone facing a similar challenge:

Your messaging headers matter as much as your data payload. The Reply-To and Correlation ID patterns are simple to implement but transformative for observability, debugging, and multi-tenant scalability.

Design your error strategy before your happy path. In Kafka, unlike RabbitMQ, resilience isn't free. Variable back-off with tiered error topics and early acknowledgements turned our error handling from a liability into a feature.

Favor idempotency over exactly-once semantics. In practice, idempotent consumers are simpler to reason about, easier to operate, and just as reliable for most use cases.

Optimize for the path your users feel. The deliberate asymmetry—accepting write complexity to protect read performance—is the kind of trade-off that separates architectures that look good on a whiteboard from architectures that perform in production.

Watch the Full Presentation

This post is based on a conference presentation that walks through the complete implementation with live architecture diagrams and code examples. Watch the full talk below:

▶ Request-Reply EIP Implementation using Apache Kafka — Full Conference Talk
Tags: Apache Kafka, Enterprise Integration Patterns, Event-Driven Architecture, Spring Boot, MongoDB, Wealth Management, System Design