This is the first post in a series exploring Apache Kafka's internal architecture. In this series, we'll dive into:
- Broker Request Processing Flow (this post)
- Storage Layer & Log Segments (coming soon)
- Replication Protocol (coming soon)
- Consumer Groups & Coordination (coming soon)
- and many more…
Today, we'll focus on how a Kafka broker processes client requests - from receiving bytes on a socket to writing data to disk and sending a response.
High-Level Architecture
Kafka uses a reactor pattern to separate network I/O from request processing. If you want to learn about the reactor pattern, I recommend this paper from 1995. This architecture enables high throughput by:
- Non-blocking I/O for network operations: Using a Java NIO `Selector`, a single `Processor` thread monitors hundreds of socket connections simultaneously without blocking. When a socket has data ready, the selector returns immediately (i.e., no threads waste time waiting on I/O). A minimal selector loop is sketched just after this list.
- Dedicated thread pools for CPU-intensive work: Network I/O runs on `Processor` threads (typically 3-8; may vary), while request processing runs on a separate pool of `KafkaRequestHandler` threads (typically 8-16; also may vary). This prevents slow request processing from blocking network operations.
- Async callbacks for operations that take time (like replication): When `acks=-1`, the handler thread doesn't wait for replication. Instead, it registers a callback with `DelayedProduce` and returns to process other requests. When all replicas acknowledge, the callback fires and sends the response.
- Backpressure mechanisms to prevent overload: When a request is queued, the broker "mutes" that socket connection, meaning it stops reading more data from that client until the current request completes. Combined with bounded request queues, this prevents memory exhaustion during traffic spikes.
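To make the first bullet concrete, here is a minimal single-threaded reactor loop using Java NIO from Scala. This is an illustrative sketch, not Kafka's actual `Processor` (which layers its own Selector wrapper, per-connection state, and idle handling on top); the port and buffer size are arbitrary.

```scala
import java.net.InetSocketAddress
import java.nio.ByteBuffer
import java.nio.channels.{SelectionKey, Selector, ServerSocketChannel, SocketChannel}

// A minimal single-threaded reactor loop: one Selector multiplexes
// many non-blocking sockets.
object MiniReactor extends App {
  val selector = Selector.open()
  val server = ServerSocketChannel.open()
  server.bind(new InetSocketAddress(9999)) // arbitrary demo port
  server.configureBlocking(false)
  server.register(selector, SelectionKey.OP_ACCEPT)

  while (true) {
    selector.select() // blocks only until some channel is ready
    val keys = selector.selectedKeys().iterator()
    while (keys.hasNext) {
      val key = keys.next(); keys.remove()
      if (key.isAcceptable) { // new connection: register it for reads
        val client = server.accept()
        client.configureBlocking(false)
        client.register(selector, SelectionKey.OP_READ)
      } else if (key.isReadable) { // bytes available: read without blocking
        val buf = ByteBuffer.allocate(4096)
        val n = key.channel().asInstanceOf[SocketChannel].read(buf)
        if (n < 0) key.cancel() // peer closed the connection
        // otherwise: hand buf off for request parsing
      }
    }
  }
}
```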
Request Processing Pipeline
Here's the journey of a single produce request through the broker:
```
Client Acceptor Processor RequestChannel KafkaRequestHandler KafkaApis ReplicaManager
| | | | | | |
|--TCP Connect------>| | | | | |
| | | | | | |
| |--New Socket------>| | | | |
| | | | | | |
|--Produce Request---------------------->| | | | |
| | | | | | |
| | [Parse Request] | | | |
| | | | | | |
| | |--Enqueue---------->| | | |
| | | | | | |
| | [Mute Connection] | | | |
| | | | | | |
| | | |--Dequeue Request---->| | |
| | | | | | |
| | | | |--Route by API----->| |
| | | | | | |
| | | | | |--Append-------->|
| | | | | | |
| | | | | | [Write to Log]
| | | | | | |
| | | | | |<--Callback------|
| | | | | | |
| | | | |<--Response---------| |
| | | | | | |
| | |<--Send Response---------------------------| | |
| | | | | | |
| | [Unmute Connection] | | | |
| | | | | | |
|<--Produce Response---------------------| | | | |
| | | | | | |
```
Pipeline stages:
| Stage | Component | Threads | Purpose |
|---|---|---|---|
| 1. Accept | `Acceptor` | 1 per listener | Accept TCP connections |
| 2. Network I/O | `Processor` | N per listener | Read/write bytes, parse headers |
| 3. Queue | `RequestChannel` | - | Decouple I/O from processing |
| 4. Processing | `KafkaRequestHandler` | M handlers | Route and execute requests |
| 5. Response | Back through `Processor` | - | Write response to socket |
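These stages are sized by standard broker configs; the values below are Kafka's documented defaults, tuned per workload in practice:

```properties
# Number of Processor threads handling network I/O (stage 2)
num.network.threads=3
# Number of KafkaRequestHandler threads executing requests (stage 4)
num.io.threads=8
# Bound on the RequestChannel queue between them (stage 3)
queued.max.requests=500
```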
1. Network Layer - SocketServer
Location: kafka/network/SocketServer.scala
The `SocketServer` manages all network communication:
```
Client → TCP Connection → Acceptor → Processor (NIO Selector)
```
Processor Event Loop:
1. `configureNewConnections()` - Set up new sockets
2. `processNewResponses()` - Queue responses to send
3. `poll()` - NIO `selector.select()` for ready I/O
4. `processCompletedReceives()` - Parse requests
5. `processCompletedSends()` - Cleanup after sending
When a complete request arrives (line 1019):
- Parse the request header from bytes
- Create a `RequestChannel.Request` object
- Send it to the `RequestChannel`
- Mute the connection - critical for backpressure! No more data is read until this request is processed
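For a feel of what that parsing involves, here is a rough sketch of the fixed part of a Kafka request header (api_key, api_version, correlation_id, client_id), which every request carries after its 4-byte size field. The `RequestHeader` case class and `parseHeader` helper below are hypothetical names, and newer header versions append tagged fields that this sketch ignores.

```scala
import java.nio.ByteBuffer

// Fixed-layout request header fields, per the Kafka protocol guide.
case class RequestHeader(apiKey: Short, apiVersion: Short, correlationId: Int, clientId: String)

def parseHeader(buf: ByteBuffer): RequestHeader = {
  val apiKey = buf.getShort()
  val apiVersion = buf.getShort()
  val correlationId = buf.getInt()
  val clientIdLen = buf.getShort() // nullable string: length -1 means null
  val clientId =
    if (clientIdLen < 0) null
    else {
      val bytes = new Array[Byte](clientIdLen)
      buf.get(bytes)
      new String(bytes, "UTF-8")
    }
  RequestHeader(apiKey, apiVersion, correlationId, clientId)
}
```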
2. Request Queue - RequestChannel
Location: kafka/network/RequestChannel.scala
A bounded queue (configured by `queued.max.requests`) that:
- Receives requests from `Processor` threads
- Provides requests to `KafkaRequestHandler` threads
- Prevents handler threads from blocking I/O threads
3. Handler Thread Pool
Location: kafka/server/KafkaRequestHandler.scala
M handler threads (lines 103-177) run this loop:
```scala
while (isRunning) {
  val request = requestChannel.receiveRequest() // Block until a request is available
  apis.handle(request, requestLocal)            // Route and process it
}
```
Metrics tracked: idle time, processing time per API key.
4. Request Router - KafkaApis
Location: kafka/server/KafkaApis.scala
The `handle()` method (line 151) is a giant pattern match routing to specific handlers:
```scala
request.header.apiKey match {
  case ApiKeys.PRODUCE => handleProduceRequest()
  case ApiKeys.FETCH => handleFetchRequest()
  case ApiKeys.JOIN_GROUP => handleJoinGroupRequest()
  case ApiKeys.CREATE_TOPICS => forwardToController()
  // ... 50+ API types
}
```
5. Example: Produce Request
Let's trace a produce request writing data to a topic:
`KafkaApis.handleProduceRequest()` (line 388):
```
1. Authorization
   ├─ Check transactional ID (if transactional)
   ├─ Verify topic permissions
   └─ ACL checks
2. Validation
   └─ Validate record batch format
3. Delegate to ReplicaManager
   └─ replicaManager.handleProduceAppend()
4. Define callback
   └─ sendResponseCallback() - invoked when append completes
```
`ReplicaManager.handleProduceAppend()` (line 734):
```
1. Transaction verification (if needed)
2. appendToLocalLog()
   └─ For each partition:
      ├─ partition.appendRecordsToLeader()
      ├─ Write to UnifiedLog (disk)
      └─ Update metrics (bytes_in, messages_in)
3. Delayed produce handling (if acks=-1)
   ├─ Add to DelayedProduce purgatory
   └─ Wait for min.isr replicas to acknowledge
```
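To exercise this path end to end, here is a standard Kafka producer (the official Java client, used from Scala) sending with `acks=all`, which the wire protocol encodes as `acks=-1`. The broker parks the request in DelayedProduce purgatory until the in-sync replicas acknowledge, and only then does the callback below fire. Topic name and bootstrap address are placeholders.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerRecord, RecordMetadata}

object ProduceDemo extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092") // placeholder broker address
  props.put("acks", "all") // wire-level acks=-1: wait for all in-sync replicas
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  val record = new ProducerRecord[String, String]("demo-topic", "key", "value")

  // send() is asynchronous; the callback runs once the broker's delayed
  // produce completes (all ISR replicas acked) or the request times out.
  producer.send(record, new Callback {
    override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
      if (exception != null) exception.printStackTrace()
      else println(s"Appended at offset ${metadata.offset()} in partition ${metadata.partition()}")
  })
  producer.close()
}
```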
6. Response Flow
Once the append completes (or times out):
```
1. Callback invoked
   └─ sendResponseCallback() in KafkaApis
2. Apply quotas
   ├─ Bandwidth throttling
   └─ Request rate limiting
3. Send response
   └─ requestChannel.sendResponse()
4. Processor picks up response
   ├─ processNewResponses()
   ├─ Write bytes to socket
   └─ Unmute connection (ready for next request)
```
Key Design Patterns
| Pattern | Implementation | Benefit |
|---|---|---|
| Reactor | Processors (I/O) separate from Handlers (logic) | Scales independently |
| Worker thread pool | Handler pool processes requests | Isolation, fairness |
| Async callbacks | Delayed operations use callbacks | Non-blocking for long ops |
| Backpressure | Muted connections + bounded queues | Prevents overload |
| Purgatory | Delayed ops wait for conditions | Efficient waiting |
The Purgatory Pattern
Worth highlighting: the DelayedProduce purgatory is Kafka's way of efficiently waiting for replication when `acks=-1`.
Instead of blocking a handler thread, the operation:
- Gets added to a purgatory (a timing wheel data structure)
- Handler thread is freed to process other requests
- When replicas acknowledge OR timeout expires → callback fires
- Response is sent
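A minimal sketch of that complete-once contract, assuming a plain ScheduledExecutorService in place of Kafka's hierarchical timing wheel (all names below are hypothetical, not Kafka's actual classes):

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object Purgatory {
  // Shared timer standing in for Kafka's hierarchical timing wheel.
  private val timer = Executors.newSingleThreadScheduledExecutor()

  // A delayed operation completes exactly once: either its condition is
  // satisfied (e.g., all replicas acked) or the timeout fires. No handler
  // thread blocks while it waits.
  final class DelayedOp(timeoutMs: Long, onComplete: String => Unit) {
    private val completed = new AtomicBoolean(false)

    // Schedule the timeout path up front; the handler thread returns immediately.
    timer.schedule(new Runnable {
      def run(): Unit =
        if (completed.compareAndSet(false, true)) onComplete("timeout")
    }, timeoutMs, TimeUnit.MILLISECONDS)

    // Called from the replication path when the awaited condition is met.
    def tryComplete(): Unit =
      if (completed.compareAndSet(false, true)) onComplete("satisfied")
  }
}
```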
Backpressure in Action
The connection muting mechanism (line 1055) is elegant:
```
Request arrives → Parse → Queue → MUTE CONNECTION
                                        ↓
UNMUTE CONNECTION ← Response sent ← Write ← Process
```
This prevents:
- Memory exhaustion from buffering too many requests
- Handler threads drowning in work
- Cascading failures under load
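In NIO terms, the mute itself is tiny. A sketch with hypothetical helpers (Kafka's SocketServer does the equivalent through its own Selector abstraction):

```scala
import java.nio.channels.SelectionKey

object ConnectionMuting {
  // Removing OP_READ from the interest set stops the selector from reporting
  // (and the Processor from reading) that socket. The client's unread bytes
  // pile up in kernel buffers, so TCP flow control pushes the backpressure
  // all the way to the client.
  def mute(key: SelectionKey): Unit =
    key.interestOps(key.interestOps() & ~SelectionKey.OP_READ)

  def unmute(key: SelectionKey): Unit =
    key.interestOps(key.interestOps() | SelectionKey.OP_READ)
}
```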
Summary
The Kafka broker's request processing architecture achieves high throughput through:
- Separation of concerns: I/O threads vs processing threads
- Non-blocking I/O: NIO selectors for network operations
- Bounded queues: Explicit limits prevent resource exhaustion
- Async patterns: Callbacks and purgatories for long operations
- Flow control: Connection muting provides natural backpressure
In the next post, we'll explore the storage layer, i.e., how Kafka efficiently writes and reads data from disk using log segments and memory-mapped files.
This series dives into Kafka's source code to understand the design decisions behind one of the most popular distributed systems. All references are to the Kafka codebase (currently Apache Kafka 4.1.0+).