Kafka Internals Deep Dive

If you love understanding how things actually work, this chapter is for you. If you just want to produce and consume messages, feel free to skip ahead. No judgment.

This chapter reveals the distributed systems magic behind Kafka. We will explore the commit log abstraction, understand how replication guarantees durability, and demystify consumer group coordination. This knowledge is what allows you to tune Kafka for millions of messages per second.

Why Internals Matter

Understanding Kafka internals helps you:

Tune for performance when milliseconds matter
Debug production issues when messages go missing
Design better systems that leverage Kafka’s strengths
Ace interviews where Kafka internals are common
Make informed decisions about partitioning and replication

The Fundamental Abstraction: The Commit Log

At its core, Kafka is an append-only, immutable commit log. This simple abstraction is the source of Kafka’s power.

Topic: user-events
Partition 0:
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│ 0  │ 1  │ 2  │ 3  │ 4  │ 5  │ 6  │ 7  │ 8  │ 9  │ ...
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
  ▲                                            ▲
  │                                            │
Oldest                                      Newest
(may be deleted)                         (append here)

Key properties:

Append-only: New messages always go to the end
Immutable: Messages never change after writing
Ordered: Messages have sequential offsets within a partition
Durable: Messages persist to disk (not just memory)

This is fundamentally different from traditional message queues where messages are deleted after consumption.
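To make these properties concrete, here is a minimal in-memory sketch of a single partition. `ToyPartitionLog` and its methods are hypothetical names for illustration, not Kafka classes, and the sketch ignores disk persistence and retention.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory model of one partition. Illustrative only, not a Kafka class. */
class ToyPartitionLog {
    private final List<String> records = new ArrayList<>();

    /** Appends at the end and returns the offset assigned to the record. */
    long append(String value) {
        records.add(value);
        return records.size() - 1;
    }

    /** Reading does not remove anything, so every reader can re-read any offset. */
    String read(long offset) {
        return records.get((int) offset);
    }

    /** The offset the next appended record will receive. */
    long endOffset() {
        return records.size();
    }

    public static void main(String[] args) {
        ToyPartitionLog log = new ToyPartitionLog();
        log.append("signup");
        log.append("login");
        // Two independent reads of offset 0 return the same record.
        System.out.println(log.read(0) + " / " + log.read(0) + " / next offset " + log.endOffset());
    }
}
```

Note that there is no `remove` or `update` method: consumption is just a read at an offset, which is what separates a log from a queue.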

Log Segments: How Data is Stored

A partition is not one giant file. It is split into segments for manageability.

Partition directory: /kafka-logs/user-events-0/

├── 00000000000000000000.log      # Segment 1: offsets 0-999
├── 00000000000000000000.index    # Offset index
├── 00000000000000000000.timeindex # Timestamp index
├── 00000000000000001000.log      # Segment 2: offsets 1000-1999
├── 00000000000000001000.index
├── 00000000000000001000.timeindex
├── 00000000000000002000.log      # Active segment
└── ...
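Because each file is named after the first offset it contains, locating the segment for an offset is a "greatest base offset less than or equal to the target" lookup. The sketch below assumes a sorted map keyed by base offset; `SegmentLookup` and its method names are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch: pick the segment file that should contain a given offset. */
class SegmentLookup {
    /** Segment files are named after their base offset, zero-padded to 20 digits. */
    static String logFileName(long baseOffset) {
        return String.format("%020d.log", baseOffset);
    }

    /** Returns the file whose base offset is the greatest one <= offset, or null if none. */
    static String segmentFileFor(TreeMap<Long, String> fileByBaseOffset, long offset) {
        Map.Entry<Long, String> entry = fileByBaseOffset.floorEntry(offset);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        TreeMap<Long, String> segments = new TreeMap<>();
        for (long base : new long[] {0L, 1000L, 2000L}) {
            segments.put(base, logFileName(base));
        }
        // Offset 1500 lives in the segment that starts at offset 1000.
        System.out.println(segmentFileFor(segments, 1500L));
    }
}
```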

Segment Structure

Each segment consists of:

File	Purpose
`.log`	Actual message data
`.index`	Maps offset to file position
`.timeindex`	Maps timestamp to offset
`.snapshot`	Producer state for idempotency

Segment Lifecycle

┌─────────────────────────────────────────────────────────────────┐
│                      Segment Lifecycle                           │
│                                                                  │
│  Create ──▶ Active ──▶ Rolled ──▶ Eligible for deletion         │
│                │                                                 │
│                │ When:                                           │
│                │ - Size > segment.bytes (1GB default)           │
│                │ - Age > segment.ms (7 days default)            │
│                │ - Index full                                    │
└─────────────────────────────────────────────────────────────────┘
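The roll conditions in the diagram boil down to a simple "any of these is true" check. The sketch below hard-codes the defaults quoted above; `SegmentRollCheck` is a hypothetical helper, and a real broker considers additional details (such as the size of the incoming batch) that are left out here.

```java
/** Illustrative sketch of the roll decision using the defaults quoted in the diagram. */
class SegmentRollCheck {
    static final long DEFAULT_SEGMENT_BYTES = 1024L * 1024L * 1024L;   // segment.bytes: 1 GB
    static final long DEFAULT_SEGMENT_MS = 7L * 24 * 60 * 60 * 1000;   // segment.ms: 7 days

    /** The active segment is rolled as soon as any one condition holds. */
    static boolean shouldRoll(long sizeBytes, long ageMs, boolean indexFull) {
        return sizeBytes > DEFAULT_SEGMENT_BYTES
                || ageMs > DEFAULT_SEGMENT_MS
                || indexFull;
    }

    public static void main(String[] args) {
        // A small, one-minute-old segment with room in its index stays active.
        System.out.println(shouldRoll(1024L, 60_000L, false));
    }
}
```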

Log Compaction

For topics where only the latest value per key matters:

Before compaction:
┌─────────────────────────────────────────────────────────────────┐
│ Key:A,V:1 │ Key:B,V:1 │ Key:A,V:2 │ Key:A,V:3 │ Key:B,V:2 │
└─────────────────────────────────────────────────────────────────┘

After compaction:
┌─────────────────────────────────────────────────────────────────┐
│ Key:A,V:3 │ Key:B,V:2 │
└─────────────────────────────────────────────────────────────────┘

Use cases:

CDC (Change Data Capture) - keep latest row state
User profiles - keep latest version
Configuration - keep current settings
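The before/after picture can be reproduced with a two-pass sketch: first record the last position of each key, then keep only the records at those positions. `ToyCompactor` is a hypothetical name, and records are modeled as `{key, value}` string pairs; a real log cleaner works segment by segment and also handles tombstones, which this ignores.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of compaction: keep only the latest record per key. */
class ToyCompactor {
    /** Each record is a {key, value} pair; survivors keep their original relative order. */
    static List<String[]> compact(List<String[]> log) {
        // Pass 1: remember the last position at which each key appears.
        Map<String, Integer> lastIndexOfKey = new HashMap<>();
        for (int i = 0; i < log.size(); i++) {
            lastIndexOfKey.put(log.get(i)[0], i);
        }
        // Pass 2: keep a record only if it is the last one for its key.
        List<String[]> compacted = new ArrayList<>();
        for (int i = 0; i < log.size(); i++) {
            if (lastIndexOfKey.get(log.get(i)[0]) == i) {
                compacted.add(log.get(i));
            }
        }
        return compacted;
    }

    public static void main(String[] args) {
        List<String[]> log = new ArrayList<>();
        log.add(new String[] {"A", "1"});
        log.add(new String[] {"B", "1"});
        log.add(new String[] {"A", "2"});
        log.add(new String[] {"A", "3"});
        log.add(new String[] {"B", "2"});
        for (String[] record : compact(log)) {
            System.out.println("Key:" + record[0] + ",V:" + record[1]);
        }
    }
}
```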

The Index Files: Fast Lookups

Reading from offset 1,000,000 should not require scanning 1M messages. Indexes solve this.

Offset Index

Offset Index (.index):
┌─────────────────┬──────────────────┐
│ Relative Offset │ Physical Position│
├─────────────────┼──────────────────┤
│       0         │        0         │
│      100        │     102400       │
│      200        │     204800       │
│      ...        │       ...        │
└─────────────────┴──────────────────┘

Finding offset 150:

Binary search index: 100 < 150 < 200
Start at position 102400
Scan forward until offset 150

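Indexes are sparse: rather than one entry per offset, the broker adds an entry roughly every index.interval.bytes of log data (4 KB by default), which keeps the index small.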
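The lookup for offset 150 can be sketched with a sorted map standing in for the index file. `SparseOffsetIndex` is a hypothetical class for illustration; the real index is a binary file searched with binary search, but the "floor entry, then scan forward" idea is the same.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sparse index: relative offset -> byte position in the .log file. */
class SparseOffsetIndex {
    private final TreeMap<Integer, Integer> positionByRelativeOffset = new TreeMap<>();

    void addEntry(int relativeOffset, int filePosition) {
        positionByRelativeOffset.put(relativeOffset, filePosition);
    }

    /**
     * Returns the file position to start scanning from: the position of the
     * greatest indexed offset <= the target, or 0 if the target precedes all entries.
     */
    int scanStartPosition(int targetRelativeOffset) {
        Map.Entry<Integer, Integer> entry = positionByRelativeOffset.floorEntry(targetRelativeOffset);
        return entry == null ? 0 : entry.getValue();
    }

    public static void main(String[] args) {
        SparseOffsetIndex index = new SparseOffsetIndex();
        index.addEntry(0, 0);
        index.addEntry(100, 102400);
        index.addEntry(200, 204800);
        // Offset 150 is not indexed, so the scan starts at the entry for offset 100.
        System.out.println(index.scanStartPosition(150));
    }
}
```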

Timestamp Index

Timestamp Index (.timeindex):
┌───────────────┬─────────────────┐
│  Timestamp    │  Relative Offset│
├───────────────┼─────────────────┤
│ 1701590400000 │       0         │
│ 1701590410000 │      100        │
│ 1701590420000 │      200        │
└───────────────┴─────────────────┘

This powers offsetsForTimes() - seek to a specific time.
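Conceptually, a time-based seek asks for the first offset whose timestamp is at or after the target. The sketch below models that question with a plain binary search over an array; `TimestampSeek` is a hypothetical helper, it assumes timestamps never decrease with offset, and the `-1` return value is a stand-in for "no such record".

```java
/** Illustrative sketch of a time-based seek over one partition. */
class TimestampSeek {
    /**
     * timestamps[i] is the timestamp of offset i (assumed non-decreasing).
     * Returns the first offset with timestamp >= target, or -1 if there is none.
     */
    static long firstOffsetAtOrAfter(long[] timestamps, long target) {
        int lo = 0;
        int hi = timestamps.length; // search window is [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (timestamps[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo == timestamps.length ? -1 : lo;
    }

    public static void main(String[] args) {
        long[] timestamps = {1701590400000L, 1701590410000L, 1701590420000L};
        // A target between the first two records resolves to the second record.
        System.out.println(firstOffsetAtOrAfter(timestamps, 1701590405000L));
    }
}
```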

Replication: The Safety Net

Kafka replicates partitions across brokers for fault tolerance.

Replication Topology

Topic: orders (replication-factor=3)

         ┌──────────────────────────────────────────────────┐
         │                 Partition 0                       │
         ├────────────────────┬──────────────┬──────────────┤
         │    Broker 1        │   Broker 2   │   Broker 3   │
         │    (Leader)        │  (Follower)  │  (Follower)  │
         │    [0,1,2,3,4]     │  [0,1,2,3,4] │  [0,1,2,3]   │
         └────────────────────┴──────────────┴──────────────┘
                                                    ▲
                                                    │
                                              Slightly behind

Leader and Followers

Role	Responsibilities
Leader	Handles all writes and, by default, all reads for the partition
Followers	Replicate from leader, ready to take over

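By default, producers and consumers talk only to the leader, which simplifies consistency. Since Kafka 2.4, consumers can optionally fetch from a nearby follower instead (the consumer sets client.rack and the broker uses a rack-aware replica.selector.class).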

In-Sync Replicas (ISR)

The ISR is the set of replicas that are “caught up” with the leader:

ISR = [Broker1, Broker2, Broker3]  # All caught up
ISR = [Broker1, Broker2]           # Broker3 fell behind
ISR = [Broker1]                    # Only leader in sync (dangerous!)

A replica falls out of ISR when:

More than replica.lag.time.max.ms behind (default: 30s)
Not sending fetch requests
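The time-based rule can be sketched as a filter over "when was each follower last caught up". `IsrTracker` and its parameters are hypothetical names; the real broker tracks this per replica internally, and the exact bookkeeping differs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of time-based ISR membership. */
class IsrTracker {
    /**
     * The leader is always in sync with itself. A follower stays in the ISR only
     * if it was last caught up within maxLagMs (replica.lag.time.max.ms).
     */
    static List<Integer> inSync(int leaderId, Map<Integer, Long> lastCaughtUpMsByFollower,
                                long nowMs, long maxLagMs) {
        List<Integer> isr = new ArrayList<>();
        isr.add(leaderId);
        for (Map.Entry<Integer, Long> follower : lastCaughtUpMsByFollower.entrySet()) {
            if (nowMs - follower.getValue() <= maxLagMs) {
                isr.add(follower.getKey());
            }
        }
        return isr;
    }

    public static void main(String[] args) {
        Map<Integer, Long> lastCaughtUp = new java.util.TreeMap<>();
        lastCaughtUp.put(2, 95_000L); // 5s behind: still in sync
        lastCaughtUp.put(3, 50_000L); // 50s behind: dropped with a 30s limit
        System.out.println(inSync(1, lastCaughtUp, 100_000L, 30_000L));
    }
}
```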

Durability Guarantees: acks

Producers control durability with acks:

acks	Meaning	Durability	Performance
`0`	Fire and forget	None	Fastest
`1`	Leader acknowledged	Moderate	Fast
`all` (-1)	All ISR acknowledged	Highest	Slowest

acks=all with min.insync.replicas=2:

Producer ──▶ Leader (Broker1)
              │
              ├──▶ Follower (Broker2) ──▶ ACK
              │
              └──▶ Follower (Broker3) ──▶ ACK
                                            │
            ◀──────────────────────────────┘
                    Producer receives ACK
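How acks=all interacts with min.insync.replicas is easiest to see as a precondition check on the current ISR size. This is a simplified sketch with hypothetical names (`AcksAllCheck`, `Outcome`); it models only the accept/reject decision, not the replication itself.

```java
/** Illustrative sketch of the acks=all precondition on the leader. */
class AcksAllCheck {
    enum Outcome { ACKED_AFTER_ISR_REPLICATION, REJECTED_NOT_ENOUGH_REPLICAS }

    /**
     * With acks=all the leader accepts the write only if the ISR currently has
     * at least min.insync.replicas members; it then waits for all of them.
     */
    static Outcome produceWithAcksAll(int currentIsrSize, int minInsyncReplicas) {
        return currentIsrSize >= minInsyncReplicas
                ? Outcome.ACKED_AFTER_ISR_REPLICATION
                : Outcome.REJECTED_NOT_ENOUGH_REPLICAS;
    }

    public static void main(String[] args) {
        System.out.println(produceWithAcksAll(3, 2)); // healthy partition
        System.out.println(produceWithAcksAll(1, 2)); // ISR shrank to the leader only
    }
}
```

Note the third case worth remembering: with min.insync.replicas=1, an ISR of size 1 still accepts writes, so "all ISR" can mean "just the leader".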

Leader Election

When a leader fails:

Controller detects leader failure (via ZooKeeper/KRaft)
Controller selects new leader from ISR
Controller updates metadata
Brokers and clients fetch new metadata
New leader starts serving requests

Before failure:
  ISR = [Broker1 (Leader), Broker2, Broker3]

Broker1 dies:
  ISR = [Broker2, Broker3]
  Controller elects Broker2 as new leader

After election:
  ISR = [Broker2 (Leader), Broker3]

Unclean leader election: If ISR is empty, Kafka can elect a non-ISR replica (data loss risk). Controlled by unclean.leader.election.enable (default: false).
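The election rule, including the unclean fallback, can be sketched as "first live replica in the ISR, otherwise (only if allowed) first live replica at all". `LeaderElectionSketch` is a hypothetical helper, and picking candidates in assignment order is an assumption of this sketch.

```java
import java.util.List;
import java.util.Set;

/** Illustrative sketch of choosing a new partition leader. */
class LeaderElectionSketch {
    /** Returns the new leader's broker id, or -1 if the partition must stay offline. */
    static int electLeader(List<Integer> assignedReplicas, Set<Integer> isr,
                           Set<Integer> liveBrokers, boolean uncleanElectionEnabled) {
        // Clean election: the candidate must be alive and in the ISR.
        for (int replica : assignedReplicas) {
            if (liveBrokers.contains(replica) && isr.contains(replica)) {
                return replica;
            }
        }
        // Unclean election: any live replica, accepting possible data loss.
        if (uncleanElectionEnabled) {
            for (int replica : assignedReplicas) {
                if (liveBrokers.contains(replica)) {
                    return replica;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> replicas = java.util.Arrays.asList(1, 2, 3);
        Set<Integer> isr = new java.util.HashSet<>(java.util.Arrays.asList(2, 3));
        Set<Integer> live = new java.util.HashSet<>(java.util.Arrays.asList(2, 3));
        // Broker 1 died; broker 2 is the first live ISR member.
        System.out.println(electLeader(replicas, isr, live, false));
    }
}
```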

Producer Internals

Understanding producer internals helps you tune for throughput.

Producer Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Kafka Producer                            │
│                                                                  │
│  ┌──────────┐   ┌──────────────┐   ┌────────────────────────┐  │
│  │  Record  │──▶│  Serializer  │──▶│     Partitioner        │  │
│  │  (K,V)   │   │              │   │ (Which partition?)     │  │
│  └──────────┘   └──────────────┘   └───────────┬────────────┘  │
│                                                  │               │
│                                    ┌─────────────▼─────────────┐│
│                                    │    Record Accumulator     ││
│                                    │  ┌─────┐ ┌─────┐ ┌─────┐ ││
│                                    │  │Batch│ │Batch│ │Batch│ ││
│                                    │  │ P0  │ │ P1  │ │ P2  │ ││
│                                    │  └─────┘ └─────┘ └─────┘ ││
│                                    └───────────┬───────────────┘│
│                                                │                │
│                                    ┌───────────▼───────────────┐│
│                                    │      Sender Thread        ││
│                                    │   (Network I/O)           ││
│                                    └───────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘

Batching: The Performance Trick

Producers accumulate records into per-partition batches before sending. This is the single biggest performance lever in Kafka — sending 1000 records in one network request is dramatically faster than 1000 individual requests. The analogy: mailing letters one at a time versus filling a mailbag and sending it once.

Config	Purpose	Default	Tuning Guidance
`batch.size`	Max batch size in bytes	16KB	Increase to 64-128KB for high-throughput producers
`linger.ms`	Wait time for more records	0ms (5ms since Kafka 4.0)	5-20ms is a reasonable starting range for throughput-oriented workloads
`buffer.memory`	Total memory for buffering	32MB	Increase if you produce to many partitions simultaneously

linger.ms=0:   Record arrives ──▶ Send as soon as the sender is free (lowest latency, smallest batches)
linger.ms=10:  Record arrives ──▶ Wait up to 10ms ──▶ Send batch (higher throughput, up to 10ms added latency)

Larger batches = better throughput, higher latency. This is a fundamental trade-off; there is no free lunch. For most applications, 5-10ms of added latency is invisible to users, and the throughput gain is often substantial, though how large depends on record size, compression, and partition count.
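A back-of-the-envelope calculation shows why batch.size matters. The sketch below estimates how many batches are needed for a given number of equally sized records going to one partition; `BatchMath` is a hypothetical helper, and it deliberately ignores batch headers, compression, and linger timing.

```java
/** Illustrative arithmetic: how many batches for N equally sized records? */
class BatchMath {
    static long batchesNeeded(long recordCount, int recordBytes, int batchSizeBytes) {
        // At least one record per batch, even if a record exceeds batch.size.
        long recordsPerBatch = Math.max(1, batchSizeBytes / recordBytes);
        // Ceiling division.
        return (recordCount + recordsPerBatch - 1) / recordsPerBatch;
    }

    public static void main(String[] args) {
        // 1000 records of 200 bytes each, all for one partition.
        System.out.println("16KB batches: " + batchesNeeded(1000, 200, 16384));
        System.out.println("64KB batches: " + batchesNeeded(1000, 200, 65536));
        System.out.println("no batching:  " + batchesNeeded(1000, 200, 1));
    }
}
```

Under these assumptions, the default 16KB batch size needs 13 batches, 64KB needs 4, and no batching needs 1000 separate sends.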

Partitioning Strategy

Default partitioner:

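If key is null: Sticky partitioning (the default since Kafka 2.4): stay on one partition until the current batch is sent, then switch. Older clients used round-robin.
If key is present: murmur2(key) % numPartitions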

// Same key always goes to same partition
producer.send(new ProducerRecord<>("orders", "user-123", orderData));
// user-123 orders always in same partition = ordered processing
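The keyed case can be sketched in a few lines. Kafka's default partitioner hashes the serialized key with murmur2; this sketch substitutes `Arrays.hashCode` as a stand-in so it stays self-contained, which means the partition numbers it produces will not match a real cluster. `ToyKeyPartitioner` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative keyed partitioner. The hash is a stand-in, NOT Kafka's murmur2. */
class ToyKeyPartitioner {
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);
        // Clear the sign bit so the modulo result is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // The same key maps to the same partition every time.
        System.out.println(partitionFor("user-123", 6) == partitionFor("user-123", 6));
    }
}
```

One consequence worth noting: because the result depends on `numPartitions`, adding partitions to a topic changes where existing keys land.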

Idempotent Producer

Enable with enable.idempotence=true (the default since Kafka 3.0):

Without idempotence:
  Producer sends ──▶ Network error ──▶ Retry ──▶ Duplicate message!

With idempotence:
  Producer sends (PID=1, Seq=5) ──▶ Network error ──▶ Retry (PID=1, Seq=5)
                                                        │
                                                        ▼
                                                   Broker: "Already have Seq=5"
                                                   Deduplicate!

Each producer gets a Producer ID (PID) and sequence numbers per partition.
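On the broker side, deduplication amounts to remembering the last sequence number seen per producer and partition. The sketch below shows only that duplicate check; `ToySequenceChecker` is a hypothetical class, and a real broker additionally rejects gaps in the sequence and tracks producer epochs.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative broker-side duplicate check keyed by (producer id, partition). */
class ToySequenceChecker {
    private final Map<String, Integer> lastSequence = new HashMap<>();

    /** Returns true if the batch is new and appended, false if it is a duplicate retry. */
    boolean tryAppend(long producerId, int partition, int sequence) {
        String key = producerId + "-" + partition;
        Integer last = lastSequence.get(key);
        if (last != null && sequence <= last) {
            return false; // already have this sequence: acknowledge without appending
        }
        lastSequence.put(key, sequence);
        return true;
    }

    public static void main(String[] args) {
        ToySequenceChecker broker = new ToySequenceChecker();
        System.out.println(broker.tryAppend(1L, 0, 5)); // first attempt is appended
        System.out.println(broker.tryAppend(1L, 0, 5)); // retry is deduplicated
    }
}
```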

Consumer Internals

Consumer groups are Kafka’s killer feature for parallel processing.

Consumer Group Coordination

Consumer Group: order-processors

┌─────────────────────────────────────────────────────────────────┐
│                    Group Coordinator                             │
│                    (Broker 2)                                    │
│                                                                  │
│   Manages:                                                       │
│   - Group membership                                             │
│   - Partition assignment                                         │
│   - Offset commits                                               │
└─────────────────────────────────────────────────────────────────┘
                           │
           ┌───────────────┼───────────────┐
           ▼               ▼               ▼
      Consumer 1      Consumer 2      Consumer 3
      [P0, P1]        [P2, P3]        [P4, P5]

The Rebalance Protocol

When group membership changes (consumer joins/leaves), partitions are reassigned:

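Consumer sends JoinGroup request
Coordinator waits for the remaining members to rejoin (bounded by the rebalance timeout, which the Java consumer takes from max.poll.interval.ms)
Coordinator selects a leader consumer
Leader runs partition assignment algorithm and returns the result in its SyncGroup request
Coordinator sends assignments to all consumers in their SyncGroup responses
Consumers start fetching from assigned partitions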

Partition Assignment Strategies

Strategy	Behavior
RangeAssignor	Assign contiguous partitions per topic
RoundRobinAssignor	Distribute partitions evenly across consumers
StickyAssignor	Minimize reassignments during rebalance
CooperativeStickyAssignor	Incremental rebalance (no stop-the-world)
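The difference between the first two strategies is easiest to see on a single topic. The sketch below is a simplified model for one topic only; `ToyAssignors` and its methods are hypothetical, and the real assignors also handle multiple topics and differing subscriptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative single-topic versions of range and round-robin assignment. */
class ToyAssignors {
    /** Range: each consumer gets a contiguous block; early consumers absorb the remainder. */
    static Map<String, List<Integer>> range(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        int base = numPartitions / consumers.size();
        int extra = numPartitions % consumers.size();
        int next = 0;
        for (int i = 0; i < consumers.size(); i++) {
            int count = base + (i < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                partitions.add(next++);
            }
            assignment.put(consumers.get(i), partitions);
        }
        return assignment;
    }

    /** Round-robin: deal partitions out one at a time, like cards. */
    static Map<String, List<Integer>> roundRobin(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<Integer>());
        }
        for (int p = 0; p < numPartitions; p++) {
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<String> consumers = java.util.Arrays.asList("c1", "c2", "c3");
        System.out.println("range:       " + range(consumers, 6));
        System.out.println("round-robin: " + roundRobin(consumers, 6));
    }
}
```

With three consumers and six partitions, the range result matches the [P0, P1], [P2, P3], [P4, P5] layout shown in the coordination diagram earlier.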

Offset Management

Consumers track progress via offsets:

__consumer_offsets topic (internal):
┌─────────────────┬──────────────────┬─────────────────┐
│  Group          │  Topic-Partition │  Committed      │
├─────────────────┼──────────────────┼─────────────────┤
│  order-procs    │  orders-0        │  1523           │
│  order-procs    │  orders-1        │  1891           │
│  order-procs    │  orders-2        │  1456           │
└─────────────────┴──────────────────┴─────────────────┘

Commit strategies:

Strategy	Code	Risk
Auto commit	`enable.auto.commit=true`	At-least-once, possible duplicates
Sync commit	`consumer.commitSync()`	At-least-once, blocks thread
Async commit	`consumer.commitAsync()`	At-least-once, no blocking, failed commits are not retried
Manual offset	Store offsets atomically with results, or use transactions	Exactly-once possible

ZooKeeper vs KRaft

Kafka has moved from ZooKeeper to KRaft (Kafka Raft). KRaft has been production-ready since Kafka 3.3, and Kafka 4.0 removed ZooKeeper mode entirely.

ZooKeeper Mode (Legacy)

┌─────────────────────────────────────────────────────────────────┐
│                    ZooKeeper Ensemble                            │
│     (Stores: broker list, topic config, controller election)    │
└────────────────────────────────────┬────────────────────────────┘
                                     │
     ┌───────────────────────────────┼───────────────────────────┐
     ▼                               ▼                           ▼
 Broker 1                        Broker 2                    Broker 3
 (Controller)

ZooKeeper stored:

Broker registration and liveness
Topic and partition metadata
Controller election
ACLs and quotas

KRaft Mode (New Standard)

┌─────────────────────────────────────────────────────────────────┐
│                    Kafka Controllers                             │
│           (Raft consensus, metadata stored in Kafka)            │
│                                                                  │
│     Controller 1 ◀──▶ Controller 2 ◀──▶ Controller 3            │
└────────────────────────────────────┬────────────────────────────┘
                                     │
     ┌───────────────────────────────┼───────────────────────────┐
     ▼                               ▼                           ▼
 Broker 1                        Broker 2                    Broker 3

KRaft advantages:

Simpler operations: One system instead of two
Lower latency: No ZooKeeper round-trips
Better scalability: Millions of partitions possible
Faster recovery: Metadata in Kafka itself

Interview Deep Dive Questions

How does Kafka achieve high throughput?

Answer: 1) Sequential disk I/O (append-only log), 2) Batching (produce/consume in batches), 3) Zero-copy transfer (sendfile syscall), 4) Page cache utilization (OS caches data), 5) Compression (reduce network/disk I/O), 6) Partitioning (parallel processing across brokers).

What is ISR and why does it matter?

Answer: ISR (In-Sync Replicas) is the set of replicas caught up with the leader. When acks=all, the producer waits for all ISR replicas to acknowledge. If a replica falls behind (lag > replica.lag.time.max.ms), it is removed from the ISR. min.insync.replicas sets the minimum ISR size for acks=all writes to succeed. Together these settings balance durability and availability.

How does consumer rebalancing work?

Answer: 1) Consumer sends heartbeats to group coordinator, 2) On membership change, coordinator triggers rebalance, 3) All consumers stop consuming (stop-the-world), 4) Consumers send JoinGroup request, 5) Coordinator elects leader who runs assignment strategy, 6) Coordinator sends assignments, 7) Consumers start consuming assigned partitions. CooperativeStickyAssignor enables incremental rebalance.

Explain exactly-once semantics in Kafka

Answer: Requires: 1) Idempotent producer (enable.idempotence=true) - prevents duplicates via PID+sequence, 2) Transactional producer (transactional.id) - atomic writes across partitions, 3) Consumer with isolation.level=read_committed - only sees committed transactions. Used in Kafka Streams for exactly-once stream processing.

What happens when a broker fails?

Answer: 1) Controller detects failure (missed heartbeats or ZK session loss), 2) For each partition led by the failed broker, controller elects a new leader from the ISR, 3) Controller updates metadata and broadcasts to all brokers, 4) Clients refresh metadata and connect to new leaders, 5) Followers catch up from the new leader. If the ISR shrinks below min.insync.replicas, the partition rejects acks=all writes until enough replicas catch up.

How does Kafka differ from traditional message queues?

Answer: 1) Log-based vs queue-based (messages not deleted on consumption), 2) Pull model vs push model (consumers control pace), 3) Replayability (seek to any offset), 4) Consumer groups (partitions enable parallel consumption), 5) Retention-based (time/size) vs delivery-based deletion, 6) Higher throughput (millions vs thousands per second), 7) Ordering per partition only.

Tuning Kafka: Key Configurations

Producer Performance

# Batching -- the primary throughput lever
batch.size=65536          # 64KB batches (4x default). Larger batches = fewer network round trips.
linger.ms=10              # Wait up to 10ms to fill the batch. Adds 10ms latency but dramatically
                          # improves throughput by allowing more records to batch together.

# Compression -- reduces both network and disk I/O at the cost of CPU
compression.type=lz4       # lz4 = fastest compression/decompression. Use zstd for better
                           # compression ratio when CPU is cheap and bandwidth is expensive.

# Throughput -- memory allocation for the producer's internal buffer
buffer.memory=67108864     # 64MB buffer across all partitions. If this fills, send() blocks.
max.block.ms=60000         # How long send() blocks when buffer is full before throwing.

Consumer Performance

# Fetch tuning -- controls how much data the consumer pulls per request
fetch.min.bytes=1048576    # Wait for 1MB of data before responding. Higher = better throughput,
                           # but adds latency on low-volume topics (waits for data to accumulate).
fetch.max.wait.ms=500      # Maximum wait time. Fetch completes when EITHER min bytes OR max wait is hit.
max.poll.records=500       # Max records returned per poll() call. Set this based on how long your
                           # processing takes -- if processing 500 records takes > max.poll.interval.ms,
                           # the consumer gets kicked out of the group.

# Per-partition fetch size
max.partition.fetch.bytes=1048576  # 1MB per partition. Increase for large messages.

Broker Performance

# Networking -- thread pools for handling client connections
num.network.threads=8       # Threads that read requests from the socket. Scale with number of clients.
num.io.threads=16           # Threads that process requests (disk I/O, replication). Scale with disk count.

# Log segment configuration
log.segment.bytes=1073741824   # 1GB segments. Smaller = more files, more frequent cleanup.
log.retention.hours=168        # 7 days retention. Balance between storage cost and replay capability.

# Replication -- controls how followers catch up with leaders
num.replica.fetchers=4          # Threads per broker for fetching from leaders. Increase if follower
                                # lag is high and the bottleneck is fetch throughput, not disk I/O.
replica.fetch.max.bytes=1048576 # Max bytes per fetch from leader. Increase for large messages.

Key Takeaways

Kafka is an append-only commit log - simple abstraction, powerful results
Partitions split into segments - enables efficient retention and compaction
Indexes enable fast seeks - sparse offset and timestamp indexes
ISR determines durability - acks=all waits for ISR acknowledgment
Producers batch for throughput - linger.ms and batch.size control this
Consumer groups enable parallelism - partitions assigned to consumers
Rebalancing is expensive - use CooperativeStickyAssignor to minimize impact
KRaft replaces ZooKeeper - simpler, faster, scales to millions of partitions

Interview Deep-Dive

Explain how Kafka achieves zero-copy data transfer and why this matters for throughput. Be specific about the system calls involved.

Strong Answer:

Traditional data transfer for a network application involves four copies: disk to kernel page cache, page cache to application buffer, application buffer to socket buffer, socket buffer to NIC. The read() and write() calls that drive this path also cost four transitions between user space and kernel space.
Kafka uses Java's FileChannel.transferTo(), which maps to the sendfile() system call on Linux and transfers data directly from the page cache to the network socket, bypassing the application entirely. This eliminates two of the four copies and two context switches. The data goes: disk to page cache (or already cached), then page cache directly to NIC via DMA. The Kafka broker process never touches the bytes. The fast path applies only when the broker can send the stored bytes unchanged; with TLS enabled, for example, the data must pass through user space to be encrypted.
This matters enormously for throughput. At LinkedIn’s scale (7+ trillion messages per day), the savings are massive. Without zero-copy, each message would require the broker to allocate application memory, copy bytes in, copy bytes out to the socket, and garbage-collect the memory. With zero-copy, the broker is essentially a routing layer that points the kernel at the right file offset and says “send this to that socket.”
This is also why Kafka performs well with the OS page cache. Kafka is designed to let the OS manage caching rather than implementing its own cache in the JVM heap. Messages written by producers go to the page cache, and messages read by consumers are served from the same page cache — often without touching disk at all for recent data. This design avoids GC pauses that would occur with a JVM-managed cache.

Follow-up: If Kafka relies on the OS page cache, what happens when the machine runs low on memory?

The OS evicts page cache pages, using an approximately LRU policy, to make room for application memory. This means older Kafka data (the oldest segments of the log) falls out of cache and subsequent reads hit disk. For consumers reading recent data (the common case), the impact is minimal because recent writes are still in cache. For consumers replaying old data (e.g., a new consumer group starting from the beginning), reads become disk-bound. This is why Kafka brokers should have enough RAM to cache the “hot” portion of data (the most recent segment or two per partition). One rule of thumb is to size broker RAM at 25-50% of the active data set; treat it as a starting point rather than a guarantee.

A producer is sending messages with acks=all, but the ops team reports data loss after a broker failure. How is this possible? Walk me through the investigation.

Strong Answer:

The first thing I check is min.insync.replicas. acks=all waits for every replica currently in the ISR, not for every assigned replica. If min.insync.replicas is 1 (the default), the ISR is allowed to shrink to just the leader, and at that point a single broker acknowledging counts as “all ISR” even though the replication factor is 3. If that leader then crashes before any follower catches up, the data is lost.
The fix is the golden configuration: replication.factor=3, min.insync.replicas=2, acks=all. This ensures at least 2 replicas have every acknowledged write, so a single broker failure cannot lose acknowledged data.
Second thing I check: was unclean.leader.election.enable set to true? If no ISR replica was available (all of them failed) and an out-of-sync replica was elected leader, that replica may be missing messages. The new leader’s log becomes the source of truth, and any messages it did not have are permanently lost.
Third: was the producer correctly handling errors? If a send fails or times out and the application neither retries nor checks the callback or returned Future, the message may never have reached the broker, and nobody notices. Without the idempotent producer (enable.idempotence=true), retries can cause duplicates, so some teams disable retries, which turns transient errors into data loss.
The complete production configuration I recommend: acks=all, min.insync.replicas=2, replication.factor=3, unclean.leader.election.enable=false, enable.idempotence=true, retries=2147483647 (Integer.MAX_VALUE). With this combination, an acknowledged write survives the failure of any single broker.

Follow-up: With min.insync.replicas=2 and only 2 replicas alive out of 3, what happens if one more broker fails?

The partition becomes unavailable for acks=all writes. The producer receives NotEnoughReplicasException because the ISR has only 1 member, which is below min.insync.replicas=2. Reads may still work (the remaining leader can serve consumers). This is the availability trade-off of strong durability: you sacrifice write availability to prevent data loss. The partition stays unavailable for writes until the second replica recovers and rejoins the ISR. This is by design: the alternative (allowing writes with insufficient replicas) risks data loss, which is worse.

Explain the consumer rebalancing protocol in detail. Why is it considered one of Kafka's biggest operational pain points, and how does the CooperativeStickyAssignor help?

Strong Answer:

The Eager rebalancing protocol (default in older Kafka versions) is a “stop-the-world” event. When any consumer joins or leaves the group: (1) all consumers stop processing and revoke all their partitions, (2) every consumer sends a JoinGroup request to the group coordinator, (3) the coordinator elects a consumer leader who runs the partition assignment algorithm, (4) the coordinator distributes the new assignments, (5) consumers start consuming from their newly assigned partitions.
This is painful because during steps 1-5, no consumer in the group is processing any messages. For a consumer group processing time-sensitive data (fraud detection, real-time dashboards), even a 30-second processing gap is unacceptable. Worse, rebalances are triggered by common events: a consumer process restarting, a deployment rolling update, or a consumer taking too long between poll() calls (exceeding max.poll.interval.ms).
The CooperativeStickyAssignor (introduced in Kafka 2.4) eliminates the stop-the-world pause. It uses an incremental protocol: only the partitions that need to move are revoked from their current owners. All other partitions continue being processed without interruption. In a rolling deployment of 10 consumer instances, each restart triggers small incremental rebalances instead of stop-the-world events, and at each step only the partitions owned by the restarting instance are temporarily unassigned.
The StickyAssignor (without Cooperative) still uses eager revocation but minimizes partition movement — it tries to keep the same assignment as before. The CooperativeStickyAssignor combines both benefits: minimal movement and no stop-the-world.
In production, I always configure CooperativeStickyAssignor and tune the timeout settings: session.timeout.ms=45000, heartbeat.interval.ms=15000, and ensure max.poll.interval.ms exceeds the maximum processing time for a batch. I also use static group membership (group.instance.id) for consumers with stable identities (like Kubernetes pods with stable hostnames), which avoids a rebalance on temporary disconnections as long as the instance returns within session.timeout.ms.

Follow-up: During a rolling deployment, each consumer instance restarts one at a time. With the default Eager protocol, how many rebalances occur and what is the total processing gap?

For N consumer instances, a rolling deployment triggers 2N rebalances: N leave events and N join events. Each rebalance is a stop-the-world pause of 10-30 seconds (depending on group size and timeout settings). For 10 instances, that is 20 rebalances with a cumulative processing gap of 200-600 seconds, during which no messages are consumed. With CooperativeStickyAssignor, each leave and join still triggers a rebalance, but it is incremental and only moves the partitions of the restarting instance. Adding static group membership goes further: a static member does not leave the group on shutdown, so an instance that comes back with the same group.instance.id within session.timeout.ms gets its old partitions back without any rebalance. Total processing gap: near zero for unaffected partitions, and roughly the restart time for the partitions owned by the restarting instance.
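The eager-protocol estimate above is plain arithmetic, sketched here so the numbers are easy to re-run for a different group size. `EagerRebalanceMath` is a hypothetical helper, and the 10-30 second pause per rebalance is the assumption stated in the answer, not a measured value.

```java
/** Illustrative arithmetic for a rolling restart under the eager protocol. */
class EagerRebalanceMath {
    /** Each instance causes one rebalance when it leaves and one when it rejoins. */
    static int rebalances(int instances) {
        return 2 * instances;
    }

    /** Cumulative stop-the-world time, given an assumed pause per rebalance. */
    static int cumulativeGapSeconds(int instances, int pauseSecondsPerRebalance) {
        return rebalances(instances) * pauseSecondsPerRebalance;
    }

    public static void main(String[] args) {
        System.out.println(rebalances(10) + " rebalances, "
                + cumulativeGapSeconds(10, 10) + "-" + cumulativeGapSeconds(10, 30) + " seconds paused");
    }
}
```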

Ready to build streaming applications? Next up: Kafka Streams where we will transform and aggregate event streams in real-time.
