When it Absolutely, Positively, Has to be There: Reliability Guarantees in Kafka, Gwen Shapira, Jeff Holoman

When it absolutely, positively,
has to be there
Reliability Guarantees in Apache Kafka
@jeffholoman @gwenshap

Kafka
• High Throughput
• Low Latency
• Scalable
• Centralized
• Real-time

“If data is the lifeblood of high technology, Apache
Kafka is the circulatory system”
--Todd Palino
Kafka SRE @ LinkedIn

If Kafka is a critical piece of our pipeline
 Can we be 100% sure that our data will get there?
 Can we lose messages?
 How do we verify?
 Who’s fault is it?

Distributed Systems
 Things Fail
 Systems are designed to
tolerate failure
 We must expect failures
and design our code and
configure our systems to
handle them

Network
Broker MachineClient Machine
Data Flow
Kafka Client
Broker
O/S Socket Buffer
NIC
NIC
Page Cache
Disk
Application Thread
O/S Socket Buffer
async
callback
✗
✗
✗
✗
✗
✗
data
ack /
exception

Client Machine
Kafka Client
O/S Socket Buffer
NIC
Application Thread
✗
✗
Broker Machine
Broker
NIC
Page Cache
Disk
O/S Socket Buffer
miss
✗
✗
✗
Network
Data Flow
✗
data
offsets
ZK
Kafka✗

Replication is your friend
 Kafka protects against failures by replicating data
 The unit of replication is the partition
 One replica is designated as the Leader
 Follower replicas fetch data from the leader
 The leader holds the list of “in-sync” replicas

Replication and ISRs
0
1
2
0
1
2
0
1
2
Producer
Broker
100
Broker
101
Broker
102
Topic:
Partitions:
Replicas:
my_topic
3
3
Partition:
Leader:
ISR:
1
101
100,102
Partition:
Leader:
ISR:
2
102
101,100
Partition:
Leader:
ISR:
0
100
101,102

ISR
• 2 things make a replica in-sync
– Lag behind leader
• replica.lag.time.max.ms – replica that didn’t fetch or is behind
• replica.lag.max.messages – will go away has gone away in 0.9
– Connection to Zookeeper

Terminology
• Acked
– Producers will not retry sending.
– Depends on producer setting
• Committed
– Consumers can read.
– Only when message got to all
ISR.
• replica.lag.time.max.ms
– how long can a dead replica
prevent consumers from
reading?

Replication
• Acks = all
– only waits for in-sync replicas to reply.
Replica 3
100
Replica 2
100
Replica 1
100
Time

• Replica 3 stopped replicating for some reason
Replication
Replica 3
100
Replica 2
100
101
Replica 1
100
101
Time
Acked in acks = all
“committed”
Acked in acks = 1
but not
“committed”

Replication
Replica 3
100
Replica 2
100
101
Replica 1
100
101
Time
• One replica drops out of ISR, or goes offline
• All messages are now acked and committed

• 2nd Replica drops out, or is offline
Replication
Replica 3
100
Replica 2
100
101
Replica 1
100
101
102
103
104Time

Replication
Replica 3
100
Replica 2
100
101
Replica 1
100
101
102
103
104Time
• Now we’re in trouble

Replication
• If Replica 2 or 3 come back online before the leader, you can will lose
data.
Replica 3
100
Replica 2
100
101
Replica 1
100
101
102
103
104Time
All those are
“acked” and
“committed”

So what to do
• Disable Unclean Leader Election
– unclean.leader.election.enable = false
• Set replication factor
– default.replication.factor = 3
• Set minimum ISRs
– min.insync.replicas = 2

Warning
• min.insync.replicas is applied at the topic-level.
• Must alter the topic configuration manually if created
before the server level change
• Must manually alter the topic < 0.9.0 (KAFKA-2114)

Replication
• Replication = 3
• Min ISR = 2
Replica 3
100
Replica 2
100
Replica 1
100
Time

Replication
Replica 3
100
Replica 2
100
101
Replica 1
100
101
Time
• One replica drops out of ISR, or goes offline

Replication
Replica 3
100
Replica 2
100
101
Replica 1
100
101102
103
104
Time
• 2nd Replica fails out, or is out of sync
Buffers
in
Produce
r

Producer Internals
• Producer sends batches of messages to a buffer
M3
Application
Thread
Application
Thread
Application
Thread
send()
M2 M1 M0
Batch 3
Batch 2
Batch 1
Fail
? response
retry
Update Future
callback
drain
Metadata or
Exception

Basics
• Durability can be configured with the producer
configuration request.required.acks
– 0 The message is written to the network (buffer)
– 1 The message is written to the leader
– all The producer gets an ack after all ISRs receive the data; the
message is committed
• Make sure producer doesn’t just throws messages away!
– block.on.buffer.full = true

“New” Producer
• All calls are non-blocking async
• 2 Options for checking for failures:
– Immediately block for response: send().get()
– Do followup work in Callback, close producer after error threshold
• Be careful about buffering these failures. Future work? KAFKA-1955
• Don’t forget to close the producer! producer.close() will block until in-flight txns
complete
• retries (producer config) defaults to 0
• message.send.max.retries (server config) defaults to 3
• In flight requests could lead to message re-ordering

Consumer
• Three choices for Consumer API
– Simple Consumer
– High Level Consumer (ZookeeperConsumer)
– New KafkaConsumer

New Consumer – attempt #1
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "10000");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
while (true) {
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
processAndUpdateDB(record);
}
}
What if we crash
after 8 seconds?
Commit automatically every
10 seconds

props.put("enable.auto.commit", "false");
while (true) {
consumer.commitSync();
What are you really
committing?

while (true) {
TopicPartition tp = new TopicPartition(record.topic(), record.partition());
OffsetAndMetadata oam = new OffsetAndMetadata(record.offset() +1);
consumer.commitSync(Collections.singletonMap(tp,oam));
Is this fast enough?

int counter = 0;
while (true) {
counter ++;
if (counter % 100 == 0) {
TopicPartition tp = new TopicPartition(record.topic(), record.partition());
OffsetAndMetadata oam = new OffsetAndMetadata(record.offset() + 1);
consumer.commitSync(Collections.singletonMap(tp, oam));

Consumer Offsets
P0 P2 P3 P4 P5 P6
Commit

Consumer Offsets
P0 P2 P3 P4 P5 P6
Consumer
Thread 1 Thread 2 Thread 3 Thread 4
Duplicates

Rebalance Listener
public class MyRebalanceListener implements ConsumerRebalanceListener {
@Override
public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
}
@Override
public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
commitOffsets();
}
}
consumer.subscribe(Arrays.asList("foo", "bar"), new MyRebalanceListener());
Careful! This method will need
to know the topic, partition and
offset of last record you got

At Least Once Consuming
1. Commit your own offsets - Set autocommit.enable =
false
2. Use Rebalance Listener to limit duplicates
3. Make sure you commit only what you are done processing
4. Note: New consumer is single threaded – one consumer
per thread.

Exactly Once Semantics
• At most once is easy
• At least once is not bad either – commit after 100% sure
data is safe
• Exactly once is tricky
– Commit data and offsets in one transaction
– Idempotent producer

Using External Store
• Don’t use commitSync()
• Implement your own “commit” that saves both data and
offsets to external store.
• Use the RebalanceListener to find the correct offset

Seeking right offset
public class SaveOffsetsOnRebalance implements ConsumerRebalanceListener {
private Consumer<?,?> consumer;
public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
// save the offsets in an external store using some custom code not described here
for (TopicPartition partition : partitions)
saveOffsetInExternalStore(consumer.position(partition));
}
public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
// read the offsets from an external store using some custom code not described here
for (TopicPartition partition : partitions)
consumer.seek(partition, readOffsetFromExternalStore(partition));
}
}

Monitoring for Data Loss
• Monitor for producer errors – watch the retry numbers
• Monitor consumer lag – MaxLag or via offsets
• Standard schema:
– Each message should contain timestamp and originating service and
host
• Each producer can report message counts and offsets to a
special topic
• “Monitoring consumer” reports message counts to another
special topic
• “Important consumers” also report message counts
• Reconcile the results

Be Safe, Not Sorry
• Acks = all
• Block.on.buffer.full = true
• Retries = MAX_INT
• ( Max.inflight.requests.per.connect = 1 )
• Producer.close()
• Replication-factor >= 3
• Min.insync.replicas = 2
• Unclean.leader.election = false
• Auto.offset.commit = false
• Commit after processing
• Monitor!

When it Absolutely, Positively, Has to be There: Reliability Guarantees in Kafka, Gwen Shapira, Jeff Holoman

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

En vedette

En vedette (20)

Similaire à When it Absolutely, Positively, Has to be There: Reliability Guarantees in Kafka, Gwen Shapira, Jeff Holoman

Similaire à When it Absolutely, Positively, Has to be There: Reliability Guarantees in Kafka, Gwen Shapira, Jeff Holoman (20)

Plus de confluent

Plus de confluent (20)

Dernier

Dernier (20)

When it Absolutely, Positively, Has to be There: Reliability Guarantees in Kafka, Gwen Shapira, Jeff Holoman

Notes de l'éditeur