In this talk I go over the basic theory of messaging in distributed systems, the different message delivery guarantees in Kafka, and when to use them.
I focus on the exactly-once delivery guarantee and the way Kafka implements it with a transaction-based messaging protocol.
This includes a discussion of the latency/throughput trade-offs, resource utilisation, and the overall advantages and shortcomings of the approach.
Finally, I show a use case at Wix where exactly-once delivery helped us solve a big problem.
Exactly Once Delivery with Kafka - Kafka Tel-Aviv Meetup
1. natans@wix.com twitter @NSilnitsky linkedin/natansilnitsky github.com/natansil
Exactly Once Delivery
is a harsh mistress
Natan Silnitsky
Backend Infra Developer, Wix.com
6. Classic ecommerce flow - hard to achieve
PurchaseCompleted
UpdateInventory(ItemN)
updates
UpdateInventory(Item1)
UpdateInventory(Item2)
😳 But... how do we update all items exactly once, even on failures/restarts?
Payments Checkout Inventory
14. The options for
message delivery
Why Exactly-Once
is difficult
Kafka’s solution for
Exactly-Once
delivery
15. Kafka Broker
Kafka
Producer
0 1 2 3 4 5
😕 Services need to address message duplicates
at-most-once at-least-once exactly-once
Producer retries on
every failure
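The duplicate problem on the producer side can be sketched with a toy in-memory broker. This is only an illustration: `Broker` and `send_with_retries` are hypothetical names, not Kafka client APIs, and the "lost ack" is simulated.

```python
class Broker:
    """In-memory stand-in for one partition log (hypothetical, not a Kafka API)."""

    def __init__(self, drop_ack_for=()):
        self.log = []
        self.drop_ack_for = set(drop_ack_for)  # values whose first ack gets lost

    def append(self, value):
        self.log.append(value)            # the write itself always succeeds...
        if value in self.drop_ack_for:
            self.drop_ack_for.discard(value)
            return False                  # ...but the acknowledgement is lost once
        return True


def send_with_retries(broker, value, max_attempts=3):
    """Retry until acknowledged: at-least-once from the producer's side."""
    for _ in range(max_attempts):
        if broker.append(value):
            return
    raise RuntimeError("gave up sending " + repr(value))


broker = Broker(drop_ack_for={"B"})
for v in ["A", "B", "C"]:
    send_with_retries(broker, v)

print(broker.log)  # ['A', 'B', 'B', 'C']
```

The retry cannot tell "write failed" from "ack lost", so "B" lands in the log twice and downstream services have to cope with it.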
16. Kafka Broker
0 1 2 3 4 5
😕 Services need to handle messages exactly once
consumerRecords = consumer.poll()
process(consumerRecords)
consumer.commit()
at-most-once at-least-once exactly-once
Consumer retries
on every failure
Kafka
Consumer
17. Kafka Broker
0 1 2 3 4 5
😕 Messages may be lost
Consumer commits
before processing
(AutoCommit - Default)
consumerRecords = consumer.poll()
consumer.commit()
process(consumerRecords)
at-most-once at-least-once exactly-once
Kafka
Consumer
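The difference between the two commit orderings can be seen in a small simulation. This is a sketch with made-up names (`run_consumer`, `crash_at`), not real consumer code: it models a single crash between the two steps and a restart from the last committed offset.

```python
def run_consumer(records, commit_first, crash_at):
    """Simulate one consumer that crashes once, then restarts from the committed offset.

    records      -- message values of a single partition
    commit_first -- True: commit before processing (at-most-once)
                    False: process before committing (at-least-once)
    crash_at     -- offset at which the process dies between the two steps
    """
    committed = 0      # next offset to read after a (re)start
    processed = []
    crashed = False
    while committed < len(records):
        offset = committed
        if commit_first:
            committed = offset + 1
            if offset == crash_at and not crashed:
                crashed = True
                continue           # died before processing: the message is lost
            processed.append(records[offset])
        else:
            processed.append(records[offset])
            if offset == crash_at and not crashed:
                crashed = True
                continue           # died before committing: the message is re-read
            committed = offset + 1
    return processed


records = ["A", "B", "C"]
print(run_consumer(records, commit_first=True, crash_at=1))   # ['A', 'C']
print(run_consumer(records, commit_first=False, crash_at=1))  # ['A', 'B', 'B', 'C']
```

Committing first loses "B"; processing first handles "B" twice. Neither ordering alone gives exactly-once.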
18. Kafka Broker
0 1 2 3 4 5
Kafka
Consumer
😕 Really really really hard to do
Messages are read and processed
exactly once by the consumer
?
at-least-once exactly-once at-most-once
19. The options for
message delivery
Why Exactly-Once
is difficult
Kafka’s solution for
Exactly-Once
delivery
32. Kafka Broker
Topic A Topic B
Processor Observer
Consumer-Producer
Consumer
@NSilnitsky
33. Kafka Broker
consumer.poll
Attaches offset to message +
marks transaction
ABCDEF
Processor Observer
AB
01
Topic A Topic B
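A toy model of this consume-process-produce flow, assuming a single partition and an in-memory log. `TopicLog` and `process_batch` are hypothetical helpers written for illustration; the real protocol involves a transaction coordinator and more. The sketch only shows why a `read_committed` reader does not see output from a failed attempt.

```python
class TopicLog:
    """Toy single-partition log with transaction outcomes (illustrative only)."""

    def __init__(self):
        self.entries = []       # (txn_id, value) in append order
        self.committed = set()
        self.aborted = set()

    def append(self, txn_id, value):
        self.entries.append((txn_id, value))

    def commit(self, txn_id):
        self.committed.add(txn_id)

    def abort(self, txn_id):
        self.aborted.add(txn_id)

    def read(self, isolation_level):
        if isolation_level == "read_uncommitted":
            return [value for _, value in self.entries]
        visible = []
        for txn_id, value in self.entries:
            if txn_id in self.committed:
                visible.append(value)
            elif txn_id in self.aborted:
                continue        # aborted data is filtered out
            else:
                break           # open transaction: nothing past it is visible yet
        return visible


def process_batch(source, start_offset, target, txn_id, fail=False):
    """Write transformed records and advance the source offset as one unit."""
    batch = source[start_offset:]
    for value in batch:
        target.append(txn_id, value.lower())
    if fail:
        target.abort(txn_id)
        return start_offset                 # offset not advanced: batch is retried
    target.commit(txn_id)
    return start_offset + len(batch)        # offset advances only with the commit


topic_a = ["A", "B"]
topic_b = TopicLog()
offset = process_batch(topic_a, 0, topic_b, txn_id=1, fail=True)  # first attempt fails
offset = process_batch(topic_a, offset, topic_b, txn_id=2)        # retry succeeds

print(topic_b.read("read_uncommitted"))  # ['a', 'b', 'a', 'b']
print(topic_b.read("read_committed"))    # ['a', 'b']
```

The duplicates are still physically in the log; the isolation level decides whether the observer sees them.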
isolation.level =
"read_committed"
39. Kafka Broker
Throughput is between 2% and 25% worse with transactions enabled.
The bigger the batch, the better the throughput and the longer the latency.
Processor Observer
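One way to reason about the batch-size trade-off is a back-of-the-envelope model in which every transaction pays a fixed commit overhead. The function name and the default numbers below are invented for illustration and are not measurements from the talk.

```python
def batch_model(batch_size, per_message_ms=1.0, commit_overhead_ms=20.0):
    """Return (messages per second, milliseconds until the batch is committed).

    per_message_ms and commit_overhead_ms are hypothetical values, chosen only
    to show the shape of the trade-off.
    """
    batch_ms = batch_size * per_message_ms + commit_overhead_ms
    throughput = batch_size / batch_ms * 1000.0
    return throughput, batch_ms


for size in (10, 100, 1000):
    throughput, latency_ms = batch_model(size)
    print(f"batch={size:5d}  throughput={throughput:7.1f} msg/s  latency={latency_ms:7.1f} ms")
```

Under this model the fixed overhead is amortised over more messages as the batch grows, so throughput rises, while a message at the start of the batch waits longer before its transaction commits.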
41. Kafka Broker
Deduplicate using offsets from Kafka Transaction
ABC
012
Topic B
Observer
DB Table
Partition Offset Value
0 2 C
... ... ...
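A minimal sketch of offset-based deduplication on the observer side, using an in-memory SQLite table in place of the real database. The table name, column names and `apply_once` are assumptions made for this example; the idea is to store the last applied offset per partition next to the value and to skip any record whose offset is not newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observer_state ("
    " partition_id INTEGER PRIMARY KEY,"
    " last_offset INTEGER NOT NULL,"
    " value TEXT NOT NULL)"
)


def apply_once(conn, partition_id, offset, value):
    """Apply a record unless this offset (or a later one) was already applied."""
    row = conn.execute(
        "SELECT last_offset FROM observer_state WHERE partition_id = ?",
        (partition_id,),
    ).fetchone()
    if row is not None and offset <= row[0]:
        return False                 # duplicate delivery: skip it
    with conn:                       # value and offset are written in one DB transaction
        conn.execute(
            "INSERT OR REPLACE INTO observer_state (partition_id, last_offset, value)"
            " VALUES (?, ?, ?)",
            (partition_id, offset, value),
        )
    return True


deliveries = [(0, "A"), (1, "B"), (1, "B"), (2, "C")]  # offset 1 is delivered twice
results = [apply_once(conn, 0, offset, value) for offset, value in deliveries]

print(results)                                                  # [True, True, False, True]
print(conn.execute("SELECT * FROM observer_state").fetchall())  # [(0, 2, 'C')]
```

Writing the offset in the same database transaction as the value is what makes the check safe across restarts.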
42. In reality, this is more complex
than how I describe it.
Out of scope:
▪ Two-phase-commit with transaction coordinator
▪ Transaction log
▪ Additional “fencing” data
▪ 1 processor-producer - 1 partition (up to Kafka 2.4.x)
43. Exactly Once in Kafka Streams
StreamsConfig:
processing.guarantee = "exactly_once"
consumers will be configured with:
● isolation.level = "read_committed"
producers will be configured with:
● enable.idempotence = true
51. Resources
EoS in Kafka talk by Jason Gustafson
Exactly-once Semantics are Possible by Neha Narkhede (Performance)
Transactions in Apache Kafka
Revisiting Exactly One Semantics by Jason Gustafson, Confluent
Proposal to improve EOS Produce scalability - Kafka 2.5
How akka works with exactly once message delivery by Hugh McKee