This document discusses Kafka, a distributed streaming platform. It provides an overview of Kafka's history and key features, including high throughput, horizontal scaling, persistence, and automatic recovery from failures. The document also covers topics, partitions, producers, consumers, consumer groups, and replication strategies. Examples of companies using Kafka are given, such as LinkedIn, Netflix, Twitter, and Microsoft.
9. #DevoxxFR
Features
● Decoupling Systems
● High throughput
● Distributed - Horizontal scaling
● Multi consumers
● Persistence
● Automatic recovery from broker failure
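The "Multi consumers" and "Persistence" points can be illustrated with a toy model. The sketch below is not Kafka code: PartitionLog, append and poll are hypothetical names, and the log lives in memory rather than on disk. It only shows the idea that messages stay in the log after being read, and that each consumer group tracks its own read position.

```python
from collections import defaultdict


class PartitionLog:
    """Toy in-memory stand-in for one partition (hypothetical, not a Kafka API)."""

    def __init__(self):
        self._messages = []
        # Next offset to read, tracked separately for each consumer group.
        self._next_offset = defaultdict(int)

    def append(self, message):
        """Append a message and return the offset it was stored at."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, group, max_messages=10):
        """Return the next unread messages for `group` and advance its offset.

        Reading never removes anything from the log.
        """
        start = self._next_offset[group]
        batch = self._messages[start:start + max_messages]
        self._next_offset[group] = start + len(batch)
        return batch


log = PartitionLog()
for event in ["signup", "login", "purchase"]:
    log.append(event)

first_a = log.poll("group-a", max_messages=2)  # group A reads two messages
all_b = log.poll("group-b")                    # group B still sees everything
rest_a = log.poll("group-a")                   # group A resumes where it stopped
```

Because reads only move a per-group offset, a slow or late group does not affect what the other groups see.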
10. #DevoxxFR
RabbitMQ/ActiveMQ?
● Cost
● Persistence
● Batch system -> performance drops
● Large scale stream processing
● Ordering guarantees
17. #DevoxxFR
Consumer Group
Diagram: a producer appends writes to the end of three partitions (Partition #1 holds offsets 1-18, Partition #2 holds 1-15, Partition #3 holds 1-16; the next writes land at 19, 16 and 17). Three consumers in Group A and two consumers in Group B consume from the partitions.
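The consumer group layout can be sketched as a partition assignment: inside one group each partition is read by exactly one consumer, while every group independently covers all partitions. The code below is a simplified round-robin illustration, not Kafka's actual assignor; assign_partitions and the consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of ONE group, round-robin.

    Simplified illustration only; real Kafka assignment strategies differ in detail.
    """
    if not consumers:
        raise ValueError("a group needs at least one consumer")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [1, 2, 3]

# Group A has three consumers, Group B has two, as in the slide.
group_a = assign_partitions(partitions, ["a1", "a2", "a3"])
group_b = assign_partitions(partitions, ["b1", "b2"])

print(group_a)
print(group_b)
```

With three consumers, Group A gets one partition per consumer; with two consumers, one member of Group B has to take two partitions.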
18. #DevoxxFR
Fault tolerant consumption
Diagram: a producer writes to Partition #1, #2 and #3, which are consumed by three consumers in Group A and two consumers in Group B. Automatic rebalancing on failure: when a consumer fails, its partitions are reassigned within its group.
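Rebalancing on failure can be sketched as recomputing the assignment over the surviving members of a group. This is a toy model under stated assumptions: assign_partitions and rebalance_after_failure are hypothetical helpers using a simplified round-robin strategy, and failure detection (which a real cluster has to perform) is left out.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin (simplified)."""
    if not consumers:
        raise ValueError("a group needs at least one consumer")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        assignment[consumers[index % len(consumers)]].append(partition)
    return assignment


def rebalance_after_failure(partitions, consumers, failed):
    """Recompute the assignment without the failed consumer."""
    survivors = [consumer for consumer in consumers if consumer != failed]
    return assign_partitions(partitions, survivors)


partitions = [1, 2, 3]
group_a = ["a1", "a2", "a3"]

before = assign_partitions(partitions, group_a)
after = rebalance_after_failure(partitions, group_a, failed="a2")

print(before)
print(after)
```

After the rebalance every partition still has exactly one owner in the group, so consumption continues without the failed member.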
21. #DevoxxFR
Kafka 0.9 - New Consumer
● Unified consumer API
● Much simpler and thinner
● Allows for larger groups with far faster rebalancing
● Decouple Kafka clients from Zookeeper!!!
22. #DevoxxFR
Security
● Authentication : Kerberos / TLS certificate
● Authorization : Unix-like permissions system
● Encryption on the wire : SSL
● Encryption at rest : encrypting individual fields / filesystem security features
● User-defined quotas