Devoxx fr 2016 - Apache Kafka - Stream Data Platform

Kafka is a distributed publish-subscribe messaging system that persists the data it receives and is designed to scale easily and sustain very high data throughput.

Originally developed at LinkedIn and maintained within the Apache foundation since 2012, its adoption has kept growing, making it a near de facto standard in data processing pipelines.

Come and discover this tool during this 3-hour hands-on session, in which you will install a mini Kafka cluster and explore its various APIs. As a bonus, you will be able to analyze your data in real time with Spark Streaming.

  1. #DevoxxFR Hands-on - Kafka: http://kafka.apache.org/downloads.html - Hands-on: https://github.com/mblanc/hands_on_kafka.git
  2. Matthieu Blanc (@matthieublanc), Sylvain Lequeux (@slequeux)
  3. Messaging System?
  4. History: Jay Kreps, Neha Narkhede, Jun Rao
  5. [Diagram: WebApp, Relational DB, NoSQL DB, DWH, Hadoop, Monitoring, Logs]
  6. [Diagram: three WebApps, Relational DB, NoSQL DB, DWH, Hadoop, ActiveMQ, Logs, Monitoring, Search] Big Data?
  7. [Diagram: three WebApps, Relational DB, NoSQL DB, DWH, Hadoop, ActiveMQ, Logs, Monitoring, Search] BIG MESS!
  8. Stream Data Platform
  9. Features ● Decoupling systems ● High throughput ● Distributed, horizontal scaling ● Multiple consumers ● Persistence ● Automatic recovery from broker failure
  10. RabbitMQ/ActiveMQ? ● Cost ● Persistence ● Batch systems -> performance drops ● Large-scale stream processing ● Ordering guarantees
  11. Architecture [Diagram: three Producers, a Kafka Cluster of Brokers, Zookeeper, three Consumers]
  12. Distributed Commit Logs [Diagram: a log of records 1 to 19; labels: 1st record, Next record written, Reads] (sequential access = high performance)
  13. Producer [Diagram: three Producers write to Partition #1 (records 1 to 18), Partition #2 (records 1 to 15) and Partition #3 (records 1 to 16), ordered Old to New; the next offsets are 19, 16 and 17] message: (key bytes[ ], value bytes[ ])
  14. Topic storage [Diagram: Partition #1, records 1 to 18] directory, segment = file
  15. Fast ● Sequential access ● PageCache ● Linux: sendfile() ● Compression - Source: http://queue.acm.org/detail.cfm?id=1563874
  16. Fast
  17. Consumer Group [Diagram: a Producer writes to Partition #1, Partition #2 and Partition #3; three consumers of Consumer Group A and two consumers of Consumer Group B consume them]
  18. Fault tolerant consumption [Diagram: same layout as the previous slide] Automatic rebalancing on failure
  19. Consumer group [Diagram: Partition #1, Partition #2, Partition #3, ordered Old to New, read by Consumer group 1 and Consumer group 2] Offsets per (Group, Topic, #): group 1, topic log: #1 -> 18, #2 -> 12, #3 -> 14; group 2, topic log: #1 -> 1, #2 -> 0, #3 -> 3
  20. Replicas/ISRs - Topic: foo, Partitions: 3, Replicas: 3 [Diagram: Broker #0, Broker #1 and Broker #2 each hold a replica of Partition #0, Partition #1 and Partition #2; three replicas are marked Leader; a Producer writes and a Consumer reads]
  21. Kafka 0.9 - New Consumer ● Unified consumer API ● Much simpler and thinner ● Allows for larger groups with far faster rebalancing ● Decouples Kafka clients from Zookeeper!!!
  22. Security ● Authentication: Kerberos / TLS certificate ● Authorization: unix-like permissions system ● Encryption on the wire: SSL ● Encryption at rest: encrypting individual fields / filesystem security features ● User-defined quotas
  23. Kafka Connect [Diagram: Data Source, Kafka Connect, Kafka, Kafka Connect, Data Sink]
  24. Kafka Streams [Diagram: Data Source, Kafka Connect, Kafka, Kafka Connect, Data Sink, plus Kafka Streams]
  25. Kafka Enterprise Ready: Jay Kreps, Neha Narkhede, Jun Rao [Timeline: 2011, 2012, 2014]
  26. Use cases ● User behaviour, click stream analysis ● Infrastructure monitoring and security ● Telemetry data from mobile/sensors ● IoT ● Log analysis ● ...
  27. Used by ● LinkedIn: activity stream, metrics ● Netflix: real-time monitoring ● Twitter: real-time data pipeline ● Spotify: log delivery ● Loggly: log collection and processing ● Mozilla: telemetry data ● Microsoft: Ads, Bing, Office ● Airbnb, Square, Uber, Criteo, OVH ...
  28. GL HF! ● Download Kafka: http://kafka.apache.org/downloads.html ● Git clone: https://github.com/mblanc/hands_on_kafka.git ● Open: reveal.js/index_java.html
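The commit-log, partition and consumer-group offset slides can be illustrated with a small in-memory model. This is only a sketch, not Kafka's implementation: the `Topic` and `OffsetStore` classes are hypothetical names, CRC32 stands in for whatever hash the real partitioner uses, and offsets start at 0 here whereas the slide diagrams number records from 1.

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy topic: one append-only list (commit log) per partition."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Assumption: CRC32 is a stand-in hash, chosen only because it is
        # deterministic; the same key always maps to the same partition.
        return zlib.crc32(key) % len(self.partitions)

    def append(self, key: bytes, value: bytes):
        """Append a (key, value) message and return (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1

    def read(self, partition, offset, max_records=10):
        """Sequential read starting at a given offset."""
        return self.partitions[partition][offset:offset + max_records]


class OffsetStore:
    """Tracks the next offset to read per (group, topic, partition)."""

    def __init__(self):
        self._offsets = defaultdict(int)

    def committed(self, group, topic, partition):
        return self._offsets[(group, topic, partition)]

    def commit(self, group, topic, partition, offset):
        self._offsets[(group, topic, partition)] = offset


# Two consumer groups read the same partition at their own pace.
topic = Topic("log", num_partitions=3)
for i in range(5):
    part, _ = topic.append(b"user-42", f"click-{i}".encode())

offsets = OffsetStore()
batch = topic.read(part, offsets.committed("group-1", "log", part), max_records=3)
offsets.commit("group-1", "log", part, 3)  # group-1 has consumed 3 records
# group-2 has committed nothing yet, so it would still start from offset 0.
```

Because messages are never removed on read, each group only needs to remember an offset per partition, which is what the offset table on the consumer group slide shows.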

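The "Consumer Group" and "Fault tolerant consumption" slides show partitions being shared among the members of a group and redistributed when a member fails. A minimal sketch of that idea follows, assuming a plain round-robin strategy; the `assign` function and the consumer names are hypothetical, and Kafka's own assignors differ in their details.

```python
def assign(partitions, consumers):
    """Spread partitions over the live consumers of one group, round-robin.

    Each partition goes to exactly one consumer of the group.
    """
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    if not members:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


partitions = [1, 2, 3]
group_a = {"a1", "a2", "a3"}
group_b = {"b1", "b2"}

# Group A has as many consumers as partitions: one partition each.
before = assign(partitions, group_a)

# Consumer a2 fails: rebalancing hands its partition to a survivor.
after = assign(partitions, group_a - {"a2"})

# Group B is assigned independently and sees every partition as well.
group_b_assignment = assign(partitions, group_b)
```

Each group receives all partitions, so groups behave like independent subscribers, while consumers inside one group split the work between them.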