SlideShare une entreprise Scribd logo
1  sur  22
Distributed messaging with
Apache Kafka
Saumitra Srivastav
@_saumitra_
http://www.meetup.com/Bangalore-Apache-Kafka-Group/
1
Introduction
Kafka is a:
• distributed
• replicated
• persistent
• partitioned
• high throughput
• pub-sub
messaging system.
Incubated at LinkedIn. Written in Scala.
2
Demo Application
Twitter stream analytics
3
Stream
Producer
Broker-1 Broker-2 Broker-3
Twitter
Streaming API
Kafka Cluster
Solr-1
Realtime search
Solr-2 Cassandra-1
Data Store for longer retention
Cassandra-2
Sentiment Analysis
4
Terminology
Topics: categories in which message
feed is maintained
Producer: Processes that publish
messages to a Kafka topic.
Consumers: processes that subscribe
to topics and process the feed of
published messages
Brokers: Servers which form a kafka
cluster and act as a data transport
channel between producers and
consumers.
Producer Producer
Consumer Consumer
Broker
Kafka Cluster
Broker Broker
5
Simplified View of a Kafka System
ZookeeperBroker 1 Broker 2 Broker 3
Producer 1 Producer 2
Consumer 1 Consumer 2 Consumer 3
6
Topics and Partitions
TOPIC – 1
(error log)
TOPIC – 2
(security log)
7
Partitions
• Each partition is an ordered, immutable sequence of
messages.
• Messages are continuously appended to it.
• Each message in partition is assigned a unique
sequential id number called offset.
• Any message in partition can be accessed using this
offset.
8
Partitions
• Partition servers 2 purposes:
1. Scaling
2. Parallelism
• Scaling
A topic can be divided into multiple partition, and
each partition can be on different servers.
• Parallelism
A consumer can consume from multiple partitions at
same time(while maintaining ordering guarantee).
9
Distribution & Replication
• The partitions of the log are distributed over Kafka cluster
• Each server handles data and requests for some number of
partition
• Each partition is replicated for fault tolerance.
• Each partition has one server which acts as the leader.
• The leader handles all read and write requests for the
partition.
• Followers keep replicating the leader.
10
Producers
• Producers publish data to the topics of their choice.
• Producer can choose the topic’s partition to which
message should be assigned.
• Partition can be selected in a round robin manner for
load balancing.
• Kafka doesn’t care about serialization format. All it
need is a byte array.
11
Consumers
• Other messaging systems basically follow 2 models:
• Queuing
• Publish-Subscribe
• Kafka uses a concept of consumer group which generalizes
both these models.
• Consumers label themselves with a consumer group name
• Each message published to a topic, is delivered to one
consumer instance, within each subscribing consumer group.
12
Consumers
13
Consumer Groups
ZookeeperBroker 1 Broker 2 Broker 3
Producer 1 Producer 2
Consumer 1 Consumer 2 Consumer 3
Consumer-Group A Consumer-Group B
14
Consumer groups
Zookeeper
Broker 1
Topic-1
Broker 2
Topic-1
Broker 3
Topic-1
Producer 1 Producer 2
Consumer 1
Consumer-Group A Consumer-Group B
P0 P3 P5 P2 P4
Consumer 2 Consumer 3
15
Message Persistence
• Unlike other messaging system, message are not
deleted on consumption.
• Message are retained until a configurable period of
time after which they are deleted (even if they are
NOT consumed).
• Consumers can re-consume any chunk of older
message using message offset.
• Kafka performance is effectively constant with respect
to data size, so huge data size is not an issue.
16
Demo
Running a multi-broker kafka cluster
17
Guarantees
1. Ordering guarantee
• Messages sent by a producer to a particular topic partition will be
appended in the order they are sent.
• A consumer instance sees messages in the order they are stored in the
log.
2. At least once delivery
3. Fault tolerance
For a topic with replication factor N, up to N-1 server failures will not cause
any data loss.
4. No corruption of data:
• over the network
• On the disk
18
Demo
Consumer/Producer Java API
19
Misc Design features
1. Stateless broker
• Each consumer maintains its own state(offset)
2. Load balancing
3. Asynchronous send
4. Push/pull model instead of Push/Push
5. Consumer Position
6. Offline Data Load
7. Simple API
8. Low Overhead
9. Batch send and receive
10. No message caching in JVM
11. Rely on file system buffering
• mostly sequential access patterns
12. Zero-copy transfer: file->socket
20
Use Cases
1. Messaging
2. Website Activity Tracking
3. Metrics
4. Log Aggregation
5. Stream Processing
21
Thanks
Website: http://kafka.apache.org/
Doc: http://kafka.apache.org/documentation.html
Mailing Lists: users@kafka.apache.org
Questions?
22

Contenu connexe

Tendances

Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka IntroductionAmita Mirajkar
 
Kafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersKafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersJean-Paul Azar
 
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScalePinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScaleSeunghyun Lee
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
Apache Kafka as Message Queue for your microservices and other occasions
Apache Kafka as Message Queue for your microservices and other occasionsApache Kafka as Message Queue for your microservices and other occasions
Apache Kafka as Message Queue for your microservices and other occasionsMichael Reinsch
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsKetan Gote
 
Introduction to Spring WebFlux #jsug #sf_a1
Introduction to Spring WebFlux #jsug #sf_a1Introduction to Spring WebFlux #jsug #sf_a1
Introduction to Spring WebFlux #jsug #sf_a1Toshiaki Maki
 
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...HostedbyConfluent
 
So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector? confluent
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planningconfluent
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System OverviewFlink Forward
 
Introduction to Kafka
Introduction to KafkaIntroduction to Kafka
Introduction to KafkaAkash Vacher
 
CI/CD with Rancher CLI + Jenkins
CI/CD with Rancher CLI + JenkinsCI/CD with Rancher CLI + Jenkins
CI/CD with Rancher CLI + JenkinsGo Chiba
 
Testing Kafka components with Kafka for JUnit
Testing Kafka components with Kafka for JUnitTesting Kafka components with Kafka for JUnit
Testing Kafka components with Kafka for JUnitMarkus Günther
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaJiangjie Qin
 
Reliability Guarantees for Apache Kafka
Reliability Guarantees for Apache KafkaReliability Guarantees for Apache Kafka
Reliability Guarantees for Apache Kafkaconfluent
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 

Tendances (20)

Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Kafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersKafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced Producers
 
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's ScalePinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
Pinot: Enabling Real-time Analytics Applications @ LinkedIn's Scale
 
Apache Kafka - Overview
Apache Kafka - OverviewApache Kafka - Overview
Apache Kafka - Overview
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Apache Kafka as Message Queue for your microservices and other occasions
Apache Kafka as Message Queue for your microservices and other occasionsApache Kafka as Message Queue for your microservices and other occasions
Apache Kafka as Message Queue for your microservices and other occasions
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka Streams
 
Introduction to Spring WebFlux #jsug #sf_a1
Introduction to Spring WebFlux #jsug #sf_a1Introduction to Spring WebFlux #jsug #sf_a1
Introduction to Spring WebFlux #jsug #sf_a1
 
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
6 Nines: How Stripe keeps Kafka highly-available across the globe with Donny ...
 
So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector?
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
 
Apache Flink Training: System Overview
Apache Flink Training: System OverviewApache Flink Training: System Overview
Apache Flink Training: System Overview
 
Introduction to Kafka
Introduction to KafkaIntroduction to Kafka
Introduction to Kafka
 
CI/CD with Rancher CLI + Jenkins
CI/CD with Rancher CLI + JenkinsCI/CD with Rancher CLI + Jenkins
CI/CD with Rancher CLI + Jenkins
 
Testing Kafka components with Kafka for JUnit
Testing Kafka components with Kafka for JUnitTesting Kafka components with Kafka for JUnit
Testing Kafka components with Kafka for JUnit
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
Reliability Guarantees for Apache Kafka
Reliability Guarantees for Apache KafkaReliability Guarantees for Apache Kafka
Reliability Guarantees for Apache Kafka
 
Kafka Deep Dive
Kafka Deep DiveKafka Deep Dive
Kafka Deep Dive
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 

En vedette

Scaling MQTT With Apache Kafka
Scaling MQTT With Apache KafkaScaling MQTT With Apache Kafka
Scaling MQTT With Apache Kafkakellogh
 
Low Latency Mobile Messaging using MQTT
Low Latency Mobile Messaging using MQTTLow Latency Mobile Messaging using MQTT
Low Latency Mobile Messaging using MQTTHenrik Sjöstrand
 
Introducing MQTT
Introducing MQTTIntroducing MQTT
Introducing MQTTAndy Piper
 
MQTT - MQ Telemetry Transport for Message Queueing
MQTT - MQ Telemetry Transport for Message QueueingMQTT - MQ Telemetry Transport for Message Queueing
MQTT - MQ Telemetry Transport for Message QueueingPeter R. Egli
 
Introduction MQTT in English
Introduction MQTT in EnglishIntroduction MQTT in English
Introduction MQTT in EnglishEric Xiao
 
Processing IoT Data with Apache Kafka
Processing IoT Data with Apache KafkaProcessing IoT Data with Apache Kafka
Processing IoT Data with Apache KafkaMatthew Howlett
 
MQTT - A practical protocol for the Internet of Things
MQTT - A practical protocol for the Internet of ThingsMQTT - A practical protocol for the Internet of Things
MQTT - A practical protocol for the Internet of ThingsBryan Boyd
 
Click-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingClick-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingRobert Metzger
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignMichael Noll
 
Intoduction to Android Development
Intoduction to Android DevelopmentIntoduction to Android Development
Intoduction to Android DevelopmentBen Hardill
 
MQTT - The Internet of Things Protocol
MQTT - The Internet of Things ProtocolMQTT - The Internet of Things Protocol
MQTT - The Internet of Things ProtocolBen Hardill
 
Copy of Kafka-Camus
Copy of Kafka-CamusCopy of Kafka-Camus
Copy of Kafka-CamusDeep Shah
 
Introduction to Erlang/(Elixir) at a Webilea Hands-On Session
Introduction to Erlang/(Elixir) at a Webilea Hands-On SessionIntroduction to Erlang/(Elixir) at a Webilea Hands-On Session
Introduction to Erlang/(Elixir) at a Webilea Hands-On SessionAndré Graf
 
VerneMQ @ Paris Erlang User Group June 29th 2015
VerneMQ @ Paris Erlang User Group June 29th 2015VerneMQ @ Paris Erlang User Group June 29th 2015
VerneMQ @ Paris Erlang User Group June 29th 2015André Graf
 
Friends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSFriends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSSaumitra Srivastav
 
An introduction to MQTT - Pub / Sub for the masses
An introduction to MQTT - Pub / Sub for the massesAn introduction to MQTT - Pub / Sub for the masses
An introduction to MQTT - Pub / Sub for the massesDominik Obermaier
 
MQTT - Protocol for the Internet of Things
MQTT - Protocol for the Internet of ThingsMQTT - Protocol for the Internet of Things
MQTT - Protocol for the Internet of ThingsUniversity of Pretoria
 
Mqtt – a protocol for the internet of things
Mqtt – a protocol for the internet of thingsMqtt – a protocol for the internet of things
Mqtt – a protocol for the internet of thingsRahul Gupta
 

En vedette (20)

Scaling MQTT With Apache Kafka
Scaling MQTT With Apache KafkaScaling MQTT With Apache Kafka
Scaling MQTT With Apache Kafka
 
Low Latency Mobile Messaging using MQTT
Low Latency Mobile Messaging using MQTTLow Latency Mobile Messaging using MQTT
Low Latency Mobile Messaging using MQTT
 
Introducing MQTT
Introducing MQTTIntroducing MQTT
Introducing MQTT
 
MQTT - MQ Telemetry Transport for Message Queueing
MQTT - MQ Telemetry Transport for Message QueueingMQTT - MQ Telemetry Transport for Message Queueing
MQTT - MQ Telemetry Transport for Message Queueing
 
Introduction MQTT in English
Introduction MQTT in EnglishIntroduction MQTT in English
Introduction MQTT in English
 
Processing IoT Data with Apache Kafka
Processing IoT Data with Apache KafkaProcessing IoT Data with Apache Kafka
Processing IoT Data with Apache Kafka
 
MQTT - A practical protocol for the Internet of Things
MQTT - A practical protocol for the Internet of ThingsMQTT - A practical protocol for the Internet of Things
MQTT - A practical protocol for the Internet of Things
 
Click-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingClick-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer Checkpointing
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - Verisign
 
Intoduction to Android Development
Intoduction to Android DevelopmentIntoduction to Android Development
Intoduction to Android Development
 
MQTT - The Internet of Things Protocol
MQTT - The Internet of Things ProtocolMQTT - The Internet of Things Protocol
MQTT - The Internet of Things Protocol
 
Copy of Kafka-Camus
Copy of Kafka-CamusCopy of Kafka-Camus
Copy of Kafka-Camus
 
Introduction to Erlang/(Elixir) at a Webilea Hands-On Session
Introduction to Erlang/(Elixir) at a Webilea Hands-On SessionIntroduction to Erlang/(Elixir) at a Webilea Hands-On Session
Introduction to Erlang/(Elixir) at a Webilea Hands-On Session
 
VerneMQ @ Paris Erlang User Group June 29th 2015
VerneMQ @ Paris Erlang User Group June 29th 2015VerneMQ @ Paris Erlang User Group June 29th 2015
VerneMQ @ Paris Erlang User Group June 29th 2015
 
Drools Ecosystem
Drools EcosystemDrools Ecosystem
Drools Ecosystem
 
Friends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSFriends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFS
 
An introduction to MQTT - Pub / Sub for the masses
An introduction to MQTT - Pub / Sub for the massesAn introduction to MQTT - Pub / Sub for the masses
An introduction to MQTT - Pub / Sub for the masses
 
MQTT - Protocol for the Internet of Things
MQTT - Protocol for the Internet of ThingsMQTT - Protocol for the Internet of Things
MQTT - Protocol for the Internet of Things
 
Mqtt – a protocol for the internet of things
Mqtt – a protocol for the internet of thingsMqtt – a protocol for the internet of things
Mqtt – a protocol for the internet of things
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 

Similaire à Apache Kafka distributed messaging

apachekafka-160907180205.pdf
apachekafka-160907180205.pdfapachekafka-160907180205.pdf
apachekafka-160907180205.pdfTarekHamdi8
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaAngelo Cesaro
 
Kafka - Messaging System
Kafka - Messaging SystemKafka - Messaging System
Kafka - Messaging SystemTanuj Mehta
 
Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginerYousun Jeong
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overviewiamtodor
 
Unleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxUnleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxKnoldus Inc.
 
Session 23 - Kafka and Zookeeper
Session 23 - Kafka and ZookeeperSession 23 - Kafka and Zookeeper
Session 23 - Kafka and ZookeeperAnandMHadoop
 
Introduction to Kafka and Event-Driven
Introduction to Kafka and Event-DrivenIntroduction to Kafka and Event-Driven
Introduction to Kafka and Event-Drivenarconsis
 
Introduction to Kafka and Event-Driven
Introduction to Kafka and Event-DrivenIntroduction to Kafka and Event-Driven
Introduction to Kafka and Event-DrivenDimosthenis Botsaris
 
Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Gwen (Chen) Shapira
 
Kafka 10000 feet view
Kafka 10000 feet viewKafka 10000 feet view
Kafka 10000 feet viewyounessx01
 
Apache kafka- Onkar Kadam
Apache kafka- Onkar KadamApache kafka- Onkar Kadam
Apache kafka- Onkar KadamOnkar Kadam
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scalejimriecken
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Erik Onnen
 

Similaire à Apache Kafka distributed messaging (20)

Kafka tutorial
Kafka tutorialKafka tutorial
Kafka tutorial
 
apachekafka-160907180205.pdf
apachekafka-160907180205.pdfapachekafka-160907180205.pdf
apachekafka-160907180205.pdf
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
Kafka - Messaging System
Kafka - Messaging SystemKafka - Messaging System
Kafka - Messaging System
 
Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginer
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overview
 
Unleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxUnleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptx
 
Kafka.pptx
Kafka.pptxKafka.pptx
Kafka.pptx
 
Session 23 - Kafka and Zookeeper
Session 23 - Kafka and ZookeeperSession 23 - Kafka and Zookeeper
Session 23 - Kafka and Zookeeper
 
Introduction to Kafka and Event-Driven
Introduction to Kafka and Event-DrivenIntroduction to Kafka and Event-Driven
Introduction to Kafka and Event-Driven
 
Introduction to Kafka and Event-Driven
Introduction to Kafka and Event-DrivenIntroduction to Kafka and Event-Driven
Introduction to Kafka and Event-Driven
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017
 
Kafka 10000 feet view
Kafka 10000 feet viewKafka 10000 feet view
Kafka 10000 feet view
 
Apache kafka- Onkar Kadam
Apache kafka- Onkar KadamApache kafka- Onkar Kadam
Apache kafka- Onkar Kadam
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
 

Dernier

20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
detection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxdetection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxAleenaJamil4
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024Timothy Spann
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 

Dernier (20)

20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
detection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptxdetection and classification of knee osteoarthritis.pptx
detection and classification of knee osteoarthritis.pptx
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 

Apache Kafka distributed messaging

  • 1. Distributed messaging with Apache Kafka Saumitra Srivastav @_saumitra_ http://www.meetup.com/Bangalore-Apache-Kafka-Group/ 1
  • 2. Introduction Kafka is a: • distributed • replicated • persistent • partitioned • high throughput • pub-sub messaging system. Incubated at LinkedIn. Written in Scala. 2
  • 4. Stream Producer Broker-1 Broker-2 Broker-3 Twitter Streaming API Kafka Cluster Solr-1 Realtime search Solr-2 Cassandra-1 Data Store for longer retention Cassandra-2 Sentiment Analysis 4
  • 5. Terminology Topics: categories in which message feed is maintained Producer: Processes that publish messages to a Kafka topic. Consumers: processes that subscribe to topics and process the feed of published messages Brokers: Servers which form a kafka cluster and act as a data transport channel between producers and consumers. Producer Producer Consumer Consumer Broker Kafka Cluster Broker Broker 5
  • 6. Simplified View of a Kafka System ZookeeperBroker 1 Broker 2 Broker 3 Producer 1 Producer 2 Consumer 1 Consumer 2 Consumer 3 6
  • 7. Topics and Partitions TOPIC – 1 (error log) TOPIC – 2 (security log) 7
  • 8. Partitions • Each partition is an ordered, immutable sequence of messages. • Messages are continuously appended to it. • Each message in partition is assigned a unique sequential id number called offset. • Any message in partition can be accessed using this offset. 8
  • 9. Partitions • Partition servers 2 purposes: 1. Scaling 2. Parallelism • Scaling A topic can be divided into multiple partition, and each partition can be on different servers. • Parallelism A consumer can consume from multiple partitions at same time(while maintaining ordering guarantee). 9
  • 10. Distribution & Replication • The partitions of the log are distributed over Kafka cluster • Each server handles data and requests for some number of partition • Each partition is replicated for fault tolerance. • Each partition has one server which acts as the leader. • The leader handles all read and write requests for the partition. • Followers keep replicating the leader. 10
  • 11. Producers • Producers publish data to the topics of their choice. • Producer can choose the topic’s partition to which message should be assigned. • Partition can be selected in a round robin manner for load balancing. • Kafka doesn’t care about serialization format. All it need is a byte array. 11
  • 12. Consumers • Other messaging systems basically follow 2 models: • Queuing • Publish-Subscribe • Kafka uses a concept of consumer group which generalizes both these models. • Consumers label themselves with a consumer group name • Each message published to a topic, is delivered to one consumer instance, within each subscribing consumer group. 12
  • 14. Consumer Groups ZookeeperBroker 1 Broker 2 Broker 3 Producer 1 Producer 2 Consumer 1 Consumer 2 Consumer 3 Consumer-Group A Consumer-Group B 14
  • 15. Consumer groups Zookeeper Broker 1 Topic-1 Broker 2 Topic-1 Broker 3 Topic-1 Producer 1 Producer 2 Consumer 1 Consumer-Group A Consumer-Group B P0 P3 P5 P2 P4 Consumer 2 Consumer 3 15
  • 16. Message Persistence • Unlike other messaging system, message are not deleted on consumption. • Message are retained until a configurable period of time after which they are deleted (even if they are NOT consumed). • Consumers can re-consume any chunk of older message using message offset. • Kafka performance is effectively constant with respect to data size, so huge data size is not an issue. 16
  • 17. Demo Running a multi-broker kafka cluster 17
  • 18. Guarantees 1. Ordering guarantee • Messages sent by a producer to a particular topic partition will be appended in the order they are sent. • A consumer instance sees messages in the order they are stored in the log. 2. At least once delivery 3. Fault tolerance For a topic with replication factor N, up to N-1 server failures will not cause any data loss. 4. No corruption of data: • over the network • On the disk 18
  • 20. Misc Design features 1. Stateless broker • Each consumer maintains its own state(offset) 2. Load balancing 3. Asynchronous send 4. Push/pull model instead of Push/Push 5. Consumer Position 6. Offline Data Load 7. Simple API 8. Low Overhead 9. Batch send and receive 10. No message caching in JVM 11. Rely on file system buffering • mostly sequential access patterns 12. Zero-copy transfer: file->socket 20
  • 21. Use Cases 1. Messaging 2. Website Activity Tracking 3. Metrics 4. Log Aggregation 5. Stream Processing 21