Introduction to Kafka
BY DUCAS FRANCIS
The problem
[Diagram: Web, Mobile, API, Job, Security System, Real-time Monitoring, Logging System, Other services]
It’s simple enough at first…
Then it gets a little busy…
And ends up a mess.
The solution
[Diagram: Web, Mobile, API, Job, Security System, Real-time Monitoring, Logging System, Other services, Pub/Sub]
Decouple data pipelines using a pub/sub system
Producers Brokers Consumers
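The decoupling idea can be sketched in a few lines of Python. This is a toy, in-memory stand-in and not Kafka itself: the class name `InMemoryBroker`, the topic name `page-views` and the message fields are all made up for illustration.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy stand-in for a broker: keeps one list of messages per topic."""

    def __init__(self):
        self._topics = defaultdict(list)

    def publish(self, topic, message):
        # Producers only know the topic name, not who will read it.
        self._topics[topic].append(message)

    def read(self, topic, start=0):
        # Consumers pull from the topic; the producer is never involved.
        return self._topics[topic][start:]


broker = InMemoryBroker()
broker.publish("page-views", {"source": "web", "url": "/home"})
broker.publish("page-views", {"source": "mobile", "url": "/cart"})

# Two independent consumers read the same feed without knowing the producers.
monitoring_view = broker.read("page-views")
logging_view = broker.read("page-views")
```

Adding another consumer means one more `read` call; no producer has to change, which is the point of the pub/sub layout.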
Apache Kafka
A UNIFIED, HIGH-THROUGHPUT, LOW-LATENCY PLATFORM FOR HANDLING REAL-TIME DATA FEEDS
A brief history lesson
 Originally developed at LinkedIn in 2011
 Graduated Apache Incubator in 2012
 Engineers from LinkedIn formed Confluent in 2014
 Up to version 0.9.x, with 0.10 on the horizon
Motivation
 Unified platform for all real-time data feeds
 High throughput for high volume streams
 Support periodic data loads from offline systems
 Low latency for traditional messaging
 Support partitioned, distributed, real-time processing
 Guarantee fault-tolerance
Common use cases
 Messaging
 Website activity tracking
 Metrics
 Log aggregation
 Stream processing
 Event sourcing
 Commit log
Benefits of Kafka
 High throughput
 Low latency
 Load balancing
 Fault tolerant
 Guaranteed delivery
 Secure
Performance comparison
Batch performance comparison
Some terminology
 Topic – feed of messages
 Producer – publishes messages to a topic
 Consumer – subscribes to topics and processes the feed of messages
 Broker – server instance that acts in a cluster
@apachekafkapowers
@microsot…
Libraries
 Python – kafka-python / pykafka
 Go – sarama / go_kafka_client / …
 C/C++ – librdkafka / libkafka / …
 .NET – kafka-net (x2) / rdkafka-dotnet / CSharpClient-for-Kafka
 Node.js – kafka-node / sutoiku/node-kafka / ...
 HTTP – kafka-pixy / kafka-rest
 etc.
Architecture
[Diagram: two Producers, a Cluster of three Brokers, two Consumers, and Zookeeper (label: x3)]
Show me the
Kafka!!!
VAGRANT TO THE RESCUE
Anatomy of a topic
 Topics are broken into partitions
 Messages are assigned a sequential ID
called an offset
 Data is retained for a configurable
period of time
 Number of partitions can be increased
after creation, but not decreased
 Partitions are assigned to brokers
Each partition is an ordered, immutable sequence of messages that is continually appended to: a commit log.
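A minimal sketch of this anatomy, assuming a plain Python list as the storage for each partition. The `Topic` and `Partition` classes are illustrative names, not part of any Kafka client, and time-based retention is left out.

```python
class Partition:
    """Append-only sequence of messages; the list index plays the role of the offset."""

    def __init__(self):
        self._log = []

    def append(self, message):
        offset = len(self._log)  # next sequential ID
        self._log.append(message)
        return offset

    def read(self, offset, max_messages=10):
        return self._log[offset:offset + max_messages]


class Topic:
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def add_partitions(self, count):
        # Partitions can be added later; there is deliberately no way to remove one.
        self.partitions.extend(Partition() for _ in range(count))


topic = Topic(num_partitions=2)
first = topic.partitions[0].append("a")
second = topic.partitions[0].append("b")
other = topic.partitions[1].append("c")
```

Offsets are per partition, so `first` and `other` are both 0 even though they belong to the same topic.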
Broker
 Kafka service running as part of a cluster
 Receives messages from producers and serves them to consumers
 Coordinated using Zookeeper
 Zookeeper needs an odd number of nodes for quorum
 Store messages on the file system
 Replicate messages to/from other brokers
 Answer metadata requests about brokers and topics/partitions
 As of 0.9.0 – coordinate consumers
Replication
 Partitions on a topic should be replicated
 Each partition has 1 leader and 0 or more followers
 An In-Sync Replica (ISR) is one that’s communicating with Zookeeper and not too
far behind the leader
 Replication factor can be increased after creation, not decreased
./kafka-topics
--create
--replication-factor
--partitions
--describe
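For scripted demos, the same invocations can be assembled from Python. The sketch below only builds the argument lists and does not run anything. The flags and example values are the ones used in the speaker notes for this 0.9-era tooling; the helper function names are made up.

```python
import shlex


def create_topic_command(topic, partitions, replication_factor, zookeeper):
    """Return the argument list for creating a topic with the kafka-topics script."""
    return [
        "./kafka-topics",
        "--create",
        "--topic", topic,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
        "--zookeeper", zookeeper,
    ]


def describe_topic_command(topic, zookeeper):
    """Return the argument list for describing an existing topic."""
    return ["./kafka-topics", "--describe", "--topic", topic, "--zookeeper", zookeeper]


command = create_topic_command("perf-test", 10, 3, "192.168.32.11:2181")
command_line = " ".join(shlex.quote(part) for part in command)
```

The list form can be handed to `subprocess.run` as-is, assuming the script is in the current directory; `command_line` is just the copy-and-paste version.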
Producers
 Publishes messages to a topic
 Distributes messages across partitions
 Round-robin
 Key hashing
 Send synchronously or asynchronously to the broker that is the leader for the
partition
 ACKS = 0 (none), 1 (leader), -1 (all ISRs)
 Synchronous is obviously slower, but more durable
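The two distribution strategies can be sketched as follows. This is an illustration only: `Partitioner` is a made-up class, and `zlib.crc32` is a stand-in hash chosen because it ships with Python, not the hash that Kafka's own clients use.

```python
import itertools
import zlib


class Partitioner:
    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()

    def partition_for(self, key=None):
        if key is None:
            # Round-robin: spread keyless messages evenly.
            return next(self._counter) % self.num_partitions
        # Key hashing: the same key always maps to the same partition.
        # crc32 is a stand-in; it is not the hash Kafka's own clients use.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = Partitioner(num_partitions=4)
keyless = [partitioner.partition_for() for _ in range(8)]
keyed = [partitioner.partition_for("user-42") for _ in range(3)]
```

Key hashing is what makes per-key ordering possible: every message for `user-42` lands in one partition, and ordering is guaranteed within a partition.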
Testing... Testing…
1 2 3
LET’S SEE HOW FAST WE CAN
PUSH
Consumers
 Read messages from a topic
 Multiple consumers can read from the same topic
 Manage their offsets
 Messages stay on Kafka after they are consumed
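A small sketch of what "manage their offsets" means in practice, assuming a Python list as a stand-in for one partition. `SimpleConsumer` is a hypothetical class, not a real client API.

```python
class SimpleConsumer:
    """Reads from a shared log and remembers its own position."""

    def __init__(self, log):
        self._log = log
        self.offset = 0  # each consumer tracks this itself

    def poll(self, max_messages=2):
        batch = self._log[self.offset:self.offset + max_messages]
        self.offset += len(batch)
        return batch


log = ["m0", "m1", "m2", "m3"]  # stand-in for one partition of a topic

fast = SimpleConsumer(log)
slow = SimpleConsumer(log)

fast_batches = [fast.poll(), fast.poll()]
slow_batches = [slow.poll()]
```

Reading never removes anything from `log`, so the slow consumer can catch up later and a new consumer can start from offset 0.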
Testing... Testing…
1 2 3
LET’S SEE HOW FAST WE CAN
RECEIVE
It’s fast! But why…?
 Efficient protocol based on message sets
 Batching messages to reduce network latency and small I/O operations
 Append/chunk messages to increase consumer throughput
 Optimised OS operations
 pagecache
 sendfile()
 Broker services consumers from cache where possible
 End-to-end batch compression
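The benefit of compressing a batch rather than individual messages can be checked with the standard library. This sketch uses `zlib` purely because it ships with Python; the message shape is invented and no sizes are claimed beyond the comparison itself.

```python
import json
import zlib

messages = [
    json.dumps({"user": i, "event": "page_view", "url": "/home"}).encode("utf-8")
    for i in range(100)
]

# Compressing every message on its own pays the header cost 100 times
# and cannot exploit repetition between messages.
individual_size = sum(len(zlib.compress(m)) for m in messages)

# Compressing the whole batch once lets the repeated field names collapse.
batch = b"\n".join(messages)
batch_size = len(zlib.compress(batch))

raw_size = sum(len(m) for m in messages)
```

Comparing `batch_size` with `individual_size` and `raw_size` shows why batching and compression work well together: similar messages share most of their bytes.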
Load balanced consumers
 Distribute load across instances in a group by allocating partitions
 Handle failure by rebalancing partitions to other instances
 Commit their offsets to Kafka
[Diagram: a cluster of Broker 1 and Broker 2 holding partitions P0-P3; Consumer Group 1 with C0, C1; Consumer Group 2 with C2, C3, C4, C6]
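Partition allocation within a group can be sketched as a simple round-robin deal. This is one possible strategy shown for illustration, not necessarily the assignor a real client uses, and the five-member group at the end is hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Deal partitions out to the group's members in round-robin order."""
    assignment = {consumer: [] for consumer in consumers}
    ordered = sorted(consumers)
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


partitions = ["P0", "P1", "P2", "P3"]

group_1 = assign_partitions(partitions, ["C0", "C1"])

# C1 fails: rebalance the same partitions over the surviving members.
after_failure = assign_partitions(partitions, ["C0"])

# A group with more members than partitions leaves some members idle.
crowded = assign_partitions(partitions, ["C2", "C3", "C4", "C5", "C6"])
```

Because a partition goes to exactly one member of a group, the partition count caps the useful parallelism of that group.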
Consumer groups and offsets
[Diagram: Broker 1 and Broker 2 hold partitions P0-P3 for Consumer Group 1 (C0, C1); partition P3 is drawn as offsets 0-10 with markers for C1 read, C1 commit, C0 read and C0 commit]
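The difference between the read position and the committed offset can be sketched like this. `GroupOffsets` is a toy offset store, the group name `group-1` is made up, and defaulting to offset 0 when nothing has been committed is an assumption of this sketch.

```python
class GroupOffsets:
    """Toy offset store: one committed offset per (group, partition)."""

    def __init__(self):
        self._committed = {}

    def commit(self, group, partition, offset):
        self._committed[(group, partition)] = offset

    def committed(self, group, partition):
        return self._committed.get((group, partition), 0)


partition_p3 = list(range(11))  # messages at offsets 0..10, as in the diagram
offsets = GroupOffsets()

# C1 reads offsets 0-5, but only commits after processing the first four.
position = offsets.committed("group-1", "P3")
read_by_c1 = partition_p3[position:position + 6]
offsets.commit("group-1", "P3", 4)  # next offset to read after a restart

# C1 crashes; whoever takes over P3 resumes from the committed offset.
resume_from = offsets.committed("group-1", "P3")
redelivered = [m for m in read_by_c1 if m >= resume_from]
```

Messages that were read but not committed are delivered again after a failure, so processing should tolerate duplicates.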
Guarantees
 Messages sent by a producer to a particular topic’s partition will be appended in
the order they are sent
 A consumer instance sees messages in the order they are stored in the log
 For a topic with replication factor N, we will tolerate up to N-1 server failures
without losing any messages committed to the log
Ordered delivery
 Messages are guaranteed to be delivered in order by partition, NOT topic
P0: M1 M3 M5
P1: M2 M4 M6
 M1 before M3 before M5 – YES
 M1 before M2 – NO
 M2 before M4 before M6 – YES
 M2 before M3 – NO
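The per-partition guarantee can be checked with a small simulation. The random interleaving below is an assumption used to mimic a consumer that fetches from partitions in an unpredictable order; it is not how a real client schedules fetches.

```python
import random

# Messages M1..M6 alternate between two partitions, as on the slide.
partitions = {"P0": ["M1", "M3", "M5"], "P1": ["M2", "M4", "M6"]}


def consume_interleaved(partitions, seed):
    """Drain all partitions, picking which partition to read next at random."""
    rng = random.Random(seed)
    cursors = {name: 0 for name in partitions}
    delivered = []
    while any(cursors[name] < len(log) for name, log in partitions.items()):
        name = rng.choice([n for n, log in partitions.items() if cursors[n] < len(log)])
        delivered.append(partitions[name][cursors[name]])
        cursors[name] += 1
    return delivered


def order_within(delivered, partition_log):
    """Return the delivered messages that belong to one partition, in delivery order."""
    return [m for m in delivered if m in partition_log]


runs = [consume_interleaved(partitions, seed) for seed in range(20)]
```

In every run the messages of P0 arrive as M1, M3, M5 and those of P1 as M2, M4, M6, while the relative order of, say, M1 and M2 depends on the interleaving.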
Enough ALT… now
.NET
USING RDKAFKA-DOTNET
FIN. THANK YOU
Resources
 http://kafka.apache.org/documentation.html
 http://www.confluent.io/
 https://kafka.apache.org/090/configuration.html
 https://github.com/edenhill/librdkafka
 https://github.com/ah-/rdkafka-dotnet
Log compaction
 Keep the most recent payload for a key
 Use cases
 Database change subscription
 Event sourcing
 Journaling for HA
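The core idea of compaction, keeping only the most recent payload per key, fits in one function. This sketch works on a list of `(key, value)` tuples with invented sample data and ignores details of the real feature such as retained offsets and delete markers.

```python
def compact(log):
    """Keep only the latest record for each key, preserving log order."""
    latest_index = {key: index for index, (key, _) in enumerate(log)}
    return [
        (key, value)
        for index, (key, value) in enumerate(log)
        if latest_index[key] == index
    ]


log = [
    ("user-1", "alice@old.example"),
    ("user-2", "bob@example"),
    ("user-1", "alice@new.example"),
    ("user-3", "carol@example"),
    ("user-2", "bob@new.example"),
]

compacted = compact(log)
```

A consumer replaying `compacted` ends up with the same final state per key as one replaying the full `log`, which is why compaction suits database change subscription.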

Editor's Notes

  1. High throughput – web activity tracking receives tens of events per page hit or interaction.
     Periodic data loads – hundreds of thousands of messages received every 5 min.
     Low latency – pub/sub in milliseconds.
     Distributed – anyone sending or receiving messages should be able to achieve HA.
  2. cd ~/Projects/kafka-vagrant
     vagrant status
     vagrant up
     vagrant ssh kafka-1
     cat /etc/kafka/server.properties
     https://kafka.apache.org/090/configuration.html
  3. Topic – feed of messages.
     Partition – topics are broken into partitions.
     Messages – written to the end of a partition within a topic and assigned a sequential identifier (a 64-bit integer) called an offset.
     Data is retained within a partition for a configurable amount of time. The default retention time is set in the broker configuration, but can be overridden per topic. Messages are stored on the file system in segmented files.
     The number of partitions can be increased after creation, but not decreased. Because messages are stored on the file system per partition, removing a partition would effectively delete data.
     Partitions, not topics, are assigned to brokers. Kafka attempts to balance the number of partitions across the available brokers, and the assignment can also be configured manually. This is how Kafka load balances its activity: in theory, brokers holding an equal number of partitions should receive an equal number of send and fetch requests.
  4. … The responsibilities of coordination are split between ZooKeeper (ZK) and Kafka. Older versions of Kafka relied more heavily on ZK, but coordination is being moved into the brokers, with ZK used mainly for service discovery and configuration. Before 0.9.0, consumers were coordinated by ZK and needed a lot of client-side logic to work out which partitions were assigned to them. With the new consumer, a broker is assigned as the consumer coordinator and tells the consumers which partitions they have been assigned.
  5. cd ~/Downloads/confluent-2.0.0/bin
     ls
     ./kafka-topics
     ./kafka-topics --create --topic perf-test --partitions 10 --replication-factor 3 --zookeeper 192.168.32.11:2181
     ./kafka-topics --list --zookeeper 192.168.32.12
     ./kafka-topics --describe --topic perf-test --zookeeper 192.168.32.13
  6. ./kafka-producer-perf-test
     # Publish 10k x 4kb messages
     ./kafka-producer-perf-test --topic perf-test --num-records 10000 --record-size 4096 --throughput 1000 --producer-props bootstrap.servers=192.168.32.21:9092,192.168.32.22:9092,192.168.32.23:9092
     # Up the throughput for 100k
     ./kafka-producer-perf-test --topic perf-test --num-records 100000 --record-size 4096 --throughput 100000 --producer-props bootstrap.servers=192.168.32.21:9092,192.168.32.22:9092,192.168.32.23:9092
     # No ACKs
     ./kafka-producer-perf-test --topic perf-test --num-records 1000000 --record-size 4096 --throughput 100000 --producer-props bootstrap.servers=192.168.32.21:9092,192.168.32.22:9092,192.168.32.23:9092 acks=0
     # ACKs from all ISRs
     ./kafka-producer-perf-test --topic perf-test --num-records 1000000 --record-size 4096 --throughput 100000 --producer-props bootstrap.servers=192.168.32.21:9092,192.168.32.22:9092,192.168.32.23:9092 acks=-1
     # ACK from leader, use snappy
     ./kafka-producer-perf-test --topic perf-test --num-records 1000000 --record-size 4096 --throughput 100000 --producer-props bootstrap.servers=192.168.32.21:9092,192.168.32.22:9092,192.168.32.23:9092 acks=1 compression.type=snappy
     # linger for 100ms
     ./kafka-producer-perf-test --topic perf-test --num-records 1000000 --record-size 4096 --throughput 100000 --producer-props bootstrap.servers=192.168.32.21:9092,192.168.32.22:9092,192.168.32.23:9092 acks=1 compression.type=snappy linger.ms=100
  7. # Consume 1M messages on 5 threads
     ./kafka-consumer-perf-test --zookeeper 192.168.32.11 --topic perf-test --messages 1000000 --group perf-test --threads 5
  8. Modern OSs maintain a page cache and aggressively use main memory for disk caching. If you do NOT rely on it and instead keep your own in-memory representation of the data, you effectively double the amount of memory your application consumes, because the same data also sits in the page cache. By relying on it, you get all available RAM as cache without GC penalties, and the cache stays in memory even if the application is restarted. This is obviously advantageous when reading messages, but also when writing. Rather than keeping as much as possible in memory and flushing it all out to the file system in a panic when we run out of space, we invert that: all data is immediately written to a persistent log on the file system without necessarily flushing to disk. In effect this just means that it is transferred into the kernel's page cache. Modern Unix operating systems also offer a highly optimized code path for transferring data out of the page cache to a socket – the sendfile system call. Without it, sending a file over the network takes four copies: (1) the OS reads data from the file into the page cache in kernel space; (2) the application reads it from kernel space into a user-space buffer; (3) the application writes it back to kernel space into a socket buffer; (4) the OS copies it from the socket buffer to the NIC buffer to send over the network. sendfile avoids this by instructing the OS to send data directly from the page cache to the NIC. This means that consumers that are caught up will be served completely from memory.
  9. Kafka scales topic consumption by distributing partitions among a consumer group, which is a set of consumers sharing a common group identifier. For each group a broker is selected as the group coordinator. The coordinator is responsible for managing the state of the group. Its main job is to mediate partition assignment when new members arrive, old members depart, and when topic metadata changes. The act of reassigning partitions is known as rebalancing the group.
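
As a rough illustration of note 3, the sketch below models a topic as a set of append-only partitions in which a message's offset is simply its position in the log. It is a toy in-memory model, not Kafka's implementation: the `Topic`, `Partition` and `spread_partitions` names are hypothetical, the CRC32 partitioner is only a stand-in for a real client's key hashing, and the round-robin placement ignores replicas.

```python
from dataclasses import dataclass, field
from zlib import crc32


@dataclass
class Partition:
    """Append-only log; a message's offset is its position in the log."""
    messages: list = field(default_factory=list)

    def append(self, value: bytes) -> int:
        self.messages.append(value)
        return len(self.messages) - 1  # offset assigned to the new message

    def read(self, offset: int, max_messages: int = 10) -> list:
        # Consumers fetch sequentially, starting from an offset they track.
        return self.messages[offset:offset + max_messages]


@dataclass
class Topic:
    name: str
    partitions: list

    @classmethod
    def create(cls, name: str, num_partitions: int) -> "Topic":
        return cls(name, [Partition() for _ in range(num_partitions)])

    def send(self, key: bytes, value: bytes) -> tuple:
        # Assumption: hash the key modulo the partition count, so that
        # messages with the same key always land in the same partition.
        index = crc32(key) % len(self.partitions)
        return index, self.partitions[index].append(value)


def spread_partitions(num_partitions: int, brokers: list) -> dict:
    """Round-robin partitions over brokers (simplified; no replicas)."""
    placement = {broker: [] for broker in brokers}
    for partition in range(num_partitions):
        placement[brokers[partition % len(brokers)]].append(partition)
    return placement


topic = Topic.create("perf-test", num_partitions=3)
first = topic.send(b"user-1", b"page-view")
second = topic.send(b"user-1", b"click")
placement = spread_partitions(10, ["kafka-1", "kafka-2", "kafka-3"])
```

Both sends use the same key, so they go to the same partition and receive consecutive offsets. In this model, shrinking the partition list would discard whole logs, which mirrors why the partition count can grow but not shrink.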
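
The sendfile path described in note 8 can be tried from Python using only the standard library. This is a minimal sketch, not how the broker itself is written: `serve_log_segment` and `demo` are hypothetical names, a temporary file stands in for a log segment, and a local socket pair stands in for the broker-to-consumer connection. `socket.sendfile()` uses `os.sendfile()` where the platform supports it and falls back to ordinary `send()` calls otherwise.

```python
import socket
import tempfile


def serve_log_segment(path: str, conn: socket.socket, offset: int = 0) -> int:
    """Send the file's bytes from `offset` onwards; returns bytes sent."""
    with open(path, "rb") as segment:
        # No user-space buffer is filled here when os.sendfile() is in use:
        # the kernel moves the data from the page cache to the socket.
        return conn.sendfile(segment, offset=offset)


def demo() -> bytes:
    payload = b"message-0\nmessage-1\nmessage-2\n"
    with tempfile.NamedTemporaryFile(suffix=".log") as segment:
        segment.write(payload)
        segment.flush()
        broker_side, consumer_side = socket.socketpair()
        with broker_side, consumer_side:
            # Skip the first 10 bytes ("message-0\n"), as a consumer that
            # has already read the first message would.
            sent = serve_log_segment(segment.name, broker_side, offset=10)
            broker_side.shutdown(socket.SHUT_WR)  # signal end of stream
            received = b""
            while chunk := consumer_side.recv(4096):
                received += chunk
    assert sent == len(received)
    return received
```

The payload is kept tiny so that it fits in the socket buffer without a separate reader thread. Reopening a `NamedTemporaryFile` by name is assumed to work here, which holds on Unix-like systems.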
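
To make the rebalancing in note 9 concrete, here is a small simulation of a coordinator that reassigns partitions whenever group membership changes. It assumes a simple round-robin strategy and a hypothetical `GroupCoordinator` class. Real Kafka clients negotiate assignment through a group protocol with pluggable strategies, so treat this only as a picture of the idea.

```python
class GroupCoordinator:
    """Toy coordinator: tracks group members and rebalances on every change."""

    def __init__(self, partitions):
        self.partitions = sorted(partitions)
        self.members = []
        self.generation = 0   # bumped on every rebalance
        self.assignment = {}  # member id -> list of partitions

    def join(self, member_id: str) -> None:
        if member_id not in self.members:
            self.members.append(member_id)
            self._rebalance()

    def leave(self, member_id: str) -> None:
        if member_id in self.members:
            self.members.remove(member_id)
            self._rebalance()

    def _rebalance(self) -> None:
        # Assumption: round-robin over members sorted by id, so each
        # partition is owned by exactly one consumer in the group.
        self.generation += 1
        members = sorted(self.members)
        self.assignment = {member: [] for member in members}
        for i, partition in enumerate(self.partitions):
            if members:
                self.assignment[members[i % len(members)]].append(partition)


group = GroupCoordinator(partitions=range(6))
group.join("consumer-a")
solo = dict(group.assignment)
group.join("consumer-b")
shared = dict(group.assignment)
group.leave("consumer-a")
after_leave = dict(group.assignment)
```

With one member, `consumer-a` owns all six partitions. Once `consumer-b` joins, the partitions are split between the two, and when `consumer-a` leaves they all move to `consumer-b`.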