Messaging Queue - Kafka
What is a Messaging Queue?
Which software is the best fit for our service?
- Amazon SQS, Amazon SNS, Apache Kafka, RabbitMQ, IBM MQ
Can we create our own messaging queue?
A queue contains a sequence of messages, sent between applications, awaiting their turn to be processed.
A message is the data sent from a producer to a consumer.
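As a rough illustration of the definition above, a queue can be sketched in memory with a double-ended queue. The names `produce` and `consume` are hypothetical; a real messaging queue runs as a separate service between the applications.

```python
from collections import deque

# Hypothetical in-memory stand-in for a messaging queue.
queue = deque()

def produce(message):
    """Producer side: add a message to the tail of the queue."""
    queue.append(message)

def consume():
    """Consumer side: take the oldest waiting message, or None if the queue is empty."""
    return queue.popleft() if queue else None

produce("order-1")
produce("order-2")
first = consume()   # oldest message comes out first
second = consume()
third = consume()   # nothing left to process
```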
Why Messaging Queue?
Why can't we have REST APIs everywhere?
[Diagram: synchronous call and its failure case]
[Diagram: asynchronous call through a messaging queue and its failure case]
Kafka
● Distributed streaming platform.
● Real-time streaming of data.
● Can handle billions of messages in a day.
● High throughput, reliability, replication capabilities.
● Amazon MSK - Amazon Managed Streaming for Apache Kafka.
● Used by LinkedIn, Twitter, Netflix, etc.
Kafka - Terminologies
● Kafka Cluster - A group of one or more servers (Kafka brokers) that share the load.
● Kafka Broker - A broker is a Kafka server. Brokers share information with each other.
● Bootstrap Server - The server used for the initial connection to the Kafka cluster, given as host:port.
● Producer - Produces messages and sends them to a topic (partition).
● Consumer - Polls messages from a topic (partition).
● Consumer Group - Each message is read only once within a consumer group, while separate groups each read it independently. Comparable to SNS handlers.
Kafka - Terminologies (continued)
● Topic - Stores and publishes a particular stream of data. A topic can have one or more partitions.
● Partition - Supports parallelism for fast processing. Comparable to an SQS message group.
● Segment - Data is stored in segments. A partition is divided into multiple segments.
● Offset - Uniquely identifies a message within a partition. Offsets start from 0 in each partition.
● ZooKeeper - Manages leader election for brokers. Each partition has its own leader.
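To make the terms concrete, here is a minimal in-memory sketch of a topic with partitions, per-partition offsets, and per-group read positions. The `Topic` class and its methods are hypothetical and do not mirror any Kafka client API.

```python
class Topic:
    """Toy model: a topic is a list of partitions, each an append-only log."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]
        # (consumer group, partition) -> next offset that group will read
        self.positions = {}

    def append(self, partition, message):
        """Append a message and return its offset within that partition."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1

    def poll(self, group, partition):
        """Return (offset, message) for the group's next unread message, or None."""
        position = self.positions.get((group, partition), 0)
        log = self.partitions[partition]
        if position >= len(log):
            return None
        self.positions[(group, partition)] = position + 1
        return position, log[position]


topic = Topic("orders", num_partitions=2)
offset_a = topic.append(0, "a")   # first message in partition 0
offset_b = topic.append(0, "b")   # second message in partition 0
offset_c = topic.append(1, "c")   # offsets restart at 0 in partition 1

billing_first = topic.poll("billing", 0)
billing_second = topic.poll("billing", 0)
audit_first = topic.poll("audit", 0)  # a different group reads from the start
```

Each consumer group keeps its own position, which is why the same message can be read once by every group.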
Producer
● Sends data with topic only
○ The producer's partitioner decides the partition.
○ Round-robin is the default algorithm (Kafka clients from version 2.4 onward default to a sticky partitioner for messages without a key). We can implement our own.
● Sends data with topic and partition ID
○ Directly selects the partition and sends the data.
● Sends data with topic and partition key
○ A hash of the partition key decides the partition ID.
○ This is similar to the SQS message group ID.
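The three cases above can be sketched as a single partition-selection function. This is an illustrative model only: the `Partitioner` class is hypothetical, and `zlib.crc32` stands in for the hash function that Kafka's own partitioner uses (murmur2).

```python
import itertools
import zlib


class Partitioner:
    """Hypothetical partition chooser covering the three producer cases."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def choose(self, partition=None, key=None):
        # Case 1: explicit partition ID wins.
        if partition is not None:
            if not 0 <= partition < self.num_partitions:
                raise ValueError(f"no such partition: {partition}")
            return partition
        # Case 2: hash the key so equal keys always map to the same partition.
        # crc32 is an assumption used here for illustration only.
        if key is not None:
            return zlib.crc32(key.encode("utf-8")) % self.num_partitions
        # Case 3: topic only, spread messages round-robin.
        return next(self._round_robin)


partitioner = Partitioner(num_partitions=3)
explicit = partitioner.choose(partition=2)
keyed_first = partitioner.choose(key="user-42")
keyed_again = partitioner.choose(key="user-42")
spread = [partitioner.choose() for _ in range(4)]
```

Because equal keys always land on the same partition, messages for one key keep their relative order.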
Kafka Broker Data Storage
Segments
Segments are named by their base offset. The base offset of a segment is greater than every offset in the previous segments and less than or equal to every offset in that segment.
● segment.index - Maps offsets to the position of their messages in the segment log.
● segment.log - Stores the actual messages.
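Given the base-offset rule above, finding the segment that holds an offset is a binary search over the sorted base offsets. The function names below are hypothetical. The 20-digit zero-padded file names follow the naming Kafka uses on disk.

```python
import bisect


def find_segment(base_offsets, offset):
    """Return the base offset of the segment that would contain `offset`.

    `base_offsets` must be sorted ascending. The owning segment is the one
    with the largest base offset that is <= the requested offset.
    """
    index = bisect.bisect_right(base_offsets, offset) - 1
    if index < 0:
        raise KeyError(f"offset {offset} is older than the first segment")
    return base_offsets[index]


def segment_file_names(base_offset):
    """Build the log and index file names for a segment."""
    name = f"{base_offset:020d}"
    return name + ".log", name + ".index"


bases = [0, 1000, 2500]
middle = find_segment(bases, 1700)      # falls in the segment starting at 1000
boundary = find_segment(bases, 2500)    # a base offset belongs to its own segment
files = segment_file_names(middle)
```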
Consumer - Partition Allocation
● Each partition is consumed by only a single consumer from the group.
● One consumer: all partitions are assigned to that consumer.
● Fewer consumers than partitions: partitions are divided equally and assigned to the consumers.
● As many consumers as partitions: each partition maps to one consumer.
● More consumers than partitions: the extra consumers become idle.
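The allocation cases above can be reproduced with a simple round-robin assignment. This is a sketch, not Kafka's actual assignor logic (the real range, round-robin, and sticky strategies differ in detail). The function name is hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Assign each partition to exactly one consumer in the group.

    Consumers that receive no partition keep an empty list, i.e. stay idle.
    """
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


single = assign_partitions([0, 1, 2], ["c1"])
shared = assign_partitions([0, 1, 2, 3], ["c1", "c2"])
one_to_one = assign_partitions([0, 1], ["c1", "c2"])
with_idle = assign_partitions([0, 1], ["c1", "c2", "c3"])
```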
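Consumer
Reads messages from a partition
● Offset: from-beginning
○ On restart, reads from the first available offset.
○ Not necessarily from offset 0, because older messages may already have been deleted (Kafka's default retention is 7 days).
● Offset: earliest
○ On restart, reads from the last committed offset. If there is no committed offset, reads from the earliest available one.
○ Auto commit: offsets are committed periodically during poll calls (every 5 seconds by default).
○ Manual commit: the consumer sends the ack to the broker itself, together with the offset.
● Offset: latest
○ When there is no committed offset, reads only messages that arrive after the consumer starts.
○ Used for real-time cases.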
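The way a consumer picks its starting position can be sketched as below. The function and parameter names are hypothetical. The logic follows the documented meaning of the `auto.offset.reset` consumer setting, which only applies when there is no usable committed offset.

```python
def starting_offset(committed, log_start, log_end, reset_policy):
    """Pick the offset a restarting consumer reads from.

    committed    -- last committed offset for the group, or None
    log_start    -- first offset still retained in the partition
    log_end      -- offset the next produced message will get
    reset_policy -- "earliest" or "latest"
    """
    # A committed offset that still exists in the log always wins.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    # Otherwise fall back to the reset policy.
    if reset_policy == "earliest":
        return log_start
    if reset_policy == "latest":
        return log_end
    raise ValueError(f"unknown reset policy: {reset_policy}")


new_group_earliest = starting_offset(None, 120, 500, "earliest")
new_group_latest = starting_offset(None, 120, 500, "latest")
resumed = starting_offset(300, 120, 500, "latest")
expired = starting_offset(50, 120, 500, "earliest")  # committed offset was deleted
```

In this sketch the log starts at offset 120 rather than 0 because retention has already removed the older messages.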
Types of Message Delivery
● At most once
○ If the producer does not retry when an ack times out, the message might never be written to the Kafka topic.
○ The producer waits for only one ack (acks=1).
○ Cited as roughly 20 times faster.
● At least once
○ If the broker fails after the message was written to the Kafka topic but before it sent the ack, the producer's retry writes the message a second time. (Same guarantee as a standard SQS queue.)
○ The producer waits for all acks (acks=all).
○ Cited as roughly 3 times faster.
● Exactly once
○ A unique identifier is required, so that whenever the producer sends a duplicate, the broker does not store the message again (enable.idempotence=true). (Comparable to an SQS FIFO queue.)
○ Difficult to handle at the consumer end; manual offset commits need to be handled carefully.
○ An alternative is a transaction that spans from the producer's send until the ack is received from the consumer.
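The duplicate-on-retry case and the idempotent fix can be simulated in a few lines. This is a simplified model with hypothetical names: real Kafka brokers track a producer ID plus per-partition sequence numbers, which this sketch reduces to one sequence number per producer.

```python
class Broker:
    """Toy broker that optionally drops writes it has already stored."""

    def __init__(self, idempotent):
        self.idempotent = idempotent
        self.log = []
        self.last_sequence = {}  # producer id -> highest sequence stored

    def write(self, producer_id, sequence, message):
        already_stored = self.last_sequence.get(producer_id, -1) >= sequence
        if self.idempotent and already_stored:
            return  # duplicate of a message that was written earlier
        self.log.append(message)
        self.last_sequence[producer_id] = sequence


def send_with_lost_ack(broker, retries):
    """Write a message, assume its ack is lost, then retry the same send."""
    broker.write("producer-1", 0, "payment")
    for _ in range(retries):
        broker.write("producer-1", 0, "payment")
    return broker.log


at_least_once = send_with_lost_ack(Broker(idempotent=False), retries=1)
exactly_once = send_with_lost_ack(Broker(idempotent=True), retries=1)
```

Without idempotence the retry stores the message twice. With it the broker recognises the repeated sequence number and keeps a single copy.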
ZooKeeper
● Elects a controller. The controller maintains the leader/follower relationship for all the partitions.
● When a node shuts down, the controller tells other replicas to become partition leaders.
● Manages service discovery for the Kafka brokers that form the cluster.
● Sends topology changes to Kafka, so each node in the cluster knows when a broker joins or dies and when a topic is added or removed.
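As a rough sketch of the failover step described above, the function below promotes another in-sync replica when the leader's broker goes away. The data layout and function name are hypothetical, and the real controller logic involves considerably more state.

```python
def handle_broker_failure(partitions, dead_broker):
    """Drop a dead broker from each in-sync set and re-elect leaders if needed.

    partitions maps a partition name to
    {"leader": broker_id, "in_sync": [broker_id, ...]}.
    """
    for state in partitions.values():
        state["in_sync"] = [b for b in state["in_sync"] if b != dead_broker]
        if state["leader"] == dead_broker:
            # Promote the first remaining in-sync replica, if there is one.
            state["leader"] = state["in_sync"][0] if state["in_sync"] else None
    return partitions


cluster = {
    "orders-0": {"leader": 1, "in_sync": [1, 2, 3]},
    "orders-1": {"leader": 2, "in_sync": [2, 3]},
}
after_failure = handle_broker_failure(cluster, dead_broker=1)
```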
SQS vs Kafka

| Parameter | AWS SQS | Apache Kafka |
|---|---|---|
| Order of messages | Standard queue: can be out of order. FIFO queue: in order within a message group. | In order within a partition. |
| Message delivery | Standard queue: at least once. FIFO queue: exactly once. | Provides all three types: at most once, at least once, and exactly once. |
| Retention | Default: 4 days; up to 14 days. | Default: 7 days; configurable, with no fixed upper limit. |
| Metrics | CloudWatch metrics | OpenTSDB, to analyse the number of messages yet to be consumed on each partition |
| Security | IAM, AWS KMS (Key Management Service) | Kerberos |
| Consuming the same message more than once | Connect SQS with SNS | Use consumer groups |
| Cost | Pay as you use; depends on requests/sec and data transfer/sec | Open source; server cost and management cost |
| Long polling | Can reduce cost; maximum value 20 sec | Not provided as a feature; the poll interval is configurable |
| Exception handling | Dead-letter queues | Handled manually: create a separate topic for failed messages |
| Message size | Default: 256 KB; to go further, connect with S3 (supports up to 2 GB) | Default: 1 MB; to go further, change the producer, broker, and consumer configs |
| Serialization/deserialization | Default: String | Default: String; Avro and Protobuf also supported |
| Throughput | Standard queue: unlimited. FIFO queue: 300 messages/sec (3,000/sec with batches of 10) | Very high |
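The manual exception handling noted for Kafka can be sketched as a retry loop that sets failed messages aside. Everything here is hypothetical: in a real setup the `dead_letters` list would be a separate Kafka topic that the consumer produces to.

```python
def process_with_dead_letter(messages, handler, max_attempts=3):
    """Run handler on each message; collect those that fail every attempt."""
    dead_letters = []  # stand-in for a separate dead-letter topic
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(message)
                break  # processed successfully, move to the next message
            except Exception as error:
                if attempt == max_attempts:
                    dead_letters.append({"message": message, "error": str(error)})
    return dead_letters


def handler(message):
    """Example handler that rejects one specific payload."""
    if message == "bad":
        raise ValueError("cannot parse")


failed = process_with_dead_letter(["ok", "bad"], handler)
```

Keeping the failure reason alongside the message makes later inspection or replay easier.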
Questions?
