Quick Guide to Refresh Kafka skills
1. Limitations of Kafka
A. Kafka is an at-least-once or at-most-once delivery system, but not exactly-once.
At-most-once scenario – loss of messages
At-least-once scenario – duplicate messages
An at-most-once scenario happens when the commit interval has elapsed and that, in turn, triggers Kafka to automatically commit the last used offset. Meanwhile, suppose the consumer did not get a chance to complete processing the messages and crashed. Now, when the consumer restarts, it starts to receive messages from the last committed offset; in essence, the consumer could lose a few messages in between.
An at-least-once scenario happens when the consumer processes a message and commits it into its persistent store, and then crashes at that point. Meanwhile, suppose Kafka did not get a chance to commit the offset to the broker because the commit interval has not passed. Now, when the consumer restarts, it is delivered a few older messages again, starting from the last committed offset.
https://dzone.com/articles/kafka-clients-at-most-once-at-least-once-exactly-o
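A minimal consumer sketch of the two behaviours above, assuming a local broker at localhost:9092 and a hypothetical demo-topic (and Kafka clients 2.x): with auto-commit disabled, committing after processing yields at-least-once, while committing (or letting auto-commit fire) before processing finishes risks at-most-once.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AtLeastOnceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");    // assumed local broker
        props.put("group.id", "demo-group");
        props.put("enable.auto.commit", "false");             // commit manually, after processing
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record.value());   // a crash here means these records are re-delivered (duplicates)
                }
                // Committing only after the batch is processed gives at-least-once.
                // Committing before processing (or relying on auto-commit firing mid-batch) risks losing messages.
                consumer.commitSync();
            }
        }
    }

    static void process(String value) { /* application logic (placeholder) */ }
}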
More Partitions May Increase Unavailability
In the common case, when a broker is shut down cleanly, the controller will proactively move the leaders off the shutting-down broker one at a time. Moving a single leader takes only a few milliseconds. So, from the clients' perspective, there is only a small window of unavailability during a clean broker shutdown.
However, when a broker is shut down uncleanly (e.g., kill -9), the observed unavailability can be proportional to the number of partitions. Suppose that a broker has a total of 2000 partitions, each with 2 replicas. Roughly, this broker will be the leader for about 1000 partitions. When this broker fails uncleanly, all those 1000 partitions become unavailable at exactly the same time. Suppose that it takes 5 ms to elect a new leader for a single partition. It will take up to 5 seconds to elect new leaders for all 1000 partitions. So, for some partitions, the observed unavailability can be 5 seconds plus the time taken to detect the failure.
More Partitions May Increase End-to-end Latency
The end-to-end latency in Kafka is defined by the time from when a message is published by the producer to when the message is read by the consumer. Kafka only exposes a message to a consumer after it has been committed, i.e., when the message is replicated to all the in-sync replicas. So, the time to commit a message can be a significant portion of the end-to-end latency. By default, a Kafka broker only uses a single thread to replicate data from another broker, for all partitions that share replicas between the two brokers. Our experiments show that replicating 1000 partitions from one broker to another can add about 20 ms latency, which implies that the end-to-end latency is at least 20 ms. This can be too high for some real-time applications.
https://de.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster
High-Level Streams DSL – Using the Streams DSL, a user can express transformations, aggregations, grouping, etc. With each transformation, data has to be serialized and written into a topic. Then, for the next operation in the chain, it has to be read back from the topic, meaning that all side operations happen for every entity (like partition key calculation, persisting to disk, etc.).
Kafka Streams assumes that the Serde class used for serialization or deserialization is the one provided in the config. When changing the format of data in the operation chain, the user has to provide the appropriate Serde. If existing Serdes can't handle the used format, the user has to create a custom Serde. No big deal: just extend the Serde class and implement a custom serializer and deserializer. From one class, we end up with four, which is not optimal. So, for each custom format of data in the operation chain, we create three additional classes (see the sketch below). An alternative approach is to use generic JSON or Avro Serdes. One more thing: the user has to specify a Serde for both the key and value parts of the message.
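A minimal sketch of such a custom Serde, assuming Kafka clients 2.x (where configure() and close() have default implementations), a hypothetical Order class, and a deliberately simple string encoding in place of a real JSON/Avro serializer; the three extra classes are the serializer, the deserializer, and the Serde that ties them together.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serializer;

// Hypothetical domain class used only for illustration.
class Order {
    String id;
    double amount;
}

// Extra class 1: the serializer (a real project would use a JSON/Avro mapper instead of "id,amount").
class OrderSerializer implements Serializer<Order> {
    @Override
    public byte[] serialize(String topic, Order order) {
        if (order == null) return null;
        return (order.id + "," + order.amount).getBytes(StandardCharsets.UTF_8);
    }
}

// Extra class 2: the deserializer.
class OrderDeserializer implements Deserializer<Order> {
    @Override
    public Order deserialize(String topic, byte[] bytes) {
        if (bytes == null) return null;
        String[] parts = new String(bytes, StandardCharsets.UTF_8).split(",", 2);
        Order order = new Order();
        order.id = parts[0];
        order.amount = Double.parseDouble(parts[1]);
        return order;
    }
}

// Extra class 3: the Serde, passed to the Streams DSL wherever the Order format is used.
class OrderSerde implements Serde<Order> {
    @Override public Serializer<Order> serializer() { return new OrderSerializer(); }
    @Override public Deserializer<Order> deserializer() { return new OrderDeserializer(); }
}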
Restarting of the application: after a breakdown, the application will go through each internal topic to the last valid offset, and this can take some time, especially if log compaction is not used and/or a retention period is not set up.
https://dzone.com/articles/problem-with-kafka-streams-1
2. Exactly once delivery in Kafka
A. Exactly-once semantics was introduced in the Apache Kafka 0.11 release and Confluent Platform 3.3.
https://medium.com/@jaykreps/exactly-once-support-in-apache-kafka-55e1fdd0a35f
For previous versions of Kafka, we can partially achieve this with an exactly-once Kafka static consumer via assign (one and only one message delivery).
Steps
Step 1: Set enable.auto.commit = false.
Step 2: Do not call consumer.commitSync() after processing a message.
Step 3: Register the consumer to a specific partition using the 'assign' call.
Step 4: On startup of the consumer, seek to a specific message offset by calling consumer.seek(topicPartition, offset).
Step 5: While processing the messages, get hold of the offset of each message and store it atomically along with the processed message, using an atomic transaction. When data is stored in a relational database, atomicity is easier to implement. For a non-relational data store such as HDFS or a NoSQL store, one way to achieve atomicity is to store the offset along with the message itself.
Step 6: Implement idempotent processing as a safety net.
https://dzone.com/articles/kafka-clients-at-most-once-at-least-once-exactly-o
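A sketch of steps 1–6 above, assuming a local broker, a hypothetical orders topic, and two placeholder helpers (loadOffsetFromStore, processAndStoreAtomically) that would read and write the offset in the same store, and the same transaction, as the processed data.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ExactlyOnceAssignConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");              // Step 1
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");

        TopicPartition partition = new TopicPartition("orders", 0);
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Collections.singletonList(partition));                 // Step 3
            long lastProcessedOffset = loadOffsetFromStore(partition);             // hypothetical helper
            consumer.seek(partition, lastProcessedOffset + 1);                     // Step 4

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Step 5: persist the result and record.offset() in one atomic transaction.
                    // Step 2: never call consumer.commitSync() -- the external store is the source of truth.
                    processAndStoreAtomically(record.value(), partition, record.offset());
                }
            }
        }
    }

    // Hypothetical helpers; Step 6 (idempotent writes) would live inside processAndStoreAtomically.
    static long loadOffsetFromStore(TopicPartition tp) { return -1L; }
    static void processAndStoreAtomically(String value, TopicPartition tp, long offset) { }
}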
3. Why Avro producer and consumer is preferred
A. Avro is an open-source binary message exchange protocol. Avro helps to send optimized messages across the wire, hence reducing the network overhead. Avro can enforce a schema for messages, and that schema can be defined using JSON. Avro can generate binding objects in various programming languages from these schemas, and message payloads are automatically bound to these generated objects on the consumer side. Avro is natively supported and highly recommended for use along with Kafka.
https://dzone.com/articles/kafka-clients-at-most-once-at-least-once-exactly-o
4. Schema registry
A. Kafka takes bytes as an input and sends bytes as an output – there is no data verification. Obviously, your data has meaning beyond bytes, so your consumers need to parse it and later on interpret it. Parsing problems mainly occur in these two situations: the field you're looking for doesn't exist anymore, or the type of the field has changed (e.g. what used to be a String is now an Integer).
What are our options to prevent and overcome these issues?
Catch exceptions on parsing errors. Your code becomes ugly and very hard to maintain. 👎
Never ever change the data producer and triple-check that your producer code will never forget to send a field. That's what most companies do. But after a few key people quit, all your "safeguards" are gone. 👎👎
Adopt a data format and enforce rules that allow you to perform schema evolution while guaranteeing not to break your downstream applications. 👏 (Sounds too good to be true?) That data format is Apache Avro.
Avro is one of the fastest formats to serialize and deserialize, and it supports schema evolution.
The Kafka Avro Serializer
The engineering beauty of this architecture is that your producers now use a new serializer, provided courtesy of Confluent, named the KafkaAvroSerializer. Upon producing Avro data to Kafka, the following happens (simplified version):
Your producer checks if the schema is available in the Schema Registry. If it is not available, the producer registers and caches it.
The Schema Registry verifies whether the schema is either the same as before or a valid evolution. If not, it returns an exception and the KafkaAvroSerializer will crash your producer. Better safe than sorry.
If the schema is valid and all checks pass, the producer only includes a reference to the schema (the schema ID) in the message sent to Kafka, not the whole schema. The advantage is that your messages sent to Kafka are much smaller!
https://medium.com/@stephane.maarek/introduction-to-schemas-in-apache-kafka-with-the-confluent-schema-registry-3bf55e401321
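A producer sketch of the flow above, assuming a local broker, a Schema Registry at http://localhost:8081, and a hypothetical customers topic; the Avro schema is defined inline as JSON and the KafkaAvroSerializer handles registration and the schema-ID reference.

import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Confluent's serializer registers/looks up the schema and sends only its ID with each message.
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");   // assumed local Schema Registry

        // Avro schema defined using JSON, as described above.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Customer\",\"fields\":["
          + "{\"name\":\"id\",\"type\":\"int\"},"
          + "{\"name\":\"name\",\"type\":\"string\"}]}");

        GenericRecord customer = new GenericData.Record(schema);
        customer.put("id", 42);
        customer.put("name", "Jane");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("customers", "42", customer));
        }
    }
}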
5. In-Sync replica (AKA ISR)
A. Every topic partition in Kafka is replicated n times, where n is the replication factor of the topic. This allows Kafka to automatically failover to these replicas when a server in the cluster fails so that messages remain available in the presence of failures. Replication in Kafka happens at the partition granularity where the partition's write-ahead log is replicated in order to n servers. Out of the n
replicas, one replica is designated as the leader while
others are followers. As the name suggests, the leader
takes the writes from the producer and the followers
merely copy the leader’s log in order.
When a producer sends a message to the broker, it is
written by the leader and replicated to all the partition’s
replicas. A message is committed only after it has been
successfully copied to all the in-sync replicas.
What does it mean for a replica to be caught up to the leader? A replica that has not "caught up" to the leader's log may be marked as an out-of-sync replica. Take the example of a single-partition topic foo with a replication factor of 3. Assume that the replicas for this partition live on brokers 1, 2 and 3 and that 3 messages have been committed on topic foo. The replica on broker 1 is the current leader, replicas 2 and 3 are followers, and all replicas are part of the ISR. Also assume that replica.lag.max.messages is set to 4, which means that as long as a follower is behind the leader by no more than 3 messages, it will not be removed from the ISR. And replica.lag.time.max.ms is set to 500 ms, which means that as long as the followers send a fetch request to the leader every 500 ms or sooner, they will not be marked dead and will not be removed from the ISR.
A replica can be out-of-sync with the leader for several reasons:
Slow replica: A follower replica that is consistently not able to catch up with the writes on the leader for a certain period of time. One of the most common reasons for this is an I/O bottleneck on the follower replica, causing it to append the copied messages at a rate slower than it can consume from the leader.
Stuck replica: A follower replica that has stopped fetching from the leader for a certain period of time. A replica could be stuck either due to a GC pause or because it has failed or died.
Bootstrapping replica: When the user increases the replication factor of the topic, the new follower replicas are out-of-sync until they are fully caught up to the leader's log.
https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/
6. Kafka log compaction
A. In the Kafka cluster, the retention policy can be set on a per-topic basis, such as time-based, size-based, or log-compaction-based. Log compaction retains at least the last known value for each record key for a single topic partition. Compacted logs are useful for restoring state after a crash or system failure.
Another important use case is CDC (change data capture).
Kafka log compaction also allows for deletes. A message
with a key and a null payload acts like a tombstone, a
delete marker for that key. Tombstones get cleared after
a period. Log compaction periodically runs in the
background by recopying log segments. Compaction
does not block reads and can be throttled to avoid
impacting I/O of producers and consumers.
The Kafka Log Cleaner does the log compaction. The Log Cleaner has a pool of background compaction threads. These threads recopy log segment files, removing older records whose key reappears more recently in the log.
The topic config min.compaction.lag.ms is used to guarantee a minimum period that must pass before a message can be compacted. The consumer sees all tombstones as long as it reaches the head of the log within a period less than the topic config delete.retention.ms (the default is 24 hours). Log compaction will never re-order messages, just remove some, and the partition offset for a message never changes.
http://cloudurable.com/blog/kafka-architecture-log-compaction/index.html
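A sketch of creating a compacted topic with these settings through the Java AdminClient, assuming a local broker and a hypothetical customer-state topic; the config names are the ones discussed above.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CompactedTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed local broker

        Map<String, String> configs = new HashMap<>();
        configs.put("cleanup.policy", "compact");          // retain at least the latest value per key
        configs.put("min.compaction.lag.ms", "60000");     // a record cannot be compacted for at least 1 minute
        configs.put("delete.retention.ms", "86400000");    // tombstones kept for 24 hours (the default)

        NewTopic topic = new NewTopic("customer-state", 3, (short) 3).configs(configs);

        try (AdminClient admin = AdminClient.create(props)) {
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}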
7. Kafka record keys for partition strategy
A. The key is optional metadata that can be sent with a Kafka message, and by default it is used to route the message to a specific partition. E.g. if you're sending a message m with key k to a topic mytopic that has p partitions, then m goes to the partition Hash(k) % p in mytopic. The key has no connection to the offset of a partition whatsoever: offsets are used by consumers to keep track of the position of the last read message in a partition.
If a PartitionKeyStrategy is used with a topic, the value is used as the message key and is then implicitly used to select the partition according to the default behavior of the Kafka client:
If a valid partition number is specified, that partition will be used when sending the record. If no partition is specified but a key is present, a partition will be chosen using a hash of the key. If neither key nor partition is present, a partition will be assigned in a round-robin fashion (see the producer sketch below).
It might be desirable in some cases to control these
independently. For example, you might wish to have a
message key that is more fine-grained than the partition
key, for use with Kafka log compaction on sub-graphs of
the entity state.
https://blog.newrelic.com/engineering/effective-strategies-kafka-topic-partitioning/
https://stackoverflow.com/questions/51245962/how-to-choose-a-key-and-offset-for-a-kafka-producer
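A producer sketch of the default behavior described above, assuming a local broker and the mytopic topic from the example; it shows a keyed send (partition chosen from the key hash), an explicit-partition send, and a key-less send.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KeyedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed local broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key but no partition: the default partitioner hashes the key, so every
            // message with key "customer-42" lands on the same partition, in order.
            producer.send(new ProducerRecord<>("mytopic", "customer-42", "order created"));

            // Explicit partition: the partition number (1 here) wins over the key.
            producer.send(new ProducerRecord<>("mytopic", 1, "customer-42", "order shipped"));

            // No key and no partition: messages are spread across partitions
            // (round-robin or sticky, depending on the client version).
            producer.send(new ProducerRecord<>("mytopic", "no-key message"));
        }
    }
}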
8. Kafka Mirror Maker
A. Kafka's mirroring feature makes it possible to maintain a replica of an existing Kafka cluster.
Mirror Maker is just a regular Java producer/consumer pair. Data is read from topics in the origin cluster and written to a topic with the same name in the destination cluster. You can run many such mirroring processes to increase throughput and for fault tolerance (if one process dies, the others will take over the additional load).
An alternative is Confluent Replicator. Confluent Replicator
is a more complete solution that handles topic
configuration as well as data and integrates with Kafka
Connect and Control Center to improve availability,
scalability and ease of use.
https://docs.confluent.io/current/multi-dc-replicator/mirrormaker.html
9. Challenges faced while working with Kafka
A. Duplicates – at-least-once behavior – remove the duplicates by checking against previously persisted messages, or implement exactly-once behavior.
Consumer liveness – a consumer may take too long to process a message. The group coordinator expects group members to send it regular heartbeats to indicate that they remain active. A background heartbeat thread runs in the consumer, sending regular heartbeats to the coordinator. If the coordinator does not receive a heartbeat from a group member within the session timeout, the coordinator removes the member from the group and starts a rebalance of the group. The session timeout can be much shorter than the maximum polling interval so that the time taken to detect a failed consumer can be short even if message processing takes a long time.
You can configure the maximum polling interval using the max.poll.interval.ms property and the session timeout using the session.timeout.ms property. You will typically not need to use these settings unless it takes more than 5 minutes to process a batch of messages.
If you have problems with message handling caused by
message flooding, you can set a consumer option to
control the speed of message consumption. Use
fetch.max.bytes and max.poll.records to control how
much data a call to poll() can return.
https://console.bluemix.net/docs/services/EventStreams/eventstreams114.html#consuming_messages
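A consumer-configuration sketch covering the liveness and flood-control properties above; the broker address, group name, and the specific values are placeholders chosen for illustration.

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TunedConsumerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "slow-processing-group");       // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");

        // Liveness: heartbeats must arrive within session.timeout.ms, and poll() must be
        // called at least every max.poll.interval.ms, or the member is removed and a rebalance starts.
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "600000");          // allow 10 min per batch

        // Flood control: cap how much data a single call to poll() can return.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
        props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, "10485760");             // 10 MB

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // subscribe and poll as usual
        }
    }
}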