When One Data Center Is Not Enough
Building Large-scale Stream Infrastructures Across Multiple Data Centers with Apache Kafka
Gwen Shapira
There’s a book on that!
Actually… a chapter
Outline
Kafka overview
Common multi data center patterns
Future stuff
What is Kafka?
▪ It’s like a message queue, right?
-Actually, it’s a “distributed commit log”
-Or “streaming data platform”
[Diagram: a data source appends messages to a log with offsets 0 to 8; Data Consumer A and Data Consumer B read from the log.]
Topics and Partitions
▪ Messages are organized into topics, and each topic is split into partitions.
- Each partition is an immutable, time-sequenced log of messages on disk.
- Note that time ordering is guaranteed within, but not across, partitions.
[Diagram: a data source writes to a topic with three partitions (Partition 0, Partition 1, Partition 2), each a log with offsets 0 to 8.]
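The per-partition ordering guarantee can be illustrated with a small in-memory sketch. This is not Kafka code: `partition_for` and `append_all` are hypothetical helpers, and `zlib.crc32` is only a stand-in for the hash Kafka's default partitioner applies to keyed messages. The point is that messages with the same key land in one partition and keep their relative order there, while nothing orders messages across partitions.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Stable hash of the key, reduced to a partition index.
    # crc32 is an assumption for illustration, not Kafka's actual hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append_all(messages, num_partitions):
    # Each partition is modelled as an append-only list.
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


messages = [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2), ("user-a", 3)]
log = append_all(messages, num_partitions=3)

for partition, records in sorted(log.items()):
    print(partition, records)
```

Whatever partition `user-a` hashes to, its values appear there as 1, 2, 3 in that order.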
Scalable consumption model
[Diagram: topic T1 with partitions 0 to 3, shown twice: read by Consumer Group 1 with a single consumer (Consumer 1), and read by Consumer Group 1 with four consumers (Consumer 1 to Consumer 4).]
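A rough sketch of how partitions might be divided among the members of one consumer group is shown below. `assign_round_robin` is a hypothetical helper; Kafka's real assignors are more involved, so treat this only as an illustration of the scaling idea: adding consumers to a group spreads the same partitions over more readers.

```python
def assign_round_robin(partitions, consumers):
    # Deal partitions out to consumers one at a time, like cards.
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


# One consumer in the group reads all four partitions of T1.
one = assign_round_robin(range(4), ["consumer-1"])

# Four consumers in the group read one partition each.
four = assign_round_robin(
    range(4), ["consumer-1", "consumer-2", "consumer-3", "consumer-4"]
)

print(one)
print(four)
```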
Kafka usage
Common use case
Large scale real time data integration
Other use cases
Scaling databases
Messaging
Stream processing
…
Important things to remember:
1. Consumers commit offsets
2. Within a cluster – each partition has replicas
3. Inter-cluster replication, producer and consumer defaults – all tuned for LAN
Why multiple data centers (DC)
Offload work from main cluster
Disaster recovery
Geo-localization
• Saving cross-DC bandwidth
• Better performance by being closer to users
• Some activity is just local
• Security / regulations
Cloud
Special case: Producers with network issues
Why is this difficult?
1. It isn’t, really – you consume data from one cluster and produce to another
2. Network between two data centers can get tricky
3. Consumers have state (offsets) – syncing this between clusters gets tough
• And leads to some counterintuitive results
Pattern #1: stretched cluster
Typically done on AWS in a single region
• Deploy Zookeeper and broker across 3 availability zones
Rely on intra-cluster replication to replicate data across DCs
[Diagram: one Kafka cluster spanning DC 1, DC 2, and DC 3, with producers and consumers in each DC.]
On DC failure
Producer/consumer fail over to new DCs
• Existing data preserved by intra-cluster replication
• Consumer resumes from last committed offsets and will see same data
[Diagram: the Kafka cluster spanning DC 1, DC 2, and DC 3, with producers and consumers shown in only two of the DCs.]
When DC comes back
Intra cluster replication auto re-replicates all missing data
When re-replication completes, switch producer/consumer back
[Diagram: the Kafka cluster spanning DC 1, DC 2, and DC 3, with producers and consumers again in all three DCs.]
Be careful with replica assignment
Don’t want all replicas in same AZ
Rack-aware support in 0.10.0
• Configure brokers in same AZ with same broker.rack
Manual assignment pre 0.10.0
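When replicas are assigned manually, a quick check like the following can flag partitions whose replicas all sit in one AZ. This is a sketch under assumptions: `partitions_in_single_rack`, the broker ids, and the AZ names are all made up for illustration, and the rack map plays the role of each broker's `broker.rack` value.

```python
def partitions_in_single_rack(assignment, broker_rack):
    """Return the partitions whose replicas all live in one rack (AZ)."""
    risky = []
    for partition, replicas in sorted(assignment.items()):
        racks = {broker_rack[broker] for broker in replicas}
        if len(racks) < 2:
            risky.append(partition)
    return risky


# Hypothetical cluster: six brokers, two per availability zone.
broker_rack = {1: "az-a", 2: "az-a", 3: "az-b", 4: "az-b", 5: "az-c", 6: "az-c"}

# Hypothetical assignment: partition -> broker ids holding its replicas.
assignment = {0: [1, 3, 5], 1: [1, 2], 2: [4, 6, 2]}

print(partitions_in_single_rack(assignment, broker_rack))
```

Here partition 1 is the only one flagged, because brokers 1 and 2 are both in `az-a`; losing that AZ would take out every replica of the partition.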
Stretched cluster NOT recommended across regions
Asymmetric network partitioning
Longer network latency => longer produce/consume time
Cross region bandwidth: no read affinity in Kafka
[Diagram: region 1, region 2, and region 3, each with Kafka and ZK (ZooKeeper).]
Pattern #2: active/passive
Producers in active DC
Consumers in either active or passive DC
[Diagram: producers and consumers on Kafka in DC 1; replication to Kafka in DC 2, which has consumers only. Labels: Critical Apps, Nice Reports.]
Cross Datacenter Replication
Consumer & Producer: read from a source cluster and write to a target cluster
Per-key ordering preserved
Asynchronous: target always slightly behind
Offsets not preserved
• Source and target may not have same # partitions
• Retries for failed writes
Options:
• Confluent Multi-Datacenter Replication
• MirrorMaker
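Why offsets are not preserved can be seen in a toy model of the consume-then-produce loop. `PartitionLog` and `mirror` are hypothetical names, and the scenario (a source partition whose oldest retained offset is 1000, mirrored into a fresh target partition) is an assumption chosen for illustration. Record order survives the copy, but each record gets whatever offset the target assigns on append.

```python
class PartitionLog:
    """A single partition: an append-only list with a base offset."""

    def __init__(self, base_offset=0):
        self.base_offset = base_offset  # offset of the oldest retained record
        self.records = []

    def append(self, record):
        # The log, not the writer, decides the offset of a new record.
        offset = self.base_offset + len(self.records)
        self.records.append(record)
        return offset

    def read(self):
        return [(self.base_offset + i, r) for i, r in enumerate(self.records)]


def mirror(source, target):
    """Consume every record from source and produce it to target.

    Returns a map of source offset -> target offset.
    """
    return {src: target.append(record) for src, record in source.read()}


source = PartitionLog(base_offset=1000)
for record in ["a", "b", "c"]:
    source.append(record)

target = PartitionLog()
offset_map = mirror(source, target)
print(offset_map)  # {1000: 0, 1001: 1, 1002: 2}
```

A consumer that committed offset 1002 on the source cannot reuse that number on the target, where the same record sits at offset 2.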
On active DC failure
Fail over producers/consumers to passive cluster
Challenge: which offset to resume consumption from
• Offsets not identical across clusters
[Diagram: Kafka in DC 1, replication, Kafka in DC 2, with producers and consumers.]
Solutions for switching consumers
Resume from smallest offset
• Duplicates
Resume from largest offset
• May miss some messages (likely acceptable for real time consumers)
Replicate offsets topic
• May miss some messages, may get duplicates
Set offset based on timestamp
• Old API hard to use and not precise
• Better and more precise API in Apache Kafka 0.10.1 (Confluent 3.1)
• Nice tool coming up!
Preserve offsets during replication
• Harder to do
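The timestamp-based option can be sketched as a lookup from a point in time to an offset on the surviving cluster. The helper `offset_for_time` below is hypothetical; it mimics, as far as I understand it, the behaviour of the Java consumer's `offsetsForTimes` (earliest offset whose timestamp is greater than or equal to the requested one), and it assumes timestamps never decrease within the partition.

```python
import bisect


def offset_for_time(records, target_ts):
    """records: list of (offset, timestamp_ms), sorted by offset.

    Returns the earliest offset whose timestamp is >= target_ts,
    or None if every record is older than target_ts.
    """
    timestamps = [ts for _, ts in records]
    i = bisect.bisect_left(timestamps, target_ts)
    return records[i][0] if i < len(records) else None


# Hypothetical contents of one partition on the passive cluster.
records = [(0, 1000), (1, 2000), (2, 4000), (3, 6000)]

print(offset_for_time(records, 4000))  # 2
print(offset_for_time(records, 4500))  # 3
print(offset_for_time(records, 7000))  # None
```

In a failover, a consumer could pick a timestamp slightly before the failure and resume from the returned offset, accepting some duplicates in exchange for not missing messages.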
When DC comes back
Need to reverse replication
• Same challenge: determining the offsets
[Diagram: Kafka in DC 1, replication, Kafka in DC 2, with producers and consumers.]
Limitations
Reconfiguration of replication after failover
Resources in passive DC underutilized
Pattern #3: active/active
Local  aggregate replication to avoid cycles
Producers/consumers in both DCs
• Producers only write to local clusters
[Diagram: DC 1 and DC 2 each have a local Kafka cluster and an aggregate Kafka cluster, with producers and consumers in both DCs and replication between the clusters.]
On DC failure
Same challenge when moving consumers between aggregate clusters
• Offsets in the 2 aggregate clusters are not identical
• Unless the consumers are continuously running in both clusters
[Diagram: DC 1 and DC 2 each have a local Kafka cluster and an aggregate Kafka cluster, with producers and consumers in both DCs and replication between the clusters.]
[Diagram: an SF Kafka cluster and a Houston Kafka cluster, each with all apps, with West coast users and South Central users.]
When DC comes back
No need to reconfigure replication
[Diagram: DC 1 and DC 2 each have a local Kafka cluster and an aggregate Kafka cluster, with producers and consumers in both DCs and replication between the clusters.]
Alternative: avoid aggregate clusters
Prefix topic names with DC tag
Configure replication to replicate remote topics only
Consumers need to subscribe to topics with both DC tags
[Diagram: Kafka in DC 1 and Kafka in DC 2, each with its own producers and consumers, linked by replication.]
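The DC-tag convention might look like the sketch below. The `dc1.` / `dc2.` prefix format, the topic names, and both helper functions are assumptions for illustration; the slides do not specify a naming scheme. Replication out of a DC copies only the topics that originate there, which is what prevents cycles, and a consumer subscribes to a logical topic under every DC tag.

```python
import re


def topics_to_mirror(source_dc, topics_on_source):
    # Copy only topics that originate in the source DC; topics that were
    # replicated into it carry another DC's tag and are skipped.
    return sorted(t for t in topics_on_source if t.startswith(source_dc + "."))


def topics_for(logical_topic, all_topics):
    # Match the logical topic under any DC tag, e.g. dc1.clicks and dc2.clicks.
    pattern = re.compile(r"^[^.]+\." + re.escape(logical_topic) + r"$")
    return sorted(t for t in all_topics if pattern.match(t))


# Hypothetical topics present on the DC 2 cluster.
topics_on_dc2 = ["dc1.clicks", "dc2.clicks", "dc1.orders", "dc2.orders"]

print(topics_to_mirror("dc2", topics_on_dc2))  # ['dc2.clicks', 'dc2.orders']
print(topics_for("clicks", topics_on_dc2))     # ['dc1.clicks', 'dc2.clicks']
```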
Beyond 2 DCs
More DCs  better resource utilization
• With 2 DCs, each DC needs to provision 100% traffic
• With 3 DCs, each DC only needs to provision 50% traffic
Setting up replication with many DCs can be daunting
• Only set up aggregate clusters in 2-3 DCs
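The 100% and 50% figures follow from simple arithmetic, sketched here under two assumptions that the slide does not state explicitly: traffic is spread evenly across DCs, and the surviving DCs must absorb the full load when one DC fails. `required_capacity_per_dc` is a hypothetical helper.

```python
def required_capacity_per_dc(num_dcs, tolerated_failures=1):
    """Fraction of total traffic each DC must be able to carry."""
    survivors = num_dcs - tolerated_failures
    if survivors < 1:
        raise ValueError("at least one DC must survive")
    # After the failure, total traffic is split evenly among the survivors.
    return 1.0 / survivors


print(required_capacity_per_dc(2))  # 1.0 -> each DC provisions 100% of traffic
print(required_capacity_per_dc(3))  # 0.5 -> each DC provisions 50% of traffic
```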
Comparison
Stretched
• Pros: better utilization of resources; easy failover for consumers
• Cons: still need cross region story
Active/passive
• Pros: needed for global ordering
• Cons: harder failover for consumers; reconfiguration during failover; resource under-utilization
Active/active
• Pros: better utilization of resources; can be used to avoid consumer failover
• Cons: can be challenging to manage; more replication bandwidth
Multi-DC beyond Kafka
Kafka often used together with other data stores
Need to make sure multi-DC strategy is consistent
Example application
Consumer reads from Kafka and computes 1-min count
Counts need to be stored in DB and available in every DC
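A minimal sketch of the 1-minute count is shown below, assuming each message carries a timestamp in milliseconds and that windows are aligned to the start of the minute; `one_minute_counts` is a hypothetical helper, not part of the talk. Because the result depends only on the message timestamps, the same consumer logic run against the same data in two DCs would produce the same counts.

```python
from collections import Counter


def one_minute_counts(timestamps_ms):
    # Bucket each timestamp by the start of its 1-minute window.
    counts = Counter((ts // 60_000) * 60_000 for ts in timestamps_ms)
    return dict(sorted(counts.items()))


# Hypothetical message timestamps, in milliseconds.
print(one_minute_counts([1_000, 59_999, 60_000, 125_000]))
# {0: 2, 60000: 1, 120000: 1}
```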
Independent database per DC
Run same consumer concurrently in both DCs
• No consumer failover needed
[Diagram: DC 1 and DC 2 each have a local Kafka cluster, an aggregate Kafka cluster, producers, a consumer, and a DB, with replication between the clusters.]
Stretched database across DCs
Only run one consumer per DC at any given point of time
[Diagram: DC 1 and DC 2 each have a local Kafka cluster, an aggregate Kafka cluster, producers, a consumer, and a DB, with replication between the clusters; one consumer is labeled "on failover".]
Practical tips
• Consume remote, produce local
  • Unless you need encrypted data on the wire
• Monitor!
  • Burrow for replication lag
  • Confluent Control Center for end-to-end
  • JMX metrics for rates and “busy-ness”
• Tune!
  • Producer / Consumer tuning
  • Number of consumers, producers
  • TCP tuning for WAN
• Don’t forget to replicate configuration
• Separate critical topics from nice-to-have topics
Future work
Offset reset tool
Offset preservation
“Remote Replicas”
2-DC stretch cluster
Other cool Kafka future:
• Exactly Once
• Transactions
• Headers
THANK YOU!
Gwen Shapira| gwen@confluent.io | @gwenshap
Kafka Training with Confluent University
• Kafka Developer and Operations Courses
• Visit www.confluent.io/training
Want more Kafka?
• Download Confluent Platform Enterprise at http://www.confluent.io/product
• Apache Kafka 0.10.2 upgrade documentation at http://docs.confluent.io/3.2.0/upgrade.html
• Kafka Summit recordings now available at http://kafka-summit.org/schedule/
Discount code: kafstrata
Special Strata Attendee discount code = 25% off
www.kafka-summit.org
Kafka Summit New York: May 8
Kafka Summit San Francisco: August 28
Multi-Datacenter Kafka - Strata San Jose 2017

  • 1. When One Data Center Is Not Enough Building Large-scale Stream Infrastructures Across Multiple Data Centers with Apache Kafka Gwen Shapira
  • 2. There’s a book on that! Actually… a chapter
  • 3. Outline Kafka overview Common multi data center patterns Future stuff
  • 4. What is Kafka? ▪ It’s like a message queue, right? -Actually, it’s a “distributed commit log” -Or “streaming data platform” 0 1 2 3 4 5 6 7 8 Data Source Data Consumer A Data Consumer B
  • 5. Topics and Partitions ▪ Messages are organized into topics, and each topic is split into partitions. - Each partition is an immutable, time-sequenced log of messages on disk. - Note that time ordering is guaranteed within, but not across, partitions. 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 0 1 2 3 4 5 6 7 8 Partition 0 Partition 1 Partition 2 Data Source Topic
  • 6. Scalable consumption model Topic T1 Partition 0 Partition 1 Partition 2 Partition 3 Consumer Group 1 Consumer 1 Topic T1 Partition 0 Partition 1 Partition 2 Partition 3 Consumer Group 1 Consumer 1 Consumer 2 Consumer 3 Consumer 4
  • 8. Common use case Large scale real time data integration
  • 9. Other use cases Scaling databases Messaging Stream processing …
  • 10. Important things to remember: 1. Consumers offset commits 2. Within a cluster – each partition has replicas 3. Inter-cluster replication, producer and consumer defaults – all tuned for LAN
  • 11. Why multiple data centers (DC) Offload work from main cluster Disaster recovery Geo-localization • Saving cross-DC bandwidth • Better performance by being closer to users • Some activity is just local • Security / regulations Cloud Special case: Producers with network issues
  • 12. Why is this difficult? 1. It isn’t, really – you consume data from one cluster and produce to another 2. Network between two data centers can get tricky 3. Consumers have state (offsets) – syncing this between clusters get tough • And leads to some counter intuitive results
  • 13. Pattern #1: stretched cluster Typically done on AWS in a single region • Deploy Zookeeper and broker across 3 availability zones Rely on intra-cluster replication to replica data across DCs Kafka producers consumer s DC 1 DC 3DC 2 producersproducers consumer s consumer s
  • 14. On DC failure Producer/consumer fail over to new DCs • Existing data preserved by intra-cluster replication • Consumer resumes from last committed offsets and will see same data Kafka producers consumer s DC 1 DC 3DC 2 producers consumer s
  • 15. When DC comes back Intra cluster replication auto re-replicates all missing data When re-replication completes, switch producer/consumer back Kafka producers consumer s DC 1 DC 3DC 2 producersproducers consumer s consumer s
  • 16. Be careful with replica assignment Don’t want all replicas in same AZ Rack-aware support in 0.10.0 • Configure brokers in same AZ with same broker.rack Manual assignment pre 0.10.0
  • 17. Stretched cluster NOT recommended across regions Asymmetric network partitioning Longer network latency => longer produce/consume time Cross region bandwidth: no read affinity in Kafka region 1 Kafk a ZK region 2 Kafk a ZK region 3 Kafk a ZK
  • 18. Pattern #2: active/passive. Producers in active DC. Consumers in either active or passive DC. [Diagram: Kafka in DC 1 with producers and consumers; replication to Kafka in DC 2 with consumers; labels: Critical Apps, Nice Reports]
  • 19. Cross-datacenter replication. Consumer & producer: read from a source cluster and write to a target cluster. Per-key ordering preserved. Asynchronous: target always slightly behind. Offsets not preserved: source and target may not have the same number of partitions; retries for failed writes. Options: Confluent Multi-Datacenter Replication; MirrorMaker.
  • 20. On active DC failure: fail over producers/consumers to the passive cluster. Challenge: which offset to resume consumption from, since offsets are not identical across clusters. [Diagram: Kafka in DC 1 with producers and consumers; replication to Kafka in DC 2]
  • 21. Solutions for switching consumers. Resume from smallest offset: duplicates. Resume from largest offset: may miss some messages (likely acceptable for real-time consumers). Replicate offsets topic: may miss some messages, may get duplicates. Set offset based on timestamp: old API hard to use and not precise; better and more precise API in Apache Kafka 0.10.1 (Confluent 3.1); nice tool coming up! Preserve offsets during replication: harder to do.
  • 22. When DC comes back: need to reverse replication. Same challenge: determining the offsets. [Diagram: Kafka in DC 1 and Kafka in DC 2 with replication between them; producers and consumers]
  • 23. Limitations: reconfiguration of replication after failover; resources in passive DC underutilized.
  • 24. Pattern #3: active/active. Local → aggregate replication to avoid cycles. Producers/consumers in both DCs; producers only write to local clusters. [Diagram: DC 1 and DC 2 each run a local Kafka cluster and an aggregate Kafka cluster, with producers and consumers in both DCs and replication feeding the aggregate clusters]
  • 25. On DC failure: same challenge when moving consumers on the aggregate cluster. Offsets in the 2 aggregate clusters are not identical, unless the consumers are continuously running in both clusters. [Diagram: DC 1 and DC 2 each run a local Kafka cluster and an aggregate Kafka cluster, with producers and consumers in both DCs and replication feeding the aggregate clusters]
  • 27. When DC comes back: no need to reconfigure replication. [Diagram: DC 1 and DC 2 each run a local Kafka cluster and an aggregate Kafka cluster, with producers and consumers in both DCs and replication feeding the aggregate clusters]
  • 28. Alternative: avoid aggregate clusters. Prefix topic names with a DC tag. Configure replication to replicate remote topics only. Consumers need to subscribe to topics with both DC tags. [Diagram: Kafka in DC 1 and Kafka in DC 2, each with producers and consumers, with replication between them]
  • 30. Beyond 2 DCs. More DCs → better resource utilization: with 2 DCs, each DC needs to provision 100% of traffic; with 3 DCs, each DC only needs to provision 50% of traffic. Setting up replication with many DCs can be daunting: only set up aggregate clusters in 2-3 DCs.
  • 31. Comparison (pros and cons).
      Stretched. Pros: better utilization of resources; easy failover for consumers. Cons: still need cross-region story.
      Active/passive. Pros: needed for global ordering. Cons: harder failover for consumers; reconfiguration during failover; resource under-utilization.
      Active/active. Pros: better utilization of resources; can be used to avoid consumer failover. Cons: can be challenging to manage; more replication bandwidth.
  • 32. Multi-DC beyond Kafka. Kafka is often used together with other data stores. Need to make sure the multi-DC strategy is consistent.
  • 33. Example application. Consumer reads from Kafka and computes 1-min count. Counts need to be stored in DB and available in every DC.
  • 34. Independent database per DC. Run the same consumer concurrently in both DCs: no consumer failover needed. [Diagram: DC 1 and DC 2 each run a local Kafka cluster, an aggregate Kafka cluster, a consumer, and a DB; producers in both DCs; replication feeds the aggregate clusters]
  • 35. Stretched database across DCs. Only run one consumer per DC at any given point of time. [Diagram: DC 1 and DC 2 each run a local Kafka cluster and an aggregate Kafka cluster with producers and replication; a consumer and DB in each DC, one side labeled "on failover"]
  • 36. Practical tips. Consume remote, produce local (unless you need encrypted data on the wire). Monitor: Burrow for replication lag; Confluent Control Center for end-to-end; JMX metrics for rates and “busy-ness”. Tune: producer / consumer tuning; number of consumers, producers; TCP tuning for WAN. Don’t forget to replicate configuration. Separate critical topics from nice-to-have topics.
  • 37. Future work: offset reset tool; offset preservation; “Remote Replicas”; 2-DC stretch cluster. Other cool Kafka future: Exactly Once; Transactions; Headers.
  • 38. THANK YOU! Gwen Shapira| gwen@confluent.io | @gwenshap Kafka Training with Confluent University • Kafka Developer and Operations Courses • Visit www.confluent.io/training Want more Kafka? • Download Confluent Platform Enterprise at http://www.confluent.io/product • Apache Kafka 0.10.2 upgrade documentation at http://docs.confluent.io/3.2.0/upgrade.html • Kafka Summit recordings now available at http://kafka-summit.org/schedule/
  • 39. Discount code: kafstrata Special Strata Attendee discount code = 25% off www.kafka-summit.org Kafka Summit New York: May 8 Kafka Summit San Francisco: August 28 Presented by
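
Slide 21 lists "set offset based on timestamp" as one way to choose a resume position after failover, since offsets are not identical across clusters. The sketch below illustrates the idea with plain Python lists standing in for partition logs; it is not Kafka client code. The `Record` type, the `offset_for_time` helper, and the sample data are all hypothetical. The lookup rule (earliest offset whose timestamp is at or after the target) is meant to resemble the timestamp lookup that arrived in Apache Kafka 0.10.1, and the example assumes that replicated messages carry the same timestamps in both clusters and that timestamps never decrease within a partition.

```python
from bisect import bisect_left
from typing import NamedTuple, Optional, Sequence


class Record(NamedTuple):
    """Hypothetical stand-in for one message in a partition log."""
    offset: int
    timestamp_ms: int


def offset_for_time(log: Sequence[Record], target_ms: int) -> Optional[int]:
    """Return the earliest offset whose timestamp is >= target_ms.

    Returns None when every record is older than target_ms.
    Assumes timestamps are non-decreasing within the log.
    """
    timestamps = [r.timestamp_ms for r in log]
    i = bisect_left(timestamps, target_ms)
    return log[i].offset if i < len(log) else None


# The same three messages exist in both clusters, but at different offsets,
# so a committed offset from the source is meaningless on the target.
source_log = [Record(100, 1000), Record(101, 2000), Record(102, 3000)]
target_log = [Record(7, 1000), Record(8, 2000), Record(9, 3000)]

# The consumer last processed the message stamped 2000 ms on the source.
last_processed_ms = 2000

# Rewind a little before that point: this trades a few duplicates
# for a lower chance of missing messages.
resume_ms = last_processed_ms - 500
resume_offset = offset_for_time(target_log, resume_ms)
print(f"resume on target cluster at offset {resume_offset}")
```

Rewinding the target timestamp is a design choice: it leans toward duplicates rather than loss, which suits consumers that can process a message twice safely.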

Editor's notes

  1. We earlier described Kafka as a pub-sub system, so it’s tempting to compare it to a message queue system, but the semantics are actually different from a traditional messaging service – it’s more accurate to call it a distributed commit log. Kafka was created by LinkedIn to address scalability and integration challenges they had with traditional systems. Kafka is highly scalable, fault-tolerant, high throughput, low latency, and very flexible.
  2. Note that each partition is just a log on disk: an immutable, time-ordered sequence of messages. This goes back to why we classify this layer as “storage”. Messages are “committed” once written to all in-sync replicas.
  3. Zookeeper is used to detect failures. Suppose communication between regions 1 and 3 gets disconnected. All brokers can still be registered in the ZK instance in region 2, so they all appear alive and leaders can be assigned in all 3 regions. If we assign a leader in region 1, we won’t be able to replicate from that leader to region 3, which will block data from becoming accessible to consumers. There is an asymmetry between how we detect failures and the actual network state.
  4. Disjoint set of topics. Consumers need to adjust their subscription. Can impact all applications.
  5. With 2 DCs, each one needs to support 100% of traffic to cover the failure case of losing 1 DC. With 3, each DC only needs to handle the extra traffic from 1 failed DC. As you add more DCs, you get an N^2 replication pipeline, which becomes daunting. You don’t need to do aggregation in all N DCs. It is generally OK to choose just 2 or 3 DCs to do aggregation, which reduces overhead.
  6. Need to think holistically. Need to make sure all data stores use a consistent strategy.
  7. Run independent consumers with independent databases. This is easy for consumers because failures don’t require any coordination. But you are paying the full 2x cost for storage in your DB and for computation.
  8. With a stretched database, you need to run only one consumer at a time, so on failover you need to coordinate the consumers, moving from one DC to the other.
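
The capacity arithmetic in note 5 can be written out directly. This is a minimal sketch with hypothetical function names, assuming traffic is spread evenly across DCs and that the goal is to survive the loss of exactly one DC. It also assumes that every aggregate cluster pulls from every local cluster, including the one in its own DC, which is one way to read the "N^2 pipeline" remark.

```python
def required_capacity_share(num_dcs: int) -> float:
    """Fraction of total traffic each DC must be able to carry.

    Assumption: when one DC is lost, the remaining (num_dcs - 1) DCs
    split the whole load evenly.
    """
    if num_dcs < 2:
        raise ValueError("need at least 2 DCs to survive losing one")
    return 1.0 / (num_dcs - 1)


def replication_flows(num_dcs: int, num_aggregate_dcs: int) -> int:
    """Number of local-to-aggregate replication flows to configure.

    Assumption: each aggregate cluster consumes from all local clusters.
    """
    if not 1 <= num_aggregate_dcs <= num_dcs:
        raise ValueError("aggregate DCs must be between 1 and num_dcs")
    return num_dcs * num_aggregate_dcs


for n in (2, 3, 5):
    share = required_capacity_share(n)
    everywhere = replication_flows(n, n)
    two_only = replication_flows(n, 2)
    print(f"{n} DCs: provision {share:.0%} each; "
          f"{everywhere} flows if aggregating everywhere, {two_only} if in 2 DCs")
```

Under these assumptions, aggregating in only 2 DCs keeps the number of flows linear in N, while aggregating in every DC grows it quadratically.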