Resilient Kafka: How DNS Traffic Management and Client Wrappers Ensure Availability

Vanessa Vuibert
Staff Production Engineer
@V3_XD
Scale
• 862 Kafka brokers
• 14 Kafka clusters
• 14M messages per sec
• 9 GCP regions
• Maintenance
• Incidents
• Regionalize traffic
Traffic management use cases
Kubernetes (K8s) out of the box
🔓open source
Kafka broker
K8s out of the box
dig +short service.namespace.svc.cluster.local
IP0
IP1
IP2
K8s out of the box
bootstrap.servers=service.namespace.svc.cluster.local:9092
K8s out of the box
dig +short pod2.service.namespace.svc.cluster.local
IP2
K8s out of the box
advertised.listeners=pod2.service.namespace.svc.cluster.local:9092
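A minimal sketch of how the two names above relate, assuming the brokers run as a StatefulSet behind a headless Service. The service-wide name is what clients bootstrap from, and each broker advertises its own per-pod name. The function names and the `kafka` StatefulSet name are hypothetical; the slides use the `host:port` shorthand, while a real `advertised.listeners` value also carries a listener prefix such as `PLAINTEXT://`.

```python
def bootstrap_server(service: str, namespace: str, port: int = 9092) -> str:
    """Service-wide DNS name; resolves to every published broker IP."""
    return f"{service}.{namespace}.svc.cluster.local:{port}"


def advertised_listener(pod: str, service: str, namespace: str, port: int = 9092) -> str:
    """Per-pod DNS name; resolves to exactly one broker."""
    return f"{pod}.{service}.{namespace}.svc.cluster.local:{port}"


def statefulset_listeners(statefulset: str, replicas: int, service: str,
                          namespace: str, port: int = 9092) -> list:
    """StatefulSet pods are named <statefulset>-<ordinal>, starting at 0."""
    return [
        advertised_listener(f"{statefulset}-{i}", service, namespace, port)
        for i in range(replicas)
    ]
```

Clients only need the bootstrap name; after the first metadata request they connect to whatever each broker advertises.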
• Readiness
• Startup
• Liveness
K8s StatefulSet: probes
dig +short service.namespace.svc.cluster.local
IP0
IP2
K8s readiness probe
dig +short service.namespace.svc.cluster.local
IP0
IP2
IP3
K8s readiness probe
not ready
publishNotReadyAddresses: true
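By default, a pod that fails its readiness probe drops out of the Service's DNS answer; `publishNotReadyAddresses: true` keeps not-ready pods published. A sketch of a headless Service manifest with that field set, built as a Python dict. The `app` selector label, the port name, and the function name are assumptions, not taken from the talk.

```python
def headless_service(name: str, namespace: str, port: int = 9092,
                     publish_not_ready: bool = True) -> dict:
    """Build a headless Service manifest for a broker StatefulSet."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # "None" makes the Service headless: DNS returns pod IPs directly.
            "clusterIP": "None",
            # Keep pods in DNS even while their readiness probe is failing.
            "publishNotReadyAddresses": publish_not_ready,
            # Assumed label; it must match the labels on the broker pods.
            "selector": {"app": name},
            "ports": [{"name": "kafka", "port": port}],
        },
    }
```

The dict can be serialized with `json.dumps` and applied as is, since Kubernetes accepts JSON manifests.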
Regional pairs
External traffic: load balancers
External traffic: load balancers
bootstrap.servers
External traffic: load balancers
advertised.listeners
• Issues scaling
• Manual broker DNS records
• Limited traffic control
Built automation with K8s controllers.
Stateful buddy: load balancers
🔒closed source
Name buddy: DNS records
🔒closed source
Kafka access buddy: endpoints
🔒closed source
Kafka Access Buddy: consumer
Kafka Access Buddy: producer failover
east
- Elasticsearch on call
“Let me failover real quick.”
Faster failovers with a DNS traffic manager.
DNS traffic manager
🔒closed source
DNS traffic manager: normal
dig +short us-east1.somedomain.com
US-East1-IP
DNS traffic manager: failover
dig +short us-east1.somedomain.com
US-Central1-IP
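The two `dig` answers above can be modeled as one record whose target is swapped during a failover. The traffic manager in the talk is closed source, so this is only an illustrative sketch: the class, its method names, and the `us-central1.somedomain.com` record are assumptions, and the IP strings are the slide's placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class TrafficManager:
    """Toy model: each regional record has a home target and an optional override."""
    home: dict                                   # record name -> home target
    overrides: dict = field(default_factory=dict)

    def resolve(self, record: str) -> str:
        """Return the override if one is set, else the record's home target."""
        return self.overrides.get(record, self.home[record])

    def fail_over(self, record: str, to_record: str) -> None:
        """Point `record` at the home target of another region's record."""
        self.overrides[record] = self.home[to_record]

    def fail_back(self, record: str) -> None:
        """Remove the override so the record resolves to its home target again."""
        self.overrides.pop(record, None)


manager = TrafficManager(home={
    "us-east1.somedomain.com": "US-East1-IP",
    "us-central1.somedomain.com": "US-Central1-IP",  # assumed record name
})
```

Clients keep the same `bootstrap.servers` name throughout; only the answer behind it changes.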
- A Kafka client
“DNS trickery.”
Failover time savings
• Used to take: 40 minutes
• Now only takes: 1 minute
Incident during flash sale
Failover during flash sale
US Central1 -> US East1
Reduced toil with client wrappers.
• Failover reconnection
• Everything needed for connection
• Ruby, Go, and Python
Client wrappers
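The slides do not show the wrappers themselves. One way a wrapper could cover both "everything needed for connection" and "failover reconnection" is to own the connection settings and rebuild the client whenever the bootstrap name starts resolving to different IPs. In this sketch the class name, the injected `resolve` and `connect` callables, and the reconnection trigger are all assumptions; no real Kafka library is used.

```python
from typing import Any, Callable, List, Optional


class ReconnectingClient:
    """Hypothetical wrapper: holds connection config, reconnects on DNS change."""

    def __init__(self, bootstrap: str, client_id: str,
                 resolve: Callable[[str], List[str]],
                 connect: Callable[[dict], Any]) -> None:
        self._config = {"bootstrap.servers": bootstrap, "client.id": client_id}
        self._host = bootstrap.rsplit(":", 1)[0]   # strip the port
        self._resolve = resolve                    # hostname -> list of IPs
        self._connect = connect                    # config -> client object
        self._client: Optional[Any] = None
        self._ips: List[str] = []

    def ensure_connected(self) -> Any:
        """Return a client, rebuilding it if the DNS answer has changed."""
        ips = sorted(self._resolve(self._host))
        if self._client is None or ips != self._ips:
            close = getattr(self._client, "close", None)
            if callable(close):
                close()                            # drop the stale connection
            self._client = self._connect(dict(self._config))
            self._ips = ips
        return self._client
```

In an application, `resolve` could be a thin function over `socket.getaddrinfo` and `connect` the constructor of whichever Kafka client library is in use.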
K8s Deployment template: bootstrap.servers
K8s Deployment template: client ID
K8s Deployment template
Improved availability with local consumers.
• More availability
• Reduced latency
• Reduced storage costs
• Reduced network costs
Local consumers
Aggregate consumer
Local consumers
Local consumers: DNS records
Latency (99th percentile)
• Aggregate: 500 ms
• Regional: 20 ms
Connect directly through private IPs.
• More secure
• Reduced network costs
• Fetch from closest replica: KIP-392
Public to private traffic
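KIP-392 lets a consumer fetch from a follower replica in its own rack or zone instead of always going to the partition leader. A sketch of the settings involved, as plain Python dicts: `broker.rack`, `replica.selector.class`, and `client.rack` are the standard Kafka configuration keys, while the function names and the zone value are assumptions for illustration.

```python
def broker_rack_config(zone: str) -> dict:
    """Broker-side settings: tag the broker with its zone, enable rack-aware reads."""
    return {
        "broker.rack": zone,
        "replica.selector.class":
            "org.apache.kafka.common.replica.RackAwareReplicaSelector",
    }


def consumer_rack_config(zone: str) -> dict:
    """Consumer-side setting: declare the consumer's zone so a local replica is preferred."""
    return {"client.rack": zone}


# Hypothetical zone name; broker and consumer must use the same naming scheme.
broker = broker_rack_config("us-east1-b")
consumer = consumer_rack_config("us-east1-b")
```

When `client.rack` matches the `broker.rack` of an in-sync replica, fetches can stay inside the zone, which is where the network cost saving would come from.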
Traffic manager: pod IPs
Network cost reduction
• Network represents 29% of the bill
• Bill reduction: -6%
• GKE 1.24 -> 1.25 incident
• Apply firewall rules
• LB more secure for public traffic
Failover: pod IPs
One-stop shop with Multi-Cluster Services (MCS).
MCS endpoints
🔒closed source
Traffic sources
Regional pairs: uneven distribution
Regionalize traffic: Kafka access buddy
east
Regionalize traffic: MCS
MCS time savings
• Regionalize traffic: 40 minutes, 1 minute after migration
• Deploy: 18 minutes, 13 minutes after migration
• Resiliency: DNS traffic management
• Toil: client wrappers
• Availability: local consumption
Thanks!