KAFKA +
Building the World's Realtime Transit Infrastructure
For Illustration only
SURGE - CIRCA 2013
SURGE - CIRCA 2016
DATA CONSUMERS
Real-time, Fast
Analytics
BATCH PIPELINE
Storm
Applications
Data Science
Analytics
Reporting
KAFKA
VERTICA
RIDER APP
DRIVER APP
API / SERVICES
DISPATCH
(gps logs)
Mapping & Logistic
Ad-hoc exploration
ELK
Samza
Alerts,
Dashboards
Debugging
REAL-TIME PIPELINE
HADOOP
Surge Mobile App
DATA
PRODUCERS
KAFKA 8 ECOSYSTEM @UBER
Product
Features
Predictive
Models
Operational
Analytics
Business
Intelligence
INFRASTRUCTURE ECOSYSTEM
NEAR REALTIME
PRICE SURGING
PRODUCT FEATURES
FRAUD -
ANOMALY
DETECTION
PREDICTIVE MODELS
PREDICTIVE MODELS
ETA
OPERATIONAL ANALYTICS
UberEATs
OPERATIONAL ANALYTICS
XP
OPERATIONAL ANALYTICS
BUSINESS INTELLIGENCE
KAFKA 8
KAFKA 7 MIGRATOR
Limited Availability
Difficult to Scale
Not multi-DC
Multi-lang incompatibility
Multi-DC, multi-language support
2013 2014 2015 - 2016
KAFKA 7 WORLD
Difficult to Operate
Producer Scale Issues
High Availability
High Scalability
Kafka 7 + Mirrormaker
Deployed everywhere
Kafka 7 migrator
Deployed everywhere
New Kafka 8
pipeline
Kafka 7
Mirrormaker 2.0
Rest architecture
Data Audit
Automated Topic Mgmt
Logs Business events
Async REST library
Data Audit
Local spooling
High throughput
custom protocol
REST ARCHITECTURE
Rest Proxy
Automated Schema and Topic Management
Mirrormaker 2.0
Robust
Data Audit
Dynamic topics
MIRROR MAKER 2.0
Destination DC
Source DC
Msg counts across multiple DCs
End-end latencies across multiple DCs
DATA AUDIT FOR KAFKA MESSAGES
Mirrormaker 2.0
Rest architecture
Data Audit
Kafka 8
Automated Topic Mgmt
A ROBUST FUTURE
0 data loss messaging system
Data discovery and lineage
Quota management
Self-correcting brokers
Active active data pipelines
Real-time Data
Dynamic SQL(ish)
Real-time decision
THE FUTURE
Real-time Data
Custom Application
Real-time decision
THE PRESENT
TELEMATICS
SELF DRIVING CAR
Thank you, Kafka Community!

Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout

Editor's Notes

  1. Duration: Keynote is 15 mins long. Good morning! My name is Aaron Schildkrout. I run Data and Marketing at Uber. I’m here today to talk to you about our real-time journey at Uber - and particularly the critical and hugely empowering role Kafka (including Confluent and the whole Kafka community) has played in this journey.
  2. Uber is realtime transit infrastructure for the globe. We’ve stated many times that we want this infrastructure to be as reliable as running water. A utility. A right even. A project that started out as a cool app to get you black cars on demand - is quickly becoming among the largest global infrastructure inventions of all time. And - like the cars moving on the streets outside right now - it is all taking place now and now and now. It is real time.
  3. We’re not the only ones. The internet is quite literally penetrating our lives: our cities, our relationships, our bodies. This is a known story. But it’s getting more radical by the day. And as this penetration increases - in volume, in immediacy, in depth - there is an unbelievable increase in the need for systems that facilitate the flow of information, in real time, between our lives and our machines and back again. That’s why we’re all here.
  4. Compressing time and space is...a non-trivial technical problem. Uber, for instance, has always sought to provide this kind of truly responsive, real-time infrastructure. But in the beginning we were...just starting. This is surge circa 2012/3 in our driver app. Our first version of surge, v1, used data it queried directly from our dispatch service. There was only one Node.js process per city. The geofences were very big and not granular at all (causing a lot of problems and huge inefficiency).
  5. This is surge today - with the addition of much more granular geo-temporal surge targeting. We are updating - in real-time - our understanding of supply and demand in highly specific geographies to allow us to calculate surge in the hexagons shown in this screen. This system now runs on Kafka - as opposed to our janky node query - and while it took us a bit of time to make this truly work at our exponentially exploding global scale...we’ve gotten...at least closer. That’s the story I’ll tell today.
  6. To get the obvious architectural diagram out of the way - here’s how Kafka 8 is currently used @ Uber.
  7. The Real-time infrastructure ecosystem - which includes Kafka - at Uber powers many key pieces of our business. I think of this in this topology...
  8. Surge - as noted earlier.
  9. FRAUD MODELS
  10. ETA - real-time system
  11. Cities use real-time operational analytics to actively manage their cities - making adjustments in dispatch, messaging, etc. - to optimize city functioning. Much of Uber’s success has to do with the amazing speed and agility of our on-the-ground global city teams - and much of this comes from empowering them with real-time tools.
  12. We’ve recently applied this same type of infrastructure to our Uber Eats business, which is rapidly scaling now and involves significant operational complexity.
  13. Internal analytics on our experimentation pipeline - which now powers the creation of hundreds of new experiments weekly, and on which our teams act daily based on rapid data feedback loops - is a real-time system.
  14. Pretty awesome. But it took a long journey to get there. 2013 - we first launched Kafka 7. Each application essentially ran its own Kafka cluster. 2014 - we started a transition to K8, moving all our K7 data to K8 through the K7 migrator. 2015 to today - we deployed a fully functional K8 pipeline: stable, with scalable producers and consumers and multi-DC, multi-language support.
  15. Along the way we ran into some significant limitations…and we did a bunch of work that I’ll work through now to complete our migration to Kafka 8 - and, more fundamentally, to make Kafka work at our scale.
  16. We implemented REST proxy improvements, adding a new binary protocol for high throughput. By building REST client libraries, we facilitated multi-language support (which was important given our 4-language environment).
  17. We automated schema and topic management. In a world with many thousands of topics and hundreds of engineers and teams producing data, the absence of strong tools around schema inferencing, enforcement and management was a huge pain point.
  18. We built Mirrormaker 2.0, which we’ll soon be open sourcing… It’s more robust, easier to operate, and allows for dynamic topic addition.
  19. And… we built a series of data auditing tools, allowing us to track data loss and latency spikes at different points in the Kafka pipeline, which at scale became critical for triaging and solving problems at a rapid pace.
  20. All Kafka data producers at Uber are now running Kafka 8. The project has been a huge success and is now powering much of Uber’s data infrastructure. It is...mission critical.
  21. Add notes
  22. The goal is to shrink the barrier between real time Infra and analytical usage.
  23. We’re currently capturing accelerometer data from the driver’s / rider’s phone via Kafka. This data is then used for: detecting traffic / road conditions? (need to confirm); 1) we use our motionstash data to generate safety models and safety scores for all our drivers (supervised machine learning and classification algorithms); 2) we do per-trip ad hoc analysis for safety by computing safety scores per driver. We use the models generated in 1) to predict in real time and alert a driver about their unsafe driving.
  24. Duration: Keynote is 15 mins long
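
The data audit described in note 19 (message counts and end-to-end latencies across data centers) can be illustrated with a minimal sketch. This is not Uber's implementation: the `Message` record, the 10-minute window, and every function name below are assumptions made for illustration. The idea shown is only that each tier counts messages per topic and time window, and an auditor compares the counts between tiers to find loss.

```python
from collections import Counter
from dataclasses import dataclass

WINDOW_SECONDS = 600  # assumed audit window (10 minutes); hypothetical value


@dataclass(frozen=True)
class Message:
    """Hypothetical audit record: one message as seen at one pipeline tier."""
    topic: str
    created_at: float   # producer-side timestamp, seconds since epoch
    observed_at: float  # time at which this tier saw the message


def window_start(ts: float) -> int:
    """Round a timestamp down to the start of its audit window."""
    return int(ts // WINDOW_SECONDS) * WINDOW_SECONDS


def count_by_window(messages):
    """Count messages per (topic, window), bucketed by creation time."""
    counts = Counter()
    for m in messages:
        counts[(m.topic, window_start(m.created_at))] += 1
    return counts


def find_loss(source_counts, destination_counts):
    """Return {(topic, window): missing} where the destination saw fewer messages."""
    return {
        key: n - destination_counts.get(key, 0)
        for key, n in source_counts.items()
        if destination_counts.get(key, 0) < n
    }


def max_latency(messages):
    """Largest end-to-end delay (seconds) observed at a tier; 0.0 if empty."""
    return max((m.observed_at - m.created_at for m in messages), default=0.0)


# Made-up example: three messages leave the source DC, two reach the destination.
source_dc = [
    Message("trips", 0.0, 1.0),
    Message("trips", 10.0, 11.0),
    Message("trips", 700.0, 701.0),
]
destination_dc = [
    Message("trips", 0.0, 5.0),
    Message("trips", 700.0, 712.0),
]

loss = find_loss(count_by_window(source_dc), count_by_window(destination_dc))
print(loss)
print(max_latency(destination_dc))
```

Bucketing by the producer-side creation time, rather than arrival time, means the same message falls in the same window at every tier, so counts stay comparable even when replication is delayed.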