Scaling Uber’s Elasticsearch as a Geo-Temporal Database
Danny Yuan @ Uber
InfoQ.com: News & Community Site
Watch the video with slide synchronization on InfoQ.com!
https://www.infoq.com/presentations/uber-elasticsearch-clusters
Presented at QCon London
www.qconlondon.com
Use Cases for a Geo-Temporal Database
Real-time Decisions on Global Scale
Dynamic Pricing: Every Hexagon, Every Minute
Metrics: how many UberXs were in a trip in the past 10 minutes
Market Analysis: Travel Times
Forecasting: Granular Forecasting of Rider Demand
How Can We Produce Geo-Temporal Data for Ever-Changing Business Needs?
Key Question: What Is the Right Abstraction?
Abstraction: Single-Table OLAP on Geo-Temporal Data
SELECT   <agg functions>, <dimensions>
FROM     <data_source>
WHERE    <boolean filter>
GROUP BY <dimensions>
HAVING   <boolean filter>
ORDER BY <sorting criteria>
LIMIT    <n>
Why Elasticsearch?
- Arbitrary boolean query
- Sub-second response time
- Built-in distributed aggregation functions
- High-cardinality queries
- Idempotent insertion to deduplicate data
- Second-level data freshness
- Scales with data volume
- Operable by small team
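The "idempotent insertion" point can be illustrated with a minimal sketch: if the document ID is derived deterministically from the fields that identify an event, a redelivered event overwrites its earlier copy instead of creating a duplicate. The field names and the in-memory `index` dict below are assumptions for illustration; the slides do not say how IDs are actually built.

```python
import hashlib
import json


def document_id(event: dict, key_fields: tuple) -> str:
    """Derive a deterministic document ID from the fields that identify an event."""
    key = json.dumps([event[f] for f in key_fields])
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


# In-memory stand-in for an index: writing the same ID twice overwrites.
index = {}
event = {"trip_id": "t-1", "status": "completed", "ts": "2018-02-03T10:00:00Z"}
for delivery in (event, dict(event)):  # the same event delivered twice
    index[document_id(delivery, ("trip_id", "status"))] = delivery
```

With a real cluster, the same ID would be passed as the document `_id` on every write, so retries from the ingestion pipeline stay harmless.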
Current Scale: An Important Context
- Ingestion: 850K to 1.3M messages/second
- Ingestion volume: 12TB / day
- Doc scans: 100M to 4B docs/second
- Data size: 1 PB
- Cluster size: 700 Elasticsearch machines
- Ingestion pipeline: 100+ Data Pipeline Jobs
Our Story of Scaling Elasticsearch
Three Dimensions of Scale
Ingestion
Query
Operation
Driving Principles
- Optimize for fast iteration
- Optimize for simple operations
- Optimize for automation and tools
- Optimize for being reasonably fast
The Past: We Started Small
Constraints for Being Small
- Three-person team
- Two data centers
- Small set of requirements: common analytics for machines
First Order of Business: Take Care of the Basics
Get Single-Node Right: Follow the 20-80 Rule
- One table <-> multiple indices by time range
- Disable _source field
- Disable _all field
- Use doc_values for storage
- Disable analyzed field
- Tune JVM parameters
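A rough sketch of what several of these single-node settings could look like as an index definition. The table name, field names, and monthly granularity are assumptions, and the exact mapping syntax depends on the Elasticsearch version (older releases nest mappings under a type name and also accept `"_all": {"enabled": false}`).

```python
from datetime import date


def index_name(table: str, day: date) -> str:
    """One logical table maps to one index per time range (monthly here, by assumption)."""
    return f"{table}-{day:%Y%m}"


# Hypothetical index body: no stored _source, non-analyzed keyword fields,
# and doc_values (columnar, on-disk storage) for the fields used in aggregations.
index_body = {
    "mappings": {
        "_source": {"enabled": False},
        "properties": {
            "hexagon_id": {"type": "keyword", "doc_values": True},
            "product_id": {"type": "keyword", "doc_values": True},
            "ts": {"type": "date", "doc_values": True},
            "trip_count": {"type": "integer", "doc_values": True},
        },
    }
}
```

Disabling `_source` saves space but gives up reindexing and partial updates from the stored document, so it only fits tables that are written once and then aggregated.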
Make Decisions with Numbers
- What’s the maximum number of recovery threads?
- What’s the maximum size of request queue?
- What should the refresh rate be?
- How many shards should an index have?
- What’s the throttling threshold?
- Solution: Set up end-to-end stress testing framework
Deployment in Two Data Centers
- Each data center has exclusive set of cities
- Should tolerate failure of a single data center
- Ingestion should continue to work
- Querying any city should return correct results
Deployment in Two Data Centers: trade space for availability
Discretize Geo Locations: H3
Optimizations to Ingestion
Dealing with Large Volume of Data
- An event source produces more than 3 TB every day
- Key insight: humans do not need overly granular data
- Key insight: stream data usually has lots of redundancy
- Prune unnecessary fields
- Devise algorithms to remove redundancy
- 3 TB -> 42 GB, a reduction of more than 70x
- Bulk write
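The slides do not describe the redundancy-removal algorithms themselves. As one plausible illustration, the sketch below prunes each event to a whitelist of fields and drops events whose state has not changed since the previous event for the same key. The field names (`driver_id`, `status`) are hypothetical.

```python
def compact(events, keep_fields, key_field="driver_id", state_field="status"):
    """Keep only state changes per key, and only the whitelisted fields."""
    last_state = {}
    compacted = []
    for event in events:
        key, state = event[key_field], event[state_field]
        if last_state.get(key) == state:
            continue  # redundant: same state as the previous event for this key
        last_state[key] = state
        compacted.append({f: event[f] for f in keep_fields})
    return compacted


events = [
    {"driver_id": "d1", "status": "open", "ts": 1, "debug": "x"},
    {"driver_id": "d1", "status": "open", "ts": 2, "debug": "y"},
    {"driver_id": "d1", "status": "on_trip", "ts": 3, "debug": "z"},
    {"driver_id": "d2", "status": "open", "ts": 3, "debug": "w"},
]
kept = compact(events, keep_fields=("driver_id", "status", "ts"))
```

The compacted events would then be sent in batches through the bulk API rather than one request per document.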
Data Modeling Matters
Example: Efficient and Reliable Join
- Calculate the Completed/Requested ratio from two different event streams
Example: Efficient and Reliable Join: Use Elasticsearch
- Calculate Completed/Requested ratio from two Kafka topics
- Can we use streaming join?
- Can we join on the query side?
- Solution: rendezvous at Elasticsearch on trip ID
TripID   Pickup Time     Completed
1        2018-02-03T…    TRUE
2        2018-02-3T…     FALSE
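A minimal sketch of the "rendezvous on trip ID" idea: both streams write partial documents to the same ID, and Elasticsearch merges them, so the ratio becomes a single-table aggregation. The `upsert_action` helper emits the two lines of a bulk `update` with `doc_as_upsert`; the index name, field names, and the in-memory `merge_into` stand-in for the cluster are assumptions.

```python
def upsert_action(index, trip_id, fields):
    """Two lines of a bulk request: partial update, created if the doc is missing."""
    return [
        {"update": {"_index": index, "_id": trip_id}},
        {"doc": fields, "doc_as_upsert": True},
    ]


def merge_into(store, actions):
    """In-memory stand-in for the cluster: merge partial docs by ID."""
    for header, body in zip(actions[0::2], actions[1::2]):
        store.setdefault(header["update"]["_id"], {}).update(body["doc"])


actions = []
# Stream 1 (hypothetical "requested" topic) only writes the pickup time.
for trip_id, pickup in [("1", "2018-02-03T10:00:00Z"), ("2", "2018-02-03T10:05:00Z")]:
    actions += upsert_action("trips", trip_id, {"pickup_time": pickup})
# Stream 2 (hypothetical "completed" topic) only writes the completion flag.
actions += upsert_action("trips", "1", {"completed": True})

store = {}
merge_into(store, actions)
ratio = sum(doc.get("completed", False) for doc in store.values()) / len(store)
```

Because each stream touches only its own fields, the result does not depend on which event arrives first.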
Example: aggregation on state transitions
Optimize Querying Elasticsearch
Hide Query Optimization from Users
- Do we really expect every user to write Elasticsearch queries?
- What if someone issues a very expensive query?
- Solution: Isolation with a query layer
Query Layer with Multiple Clusters
- Generate efficient Elasticsearch queries
- Reject expensive queries
- Route queries (hardcoded at first)
Efficient Query Generation
- “GROUP BY a, b”
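The slide does not spell out which translation the query layer uses. One common way to express "GROUP BY a, b" in the Elasticsearch query DSL is a pair of nested `terms` aggregations with the metric at the innermost level, sketched below. The field names, the `ts` range filter, and the bucket `size` are assumptions.

```python
def group_by_aggs(dimensions, metric_name, metric, size=1000):
    """Build nested terms aggregations, with the metric at the innermost level."""
    aggs = {metric_name: metric}
    for dim in reversed(dimensions):
        aggs = {f"by_{dim}": {"terms": {"field": dim, "size": size}, "aggs": aggs}}
    return aggs


query = {
    "size": 0,  # return aggregation buckets only, no hits
    "query": {"bool": {"filter": [{"range": {"ts": {"gte": "now-10m"}}}]}},
    "aggs": group_by_aggs(["a", "b"], "total", {"sum": {"field": "value"}}),
}
```

Placing the WHERE clause under `bool.filter` skips relevance scoring, which analytical queries do not need.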
Rejecting Expensive Queries
- 10,000 hexagons per city x 1,440 minutes per day x 800 cities
- Cardinality: about 11.5 billion (!) buckets -> out-of-memory error
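The arithmetic above can be turned into a simple pre-flight check: multiply the estimated cardinality of each group-by dimension and reject the query if the product exceeds a budget. The `MAX_BUCKETS` value is a hypothetical threshold, not a number from the talk.

```python
import math

MAX_BUCKETS = 1_000_000  # hypothetical budget per query


def estimated_buckets(cardinalities):
    """Worst-case bucket count: the product of the dimension cardinalities."""
    return math.prod(cardinalities)


def check_query(cardinalities, limit=MAX_BUCKETS):
    """Raise before the query reaches the cluster if it could create too many buckets."""
    buckets = estimated_buckets(cardinalities)
    if buckets > limit:
        raise ValueError(f"query may create {buckets:,} buckets (limit {limit:,})")
    return buckets
```

For the example on the slide, `estimated_buckets([10_000, 1440, 800])` is 11,520,000,000, so the check rejects the query.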
Routing Queries
"DEMAND": {
  "CLUSTERS": {
    "TIER0": {
      "CLUSTERS": ["ES_CLUSTER_TIER0"]
    },
    "TIER2": {
      "CLUSTERS": ["ES_CLUSTER_TIER2"]
    }
  },
  "INDEX": "MARKETPLACE_DEMAND-",
  "SUFFIXFORMAT": "YYYYMM.WW",
  "ROUTING": "PRODUCT_ID"
}
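A sketch of how a query layer might resolve such an entry into a target cluster, index name, and routing value. Two things here are assumptions rather than facts from the talk: that `WW` in the suffix format means the ISO week number, and that `ROUTING` names the field whose value is used as the shard routing key.

```python
from datetime import date

ROUTES = {
    "DEMAND": {
        "CLUSTERS": {
            "TIER0": {"CLUSTERS": ["ES_CLUSTER_TIER0"]},
            "TIER2": {"CLUSTERS": ["ES_CLUSTER_TIER2"]},
        },
        "INDEX": "MARKETPLACE_DEMAND-",
        "SUFFIXFORMAT": "YYYYMM.WW",
        "ROUTING": "PRODUCT_ID",
    }
}


def resolve(table, tier, day, row):
    """Pick the clusters, the time-suffixed index, and the routing value for a query."""
    entry = ROUTES[table]
    # Assumption: YYYYMM.WW = year, month, zero-padded ISO week number.
    suffix = f"{day:%Y%m}.{day.isocalendar()[1]:02d}"
    return {
        "clusters": entry["CLUSTERS"][tier]["CLUSTERS"],
        "index": entry["INDEX"] + suffix,
        "routing": row[entry["ROUTING"]],
    }


target = resolve("DEMAND", "TIER0", date(2018, 2, 5), {"PRODUCT_ID": "uberx"})
```

Keeping this mapping in data rather than in code is what later allows routing to change without redeploying the query layer.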
Summary of First Iteration
Evolution: Success Breeds Failures
Unexpected Surges
Applications Went Haywire
Solution: Distributed Rate limiting
Per-Cluster Rate Limit
Per-Instance Rate Limit
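The slides name a per-cluster and a per-instance limit but do not say which algorithm enforces them or how the limits are coordinated across query-layer instances. The sketch below uses a plain token bucket, which is an assumption, and runs in a single process; a distributed version would keep the per-cluster bucket in shared storage.

```python
class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        """Refill for the elapsed time, then spend one token if available."""
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def admit(per_instance, per_cluster, now):
    """A request must pass the caller's own limit and the target cluster's limit.

    The instance bucket is checked first, so a request refused by the cluster
    limit has still spent an instance token; that keeps the sketch short.
    """
    return per_instance.allow(now) and per_cluster.allow(now)
```

Passing `now` explicitly keeps the example deterministic; in a service it would come from a monotonic clock.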
Workload Evolved
- Users query months of data for modeling and complex analytics
- Key insight: Data can be a little stale for long-range queries
- Solution: Caching layer and delayed execution
Time Series Cache
- Redis as the cache store
- Cache key is based on normalized query content and time range
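One way to read "normalized query content and time range" is sketched below: serialize the query with sorted keys so that equivalent queries hash to the same digest, then split the requested range into aligned buckets and produce one key per bucket, so overlapping ranges can reuse cached buckets. The bucket size and key layout are assumptions; the resulting strings would be used as Redis keys.

```python
import hashlib
import json


def cache_keys(query, start, end, bucket_seconds=3600):
    """Return one cache key per aligned time bucket overlapping [start, end)."""
    # Normalize: key order and whitespace must not change the digest.
    normalized = json.dumps(query, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    first_bucket = start - start % bucket_seconds
    return [
        f"{digest}:{bucket_start}:{bucket_seconds}"
        for bucket_start in range(first_bucket, end, bucket_seconds)
    ]


q1 = {"table": "demand", "group_by": ["hexagon_id"], "agg": "sum"}
q2 = {"agg": "sum", "group_by": ["hexagon_id"], "table": "demand"}
keys = cache_keys(q1, start=1800, end=7200)
```

On a lookup, buckets found in the cache are served directly and only the missing buckets are queried from Elasticsearch.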
Delayed Execution
- Allow registering long-running queries
- Provide cached but stale data for such queries
- Dedicated cluster and queued executions
- Rationale: three months of data vs a few hours of staleness
- Example: [-30d, 0d] -> [-30d, -1d]
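The range rewrite in the example can be sketched as clamping the end of a long-range query to a staleness cutoff, so that the result can come from previously computed data. The one-day default mirrors the example; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def stale_window(start, end, now, staleness=timedelta(days=1)):
    """Clamp the end of the range so the query never asks for the freshest data."""
    return start, min(end, now - staleness)


now = datetime(2018, 3, 1)
window = stale_window(now - timedelta(days=30), now, now)
```

A range that already ends before the cutoff is returned unchanged.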
Scale Operations
Driving Principles
- Make the system transparent
- Optimize for MTTR (mean time to recover)
- Strive for consistency
- Automation is the most effective way to get consistency
Challenge: Diagnosis
- Cluster slowed down with all metrics being normal
- Requires additional instrumentation
- ES Plugin as a solution
Challenge: Cluster Size Becomes an Enemy
- Elasticsearch cluster becomes harder to operate as its size increases
- MTTR increases as cluster size increases
- Multi-tenancy becomes a huge issue
- Can’t have too many shards
Federation
- 3 clusters -> many smaller clusters
- Dynamic routing
- Meta-data driven
How Can We Trust the Data?
Self-Serving Trust System
Too Much Manual Maintenance Work
- Adjusting queue size
- Restarting machines
- Relocating shards
Auto Ops
Ongoing Work for the Future
- Strong reliability
- Strong consistency among replicas
- Multi-tenancy
Future Work
Summary
- Three dimensions of scaling: ingestion, query, and operations
- Be simple and practical: successful systems emerge from simple ones
- Abstraction and data modeling matter
- Invest in thorough instrumentation
- Invest in automation and tools

Scaling Uber's Elasticsearch Clusters

  • 1. Scaling Uber’s Elasticsearch as an Geo-Temporal Database Danny Yuan @ Uber
  • 2. InfoQ.com: News & Community Site Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ uber-elasticsearch-clusters • Over 1,000,000 software developers, architects and CTOs read the site world- wide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon London www.qconlondon.com
  • 4. Use Cases for a Geo-Temporal Database
  • 5. Real-time Decisions on Global Scale
  • 6. Dynamic Pricing: Every Hexagon, Every Minute
  • 7. Dynamic Pricing: Every Hexagon, Every Minute
  • 8. Metrics: how many UberXs were in a trip in the past 10 minutes
  • 9. Metrics: how many UberXs were in a trip in the past 10 minutes
  • 12. How Can We Produce Geo-Temporal Data for Ever Changing Business Needs?
  • 13. Key Question: What Is the Right Abstraction?
  • 14. Abstraction: Single-Table OLAP on Geo-Temporal Data
  • 15. Abstraction: Single-Table OLAP on Geo-Temporal Data SELECT <agg functions>, <dimensions> 
 FROM <data_source>
 WHERE <boolean filter>
 GROUP BY <dimensions>
 HAVING <boolean filter>
 ORDER BY <sorting criterial>
 LIMIT <n>

  • 16. Abstraction: Single-Table OLAP on Geo-Temporal Data SELECT <agg functions>, <dimensions> 
 FROM <data_source>
 WHERE <boolean filter>
 GROUP BY <dimensions>
 HAVING <boolean filter>
 ORDER BY <sorting criterial>
 LIMIT <n>

  • 17. Why Elasticsearch? - Arbitrary boolean query - Sub-second response time - Built-in distributed aggregation functions - High-cardinality queries - Idempotent insertion to deduplicate data - Second-level data freshness - Scales with data volume - Operable by small team
  • 18. Current Scale: An Important Context - Ingestion: 850K to 1.3M messages/second - Ingestion volume: 12TB / day - Doc scans: 100M to 4B docs/ second - Data size: 1 PB - Cluster size: 700 ElasticSearch Machines - Ingestion pipeline: 100+ Data Pipeline Jobs
  • 19. Our Story of Scaling Elasticsearch
  • 20. Three Dimensions of Scale Ingestion Query Operation
  • 21. Driving Principles - Optimize for fast iteration - Optimize for simple operations - Optimize for automation and tools - Optimize for being reasonably fast
  • 22. The Past: We Started Small
  • 23. Constraints for Being Small - Three-person team - Two data centers - Small set of requirements: common analytics for machines
  • 24. First Order of Business: Take Care of the Basics
  • 25. Get Single-Node Right: Follow the 20-80 Rule - One table <—> multiple indices by time range - Disable _source field - Disable _all field - Use doc_values for storage - Disable analyzed field - Tune JVM parameters
  • 26. Make Decisions with Numbers - What’s the maximum number of recovery threads? - What’s the maximum size of request queue? - What should the refresh rate be? - How many shards should an index have? - What’s the throttling threshold? - Solution: Set up end-to-end stress testing framework
  • 27. Deployment in Two Data Centers - Each data center has exclusive set of cities - Should tolerate failure of a single data center - Ingestion should continue to work - Querying any city should return correct results
  • 28. Deployment in Two Data Centers: trade space for availability
  • 29. Deployment in Two Data Centers: trade space for availability
  • 30. Deployment in Two Data Centers: trade space for availability
  • 34. Dealing with Large Volume of Data - An event source produces more than 3TB every day - Key insight: human does not need too granular data - Key insight: stream data usually has lots of redundancy
  • 35. - Pruning unnecessary fields - Devise algorithms to remove redundancy - 3TB —> 42 GB, more than 70x of reduction! - Bulk write Dealing with Large Volume of Data
  • 37. Example: Efficient and Reliable Join - Example: Calculate Completed/Requested ratio with two different event streams
  • 38. Example: Efficient and Reliable Join: Use Elasticsearch - Calculate Completed/Requested ratio from two Kafka topics - Can we use streaming join? - Can we join on the query side? - Solution: rendezvous at Elasticsearch on trip ID TripID Pickup Time Completed 1 2018-02-03T… TRUE 2 2018-02-3T… FALSE
  • 39. Example: aggregation on state transitions
  • 41. Hide Query Optimization from Users - Do we really expect every user to write Elasticsearch queries? - What if someone issues a very expensive query? - Solution: Isolation with a query layer
  • 42. Query Layer with Multiple Clusters
  • 44. Query Layer with Multiple Clusters - Generate efficient Elasticsearch queries - Reject expensive queries - Route queries (hardcoded at first)
  • 45. Efficient Query Generation - “GROUP BY a, b”
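The transcript does not show the query that was generated for "GROUP BY a, b". One plausible translation, offered only as an assumption, is a nested `terms` aggregation in the Elasticsearch query DSL, with one nesting level per dimension. The function name and the `size` default below are hypothetical.

```python
def group_by_to_aggs(dimensions, size=1000):
    """Build a nested terms aggregation, outermost dimension first.

    size caps the number of buckets returned per level (assumed default).
    """
    aggs = {}
    # Build from the innermost dimension outward.
    for field in reversed(dimensions):
        level = {"terms": {"field": field, "size": size}}
        if aggs:
            level["aggs"] = aggs
        aggs = {field: level}
    # "size": 0 asks for aggregation results only, without document hits.
    return {"size": 0, "aggs": aggs}


body = group_by_to_aggs(["a", "b"], size=10)
```

A query layer could apply further rewrites on top of this, such as reordering dimensions; none of that is shown here.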
  • 46. Rejecting Expensive Queries - 10,000 hexagons per city x 1,440 minutes per day x 800 cities - Cardinality: about 11.5 billion (!) buckets -> Out Of Memory Error
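The arithmetic on this slide can be checked directly, and the same estimate can serve as a guard in a query layer. This is a minimal sketch; the `MAX_BUCKETS` threshold and the function names are hypothetical, not values from the talk.

```python
from math import prod

MAX_BUCKETS = 1_000_000  # hypothetical limit; a real one would come from stress tests


def estimated_buckets(cardinalities: dict) -> int:
    """Worst-case bucket count: the product of per-dimension cardinalities."""
    return prod(cardinalities.values())


def check_query(cardinalities: dict, limit: int = MAX_BUCKETS) -> int:
    """Return the estimate, or raise if the query would create too many buckets."""
    buckets = estimated_buckets(cardinalities)
    if buckets > limit:
        raise ValueError(f"query rejected: ~{buckets:,} buckets exceeds {limit:,}")
    return buckets


slide_example = {"hexagon": 10_000, "minute_of_day": 1_440, "city": 800}
total = estimated_buckets(slide_example)  # 10,000 * 1,440 * 800
```

The product is an upper bound; real queries usually touch fewer combinations, so a production check might use observed cardinalities instead.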
  • 47. Routing Queries "DEMAND": { "CLUSTERS": { "TIER0": { "CLUSTERS": ["ES_CLUSTER_TIER0"] }, "TIER2": { "CLUSTERS": ["ES_CLUSTER_TIER2"] } }, "INDEX": "MARKETPLACE_DEMAND-", "SUFFIXFORMAT": "YYYYMM.WW", "ROUTING": "PRODUCT_ID" }
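As a sketch of how a query layer might consume a routing entry like the one on these slides, the code below resolves a logical table, a tier, and a date range into target clusters and index names. The slides do not define the `YYYYMM.WW` suffix; reading it as year, month, and ISO week number is an assumption, as are the function names.

```python
from datetime import date, timedelta

# The slide's routing entry, transcribed as a Python dict.
ROUTING_TABLE = {
    "DEMAND": {
        "CLUSTERS": {
            "TIER0": {"CLUSTERS": ["ES_CLUSTER_TIER0"]},
            "TIER2": {"CLUSTERS": ["ES_CLUSTER_TIER2"]},
        },
        "INDEX": "MARKETPLACE_DEMAND-",
        "SUFFIXFORMAT": "YYYYMM.WW",
        "ROUTING": "PRODUCT_ID",
    }
}


def index_suffix(day: date) -> str:
    """Assumed reading of YYYYMM.WW: year, month, then ISO week number."""
    return f"{day:%Y%m}.{day.isocalendar()[1]:02d}"


def resolve(table: str, tier: str, start: date, end: date):
    """Return (clusters, index names) covering [start, end], inclusive."""
    entry = ROUTING_TABLE[table]
    clusters = entry["CLUSTERS"][tier]["CLUSTERS"]
    indices = []
    day = start
    while day <= end:
        name = entry["INDEX"] + index_suffix(day)
        if name not in indices:  # keep first-seen order, skip repeats
            indices.append(name)
        day += timedelta(days=1)
    return clusters, indices


clusters, indices = resolve("DEMAND", "TIER0", date(2018, 2, 5), date(2018, 2, 12))
```

The `ROUTING` key presumably names the field used as the Elasticsearch routing value, so that documents for one product land on the same shard; the sketch leaves it unused.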
  • 51. Summary of First Iteration
  • 56. Solution: Distributed Rate Limiting - Per-cluster rate limit
  • 57. Solution: Distributed Rate Limiting - Per-instance rate limit
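The slides name the two levels of rate limiting but not the mechanism. As one possible sketch, assuming a token bucket per query-layer instance and an even split of the per-cluster budget across instances (both assumptions), the limiter could look like this:

```python
import time


class TokenBucket:
    """Token-bucket limiter; the clock is injectable so it can be tested."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then try to spend one token."""
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def per_instance_rate(cluster_rate, instance_count):
    """Split a per-cluster budget evenly across query-layer instances."""
    return cluster_rate / instance_count


# Hypothetical numbers: 100 queries/s for the cluster, 50 instances.
fake_now = [0.0]
bucket = TokenBucket(per_instance_rate(100, 50), burst=2, clock=lambda: fake_now[0])
first_three = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 0.5  # half a second later, one token has been refilled
after_wait = [bucket.allow(), bucket.allow()]
```

An even split ignores skewed traffic between instances; a shared counter in an external store would be another option, at the cost of an extra network hop per query.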
  • 58. Workload Evolved - Users query months of data for modeling and complex analytics - Key insight: Data can be a little stale for long-range queries - Solution: Caching layer and delayed execution
  • 60. Time Series Cache - Redis as the cache store - Cache key is based on normalized query content and time range
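A minimal sketch of a cache key built from normalized query content and time range follows. The hourly bucket size, the key layout, and the use of a plain dict in place of Redis are all assumptions made for illustration.

```python
import hashlib
import json

BUCKET_SECONDS = 3600  # hypothetical cache granularity: one entry per hour


def normalize(query: dict) -> str:
    """Serialize a query so that key order and whitespace do not matter."""
    return json.dumps(query, sort_keys=True, separators=(",", ":"))


def cache_keys(query: dict, start: int, end: int):
    """Return one key per time bucket overlapping [start, end), epoch seconds."""
    digest = hashlib.sha256(normalize(query).encode("utf-8")).hexdigest()[:16]
    first = start - start % BUCKET_SECONDS  # align to the bucket boundary
    return [f"tsc:{digest}:{bucket}" for bucket in range(first, end, BUCKET_SECONDS)]


def split_hits_and_misses(cache: dict, keys):
    """Separate buckets already cached from those that must be queried."""
    hits = {k: cache[k] for k in keys if k in cache}
    misses = [k for k in keys if k not in cache]
    return hits, misses


q1 = {"metric": "count", "group_by": ["hexagon"]}
q2 = {"group_by": ["hexagon"], "metric": "count"}  # same query, different key order
keys = cache_keys(q1, 1800, 3 * 3600)
cache = {keys[0]: [1, 2, 3]}  # a dict standing in for Redis
hits, misses = split_hits_and_misses(cache, keys)
```

Keying per time bucket means a query over a long range only needs to hit Elasticsearch for the buckets that are missing, typically the most recent ones.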
  • 66. Delayed Execution - Allow registering long-running queries - Provide cached but stale data for such queries - Dedicated cluster and queued executions - Rationale: three months of data vs a few hours of staleness - Example: [-30d, 0d] -> [-30d, -1d]
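One reading of the example on this slide is that a long-range query's end bound is pulled back so that it never asks for the freshest data. A minimal sketch of that rewrite, with a hypothetical function name and a one-day staleness allowance taken from the example:

```python
from datetime import timedelta


def stale_tolerant_range(start_offset, end_offset, max_staleness=timedelta(days=1)):
    """Clamp the end of a relative range to at most now - max_staleness.

    Offsets are relative to now, so -30 days is timedelta(days=-30).
    """
    clamped_end = min(end_offset, -max_staleness)
    if clamped_end <= start_offset:
        raise ValueError("range is too short to tolerate the staleness window")
    return start_offset, clamped_end


rewritten = stale_tolerant_range(timedelta(days=-30), timedelta(0))
```

Short ranges are rejected rather than rewritten here, on the assumption that they should go to the real-time path instead.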
  • 68. Driving Principles - Make the system transparent - Optimize for MTTR (mean time to recover) - Strive for consistency - Automation is the most effective way to get consistency
  • 69. Challenge: Diagnosis - Cluster slowed down with all metrics being normal - Requires additional instrumentation - ES plugin as a solution
  • 70. Challenge: Cluster Size Becomes an Enemy - Elasticsearch cluster becomes harder to operate as its size increases - MTTR increases as cluster size increases - Multi-tenancy becomes a huge issue - Can’t have too many shards
  • 71. Federation - 3 clusters -> many smaller clusters - Dynamic routing - Meta-data driven
  • 78. How Can We Trust the Data?
  • 83. Too Much Manual Maintenance Work
  • 84. Too Much Manual Maintenance Work - Adjusting queue sizes - Restarting machines - Relocating shards
  • 87. Ongoing Work for the Future
  • 88. Future Work - Strong reliability - Strong consistency among replicas - Multi-tenancy
  • 89. Summary - Three dimensions of scaling: ingestion, query, and operations - Be simple and practical: successful systems emerge from simple ones - Abstraction and data modeling matter - Invest in thorough instrumentation - Invest in automation and tools