SlideShare une entreprise Scribd logo
1  sur  12
Michael Stoppelman, VP of Engineering 
@stopman 
SCALE: Traffic 
October 23, 2014
In the beginning of time… 2005 - 2007
Traffic History at Yelp 
• 2005 - 2007 
– ~200k searches/day 
– ~850k reviews total 
– www.yelp.com 
– LAMP (P - Python) 
– Pushes were daily 
– Master-Slave MySQL setup, 1 main database, 
mix of InnoDB and MyISAM tables 
– Turned on gzip on Apache 
– Squid for image proxying
As we grew so did our infrastructure… 2008 - 2009
Traffic History at Yelp 
• 2008 - 2009 
– Search scaling 
• sharded by geography 
• index distribution: rsync w/ fadvise 
– Log aggregation: 
• Syslog + rsync + s3 
• scribe + s3 
– Gearman used for async queue processing 
– MySQL split vertically into 3 databases 
– Dirty session cookie 
– Mobile apps: iPhone, Android, Blackberry, 
etc… 
– 4 countries
Scale scale scale… 2010 - 2011
Infrastructure History at Yelp 
• 2010 - 2011 
– Introduced “read only” mode for the 
site 
– First CDN put into use 
– Photos migrated to s3 
– mrjob is built/open sourced 
– AWS EMR is used for mrjob 
processing 
– Managed DNS - DynDNS 
– 13 countries
IPO 2012, more scale!
Infrastructure History at Yelp 
• 2012 - 2013 
– Introduced “read only” datacenters 
• Cacheserv introduced 
– Load balance traffic between 
datacenters 
– Elasticsearch 
– Pre-IPO traffic, we added the ability to 
quickly reduce load 
– Direct connection with AWS 
– All schema changes are done, online 
– Moved to all FusionIO for DB hosts 
– 24 countries
Biggest year yet… 2014 - present
Traffic Infrastructure - current picture 
• 2014 - current 
– abusive scraping 
– Starting to serve traffic from EC2 (for 
Asia/Europe) 
– Elasticsearch, Logstash, and Kibana 
– Gearman w/ MySQL 
– Kafka == Scribe 
– Pyleus (doing work in real-time) 
– 29 countries
Pyleus: A Python Framework for Storm Topologies 
● Pyleus: Yelp’s super new Python Storm bindings 
● Build topologies in Python 
● Declaratively describe structure in YAML 
● Respects requirements.txt 
● Compose a topology from Python packaged components!

Contenu connexe

Tendances

Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Chris Fregly
 
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark StreamingBuilding Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark StreamingJen Aman
 
SQL Server on Google Cloud Platform
SQL Server on Google Cloud PlatformSQL Server on Google Cloud Platform
SQL Server on Google Cloud PlatformLynn Langit
 
BDT303 Data Science with Elastic MapReduce - AWS re: Invent 2012
BDT303 Data Science with Elastic MapReduce - AWS re: Invent 2012BDT303 Data Science with Elastic MapReduce - AWS re: Invent 2012
BDT303 Data Science with Elastic MapReduce - AWS re: Invent 2012Amazon Web Services
 
An overview of Amazon Athena
An overview of Amazon AthenaAn overview of Amazon Athena
An overview of Amazon AthenaJulien SIMON
 
AWS Batch: Simplifying batch computing in the cloud
AWS Batch: Simplifying batch computing in the cloudAWS Batch: Simplifying batch computing in the cloud
AWS Batch: Simplifying batch computing in the cloudAdrian Hornsby
 
Introduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis AnalyticsIntroduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis AnalyticsAmazon Web Services
 
Data Analysis on AWS
Data Analysis on AWSData Analysis on AWS
Data Analysis on AWSPaolo latella
 
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWSAmazon Web Services
 
Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203...
Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203...Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203...
Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203...Amazon Web Services
 
BigData: AWS RedShift with S3, EC2
BigData: AWS RedShift with S3, EC2BigData: AWS RedShift with S3, EC2
BigData: AWS RedShift with S3, EC2Paulraj Pappaiah
 
Bleeding Edge Databases
Bleeding Edge DatabasesBleeding Edge Databases
Bleeding Edge DatabasesLynn Langit
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Batch Processing with Containers on AWS - June 2017 AWS Online Tech Talks
Batch Processing with Containers on AWS -  June 2017 AWS Online Tech TalksBatch Processing with Containers on AWS -  June 2017 AWS Online Tech Talks
Batch Processing with Containers on AWS - June 2017 AWS Online Tech TalksAmazon Web Services
 
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesGetting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesAmazon Web Services
 
Amazon EMR Facebook Presto Meetup
Amazon EMR Facebook Presto MeetupAmazon EMR Facebook Presto Meetup
Amazon EMR Facebook Presto Meetupstevemcpherson
 
AWS for the Data Professional
AWS for the Data ProfessionalAWS for the Data Professional
AWS for the Data ProfessionalLynn Langit
 

Tendances (20)

Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
Kinesis and Spark Streaming - Advanced AWS Meetup - August 2014
 
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark StreamingBuilding Realtime Data Pipelines with Kafka Connect and Spark Streaming
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
SQL Server on Google Cloud Platform
SQL Server on Google Cloud PlatformSQL Server on Google Cloud Platform
SQL Server on Google Cloud Platform
 
BDT303 Data Science with Elastic MapReduce - AWS re: Invent 2012
BDT303 Data Science with Elastic MapReduce - AWS re: Invent 2012BDT303 Data Science with Elastic MapReduce - AWS re: Invent 2012
BDT303 Data Science with Elastic MapReduce - AWS re: Invent 2012
 
An overview of Amazon Athena
An overview of Amazon AthenaAn overview of Amazon Athena
An overview of Amazon Athena
 
AWS Batch: Simplifying batch computing in the cloud
AWS Batch: Simplifying batch computing in the cloudAWS Batch: Simplifying batch computing in the cloud
AWS Batch: Simplifying batch computing in the cloud
 
Introduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis AnalyticsIntroduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis Analytics
 
Data Analysis on AWS
Data Analysis on AWSData Analysis on AWS
Data Analysis on AWS
 
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
 
Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203...
Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203...Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203...
Build Your Web Analytics with node.js, Amazon DynamoDB and Amazon EMR (BDT203...
 
BigData: AWS RedShift with S3, EC2
BigData: AWS RedShift with S3, EC2BigData: AWS RedShift with S3, EC2
BigData: AWS RedShift with S3, EC2
 
Bleeding Edge Databases
Bleeding Edge DatabasesBleeding Edge Databases
Bleeding Edge Databases
 
DynamodbDB Deep Dive
DynamodbDB Deep DiveDynamodbDB Deep Dive
DynamodbDB Deep Dive
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Batch Processing with Containers on AWS - June 2017 AWS Online Tech Talks
Batch Processing with Containers on AWS -  June 2017 AWS Online Tech TalksBatch Processing with Containers on AWS -  June 2017 AWS Online Tech Talks
Batch Processing with Containers on AWS - June 2017 AWS Online Tech Talks
 
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar SeriesGetting Started with Amazon Redshift - AWS July 2016 Webinar Series
Getting Started with Amazon Redshift - AWS July 2016 Webinar Series
 
Amazon EMR Facebook Presto Meetup
Amazon EMR Facebook Presto MeetupAmazon EMR Facebook Presto Meetup
Amazon EMR Facebook Presto Meetup
 
AWS for the Data Professional
AWS for the Data ProfessionalAWS for the Data Professional
AWS for the Data Professional
 
AWS_Data_Pipeline
AWS_Data_PipelineAWS_Data_Pipeline
AWS_Data_Pipeline
 

En vedette

DockerCon SF 2015: Faster, Cheaper, Safer
DockerCon SF 2015: Faster, Cheaper, SaferDockerCon SF 2015: Faster, Cheaper, Safer
DockerCon SF 2015: Faster, Cheaper, SaferDocker, Inc.
 
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen..."Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...Yelp Engineering
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service DiscoveryJohn Billings
 
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E..."Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...Yelp Engineering
 
Yelp Academic Dataset
Yelp Academic DatasetYelp Academic Dataset
Yelp Academic DatasetMandaniKeyur
 
Building a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from YelpBuilding a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from YelpdotCloud
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsSteven Francia
 

En vedette (10)

DockerCon SF 2015: Faster, Cheaper, Safer
DockerCon SF 2015: Faster, Cheaper, SaferDockerCon SF 2015: Faster, Cheaper, Safer
DockerCon SF 2015: Faster, Cheaper, Safer
 
MySQL At Yelp
MySQL At YelpMySQL At Yelp
MySQL At Yelp
 
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen..."Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
"Using ElasticSearch to Scale Near Real-Time Search" by John Billings (Presen...
 
How Yelp does Service Discovery
How Yelp does Service DiscoveryHow Yelp does Service Discovery
How Yelp does Service Discovery
 
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E..."Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
"Optimal Learning for Fun and Profit" by Scott Clark (Presented at The Yelp E...
 
Giving Design Critique
Giving Design CritiqueGiving Design Critique
Giving Design Critique
 
Yelp Academic Dataset
Yelp Academic DatasetYelp Academic Dataset
Yelp Academic Dataset
 
Humans by the hundred
Humans by the hundredHumans by the hundred
Humans by the hundred
 
Building a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from YelpBuilding a smarter application Stack by Tomas Doran from Yelp
Building a smarter application Stack by Tomas Doran from Yelp
 
Hybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS ApplicationsHybrid MongoDB and RDBMS Applications
Hybrid MongoDB and RDBMS Applications
 

Similaire à Michael Stoppelman Shares Yelp's Infrastructure History and Current Traffic Scaling Practices

FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...Mohamed Sayed
 
16 months @ SoundCloud
16 months @ SoundCloud16 months @ SoundCloud
16 months @ SoundCloudTobias Schmidt
 
Application design for the cloud using AWS
Application design for the cloud using AWSApplication design for the cloud using AWS
Application design for the cloud using AWSJonathan Holloway
 
AWS re:Invent 2013 Recap
AWS re:Invent 2013 RecapAWS re:Invent 2013 Recap
AWS re:Invent 2013 RecapBarry Jones
 
AWS for Start-ups - Case Study - PeoplePerHour
AWS for Start-ups - Case Study - PeoplePerHour AWS for Start-ups - Case Study - PeoplePerHour
AWS for Start-ups - Case Study - PeoplePerHour Amazon Web Services
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09Chris Purrington
 
Netflix web-adrian-qcon
Netflix web-adrian-qconNetflix web-adrian-qcon
Netflix web-adrian-qconYiwei Ma
 
STG316_Optimizing Storage for Big Data Workloads
STG316_Optimizing Storage for Big Data WorkloadsSTG316_Optimizing Storage for Big Data Workloads
STG316_Optimizing Storage for Big Data WorkloadsAmazon Web Services
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLSingleStore
 
Data warehouse solutions
Data warehouse solutionsData warehouse solutions
Data warehouse solutionsTu Pham
 
OSOM Operations in the Cloud
OSOM Operations in the CloudOSOM Operations in the Cloud
OSOM Operations in the Cloudmstuparu
 
OSOM - Operations in the Cloud
OSOM - Operations in the CloudOSOM - Operations in the Cloud
OSOM - Operations in the CloudMarcela Oniga
 
Building a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopBuilding a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopHadoop User Group
 
Netflix in the Cloud at SV Forum
Netflix in the Cloud at SV ForumNetflix in the Cloud at SV Forum
Netflix in the Cloud at SV ForumAdrian Cockcroft
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
Ceate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureCeate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureAmazon Web Services
 

Similaire à Michael Stoppelman Shares Yelp's Infrastructure History and Current Traffic Scaling Practices (20)

Architecture at PBS
Architecture at PBSArchitecture at PBS
Architecture at PBS
 
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
 
16 months @ SoundCloud
16 months @ SoundCloud16 months @ SoundCloud
16 months @ SoundCloud
 
Application design for the cloud using AWS
Application design for the cloud using AWSApplication design for the cloud using AWS
Application design for the cloud using AWS
 
AWS re:Invent 2013 Recap
AWS re:Invent 2013 RecapAWS re:Invent 2013 Recap
AWS re:Invent 2013 Recap
 
AWS for Start-ups - Case Study - PeoplePerHour
AWS for Start-ups - Case Study - PeoplePerHour AWS for Start-ups - Case Study - PeoplePerHour
AWS for Start-ups - Case Study - PeoplePerHour
 
AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09AWS (Hadoop) Meetup 30.04.09
AWS (Hadoop) Meetup 30.04.09
 
Netflix web-adrian-qcon
Netflix web-adrian-qconNetflix web-adrian-qcon
Netflix web-adrian-qcon
 
STG316_Optimizing Storage for Big Data Workloads
STG316_Optimizing Storage for Big Data WorkloadsSTG316_Optimizing Storage for Big Data Workloads
STG316_Optimizing Storage for Big Data Workloads
 
Global Netflix Platform
Global Netflix PlatformGlobal Netflix Platform
Global Netflix Platform
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
 
Data warehouse solutions
Data warehouse solutionsData warehouse solutions
Data warehouse solutions
 
OSOM Operations in the Cloud
OSOM Operations in the CloudOSOM Operations in the Cloud
OSOM Operations in the Cloud
 
OSOM - Operations in the Cloud
OSOM - Operations in the CloudOSOM - Operations in the Cloud
OSOM - Operations in the Cloud
 
Common crawlpresentation
Common crawlpresentationCommon crawlpresentation
Common crawlpresentation
 
Building a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with HadoopBuilding a Scalable Web Crawler with Hadoop
Building a Scalable Web Crawler with Hadoop
 
Netflix in the Cloud at SV Forum
Netflix in the Cloud at SV ForumNetflix in the Cloud at SV Forum
Netflix in the Cloud at SV Forum
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
Ceate a Scalable Cloud Architecture
Ceate a Scalable Cloud ArchitectureCeate a Scalable Cloud Architecture
Ceate a Scalable Cloud Architecture
 
[Jun AWS 201] Technical Workshop
[Jun AWS 201] Technical Workshop[Jun AWS 201] Technical Workshop
[Jun AWS 201] Technical Workshop
 

Plus de Yelp Engineering

Teeing Up Python - Code Golf
Teeing Up Python - Code GolfTeeing Up Python - Code Golf
Teeing Up Python - Code GolfYelp Engineering
 
Building a World Class Security Team
Building a World Class Security TeamBuilding a World Class Security Team
Building a World Class Security TeamYelp Engineering
 
Microservices Summit - The Human Side of Services
Microservices Summit - The Human Side of ServicesMicroservices Summit - The Human Side of Services
Microservices Summit - The Human Side of ServicesYelp Engineering
 
Humans by the hundred (DevOps Days Ohio)
Humans by the hundred (DevOps Days Ohio)Humans by the hundred (DevOps Days Ohio)
Humans by the hundred (DevOps Days Ohio)Yelp Engineering
 
Yelp Tech Talks: Mobile Testing 1, 2, 3
Yelp Tech Talks: Mobile Testing 1, 2, 3Yelp Tech Talks: Mobile Testing 1, 2, 3
Yelp Tech Talks: Mobile Testing 1, 2, 3Yelp Engineering
 
Ensuring Consistency in a Replicated World
Ensuring Consistency in a Replicated WorldEnsuring Consistency in a Replicated World
Ensuring Consistency in a Replicated WorldYelp Engineering
 
A Beginners Guide To Launching Yelp In Hong Kong
A Beginners Guide To Launching Yelp In Hong KongA Beginners Guide To Launching Yelp In Hong Kong
A Beginners Guide To Launching Yelp In Hong KongYelp Engineering
 
Optimal Learning for Fun and Profit with MOE
Optimal Learning for Fun and Profit with MOEOptimal Learning for Fun and Profit with MOE
Optimal Learning for Fun and Profit with MOEYelp Engineering
 

Plus de Yelp Engineering (11)

Human Ops
Human OpsHuman Ops
Human Ops
 
Teeing Up Python - Code Golf
Teeing Up Python - Code GolfTeeing Up Python - Code Golf
Teeing Up Python - Code Golf
 
Fluxx Streaming
Fluxx StreamingFluxx Streaming
Fluxx Streaming
 
Building a World Class Security Team
Building a World Class Security TeamBuilding a World Class Security Team
Building a World Class Security Team
 
Microservices Summit - The Human Side of Services
Microservices Summit - The Human Side of ServicesMicroservices Summit - The Human Side of Services
Microservices Summit - The Human Side of Services
 
Humans by the hundred (DevOps Days Ohio)
Humans by the hundred (DevOps Days Ohio)Humans by the hundred (DevOps Days Ohio)
Humans by the hundred (DevOps Days Ohio)
 
Yelp Tech Talks: Mobile Testing 1, 2, 3
Yelp Tech Talks: Mobile Testing 1, 2, 3Yelp Tech Talks: Mobile Testing 1, 2, 3
Yelp Tech Talks: Mobile Testing 1, 2, 3
 
Ensuring Consistency in a Replicated World
Ensuring Consistency in a Replicated WorldEnsuring Consistency in a Replicated World
Ensuring Consistency in a Replicated World
 
A Beginners Guide To Launching Yelp In Hong Kong
A Beginners Guide To Launching Yelp In Hong KongA Beginners Guide To Launching Yelp In Hong Kong
A Beginners Guide To Launching Yelp In Hong Kong
 
Own Your Career
Own Your CareerOwn Your Career
Own Your Career
 
Optimal Learning for Fun and Profit with MOE
Optimal Learning for Fun and Profit with MOEOptimal Learning for Fun and Profit with MOE
Optimal Learning for Fun and Profit with MOE
 

Dernier

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Dernier (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Michael Stoppelman Shares Yelp's Infrastructure History and Current Traffic Scaling Practices

  • 1. Michael Stoppelman, VP of Engineering @stopman SCALE: Traffic October 23, 2014
  • 2. In the beginning of time… 2005 - 2007
  • 3. Traffic History at Yelp • 2005 - 2007 – ~200k searches/day – ~850k reviews total – www.yelp.com – LAMP (P - Python) – Pushes were daily – Master-Slave MySQL setup, 1 main database, mix of InnoDB and MyISAM tables – Turned on gzip on Apache – Squid for image proxying
  • 4. As we grew so did our infrastructure… 2008 - 2009
  • 5. Traffic History at Yelp • 2008 - 2009 – Search scaling • sharded by geography • index distribution: rsync w/ fadvise – Log aggregation: • Syslog + rsync + s3 • scribe + s3 – Gearman used for async queue processing – MySQL split vertically into 3 databases – Dirty session cookie – Mobile apps: iPhone, Android, Blackberry, etc… – 4 countries
  • 6. Scale scale scale… 2010 - 2011
  • 7. Infrastructure History at Yelp • 2010 - 2011 – Introduced “read only” mode for the site – First CDN put into use – Photos migrated to s3 – mrjob is built/open sourced – AWS EMR is used for mrjob processing – Managed DNS - DynDNS – 13 countries
  • 8. IPO 2012, more scale!
  • 9. Infrastructure History at Yelp • 2012 - 2013 – Introduced “read only” datacenters • Cacheserv introduced – Load balance traffic between datacenters – Elasticsearch – Pre-IPO traffic, we added the ability to quickly reduce load – Direct connection with AWS – All schema changes are done, online – Moved to all FusionIO for DB hosts – 24 countries
  • 10. Biggest year yet… 2014 - present
  • 11. Traffic Infrastructure - current picture • 2014 - current – abusive scraping – Starting to serve traffic from EC2 (for Asia/Europe) – Elasticsearch, Logstash, and Kibana – Gearman w/ MySQL – Kafka == Scribe – Pyleus (doing work in real-time) – 29 countries
  • 12. Pyleus: A Python Framework for Storm Topologies ● Pyleus: Yelp’s super new Python Storm bindings ● Build topologies in Python ● Declaratively describe structure in YAML ● Respects requirements.txt ● Compose a topology from Python packaged components!