SlideShare une entreprise Scribd logo
1  sur  35
Scaling Marketplaces at Thumbtack
QCon SF 2017
Nate Kupp
Technical Infrastructure
Data Eng, Experimentation, Platform
Infrastructure, Security, Dev Tools
InfoQ.com: News & Community Site
• Over 1,000,000 software developers, architects and CTOs read the site world-
wide every month
• 250,000 senior developers subscribe to our weekly newsletter
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• 2 dedicated podcast channels: The InfoQ Podcast, with a focus on
Architecture and The Engineering Culture Podcast, with a focus on building
• 96 deep dives on innovative topics packed as downloadable emags and
minibooks
• Over 40 new content items per week
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
thumbtack-scalability
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Presented at QCon San Francisco
www.qconsf.com
Infrastructure from early beginnings
“You see that?”
my dad would say,
“That won’t last more than a
few years.
They’re going to have to rip the
whole thing out within a decade”.
You have choices. A lot of them.
There’s no right
answer. Sorry.
ME: HARDWARE SOFTWARE
Semiconductor manufacturing research
Apple iOS battery life
Data Infrastructure @ Thumbtack
Technical Infrastructure @ Thumbtack
Data, core platform, A/B testing & dev tools
Yup, those are working transistors
Growth Hacking with Team Philippines
Customer
Pro A
Pro B
Pro C
I need a
plumber!
Is this a spam
request?
SPAM DETECTION
Can we automate
quoting?
“QUOTING SERVICE”
Which pros do we
invite to quote?
MATCHING
Hacky (especially in the early days) is fine, even
necessary
Hacky is going to happen. With or without you.
“It has long been an axiom of mine
that the little things are infinitely
the most important.”
- Sherlock Holmes
“Small data”
CSV files Upload Web UI
Analysts
“Big data”
Hive/Impala
Change doesn’t happen in a vacuum.
Give people a way to get things done while
you build or migrate.
Late 2015
Starting to hit
“real” scale
Growth @ Thumbtack
2009
Thumbtack
founded
2014
Raise
$130mm
Dec. 2014
I join
Thumbtack
Now
Continued
growth
Mid-2015
Early data infrastructure
2009 - 2014
The monolith
2015
Finally on AWS! Started
using DynamoDB, Golang
microservices
The Evolution of Thumbtack’s Infrastructure
Go servicesWebsite Website
Sqoop ETL
HDFS
Event Logging
Today
Mixed clouds,
fully-containerized
infrastructure, terraform
The Evolution of Thumbtack’s Infrastructure
Production Serving Infrastructure Data Infrastructure
Application Layer
(Docker on ECS)
PHP, Go, Scala
Storage Layer
PostgreSQL, DynamoDB,
Elasticsearch
Processing
Scala/Spark on Dataproc
Storage
GCS
SQL
BigQuery
pgbouncer
Master + Hot
standbys
Production Serving Infrastructure
ECS
Go services
services
Website
Latency
Cluster
Throughput
Cluster
...
Sqoop
Cluster
WAL-E Backups
EBS + ZFS
Sqoop ETL
Application
Layer Offline Data Processing
DynamoDB ETL
Indexing Pipelines
Event Logging
PG Logging
Storage
Layer
Application Layer
Storage
Cloud
Storage
Ingest
Cloud
Pub/Sub
Analytics
BigQuery
Data Infrastructure
Spark ETL
Cloud Dataproc
Airflow
Thrift/JSON
events
Search
Indexing
Pipelines
Pipeline
Orchestration
Full table
Sqoop ETL
Batch ETL
Cloud Dataproc
Job-scoped clusters on
Dataproc 1.1 / Spark 2.0.2,
Scala
Data Retention
Data retained forever
on GCS / BigQueryCustom DynamoDB ETL
Internal DynamoDB ETL
pipelines, scales up/down
read capacity
Sqoop
Cloud Dataproc
Pipelines
ML Pipelines
And it was great!
- Costs comparable to monolithic cluster
- Creating and destroying > 600 clusters per day
- > 12,000 BigQuery queries per workday
But, getting there took some real work
HDFS
Storage
Cloud
Storage
Analytics
BigQuery
Code Changes
- Add retries everywhere (many 500/503 error codes)
- Rewrite HDFS utilities
- Change all Parquet to Avro
- Educate team (100s of users) on new SQL dialect
and web interface
And, still needed to build interfaces to managed services
Ingest
Cloud
Pub/Sub
Go services
services
Website
Analytics
BigQuery
STREAMING PIPELINEDATA
INGEST
Spark Streaming
Cloud Dataproc
Spark Streaming
Wrote custom Spark receiver for Cloud
Pub/Sub, checkpointing WAL to local
HDFS, no backpressure (yet)
End-to-end Durability
Built internal solutions; achieving end-to-end
durability guarantees here is hard
BigQuery Streaming APIs
Streaming writes of O(100s of millions)
of events / day into BigQuery tables
+ Uptime: GCP uptime has actually been great
- Unexpected BigQuery API Changes: Google occasionally makes production changes
which break us. When these happen, we can’t really do anything until fixed by Google or a
workaround is identified.
“Here be dragons”
April 2017
BigQuery Avro Ingest API Changes
Previously, a field marked as required by the Avro
schema could be loaded into a table with the field
marked nullable; this started failing.
October 2017
BigQuery Sharded Export Changes
Noticed many hung Dataproc clusters. Team identified
workaround to disable BQ sharded export by setting
mapred.bq.input.sharded.export.enable to false.
Managed services make some things easier…
But only some things.
Go in with eyes wide open.
Our first managed service: DynamoDB
Go services
Website
DynamoDB table proliferation
GCP Migration & BigQuery
Why? A few reasons...
1. Fully-managed, serverless, petabyte-scale data
warehouse
2. Nested/complex data types
3. Security & access controls
4. Self-serve interface to Google Sheets
Surprise dependencies
Exception: BigQuery job failed. Final error was:
{
'reason': 'invalid',
'message':"Could not parse '.' as int for field category_id
(position 0) starting at location 340 ",
'location': '/gdrive/home/Category taxonomy - master file'
}
When you make it really easy to deploy infrastructure...
...you get A LOT of random infrastructure
What does that mean for Thumbtack & Serverless?
Hacky is going to happen. With or without you.
Empower the team while limiting the wild west.
QUESTIONS?
Watch the video with slide
synchronization on InfoQ.com!
https://www.infoq.com/presentations/
thumbtack-scalability

Contenu connexe

Plus de C4Media

Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsC4Media
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechC4Media
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/awaitC4Media
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaC4Media
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayC4Media
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?C4Media
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseC4Media
 
A Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinA Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinC4Media
 

Plus de C4Media (20)

Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in Adtech
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/await
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven Utopia
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?
 
CockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL DatabaseCockroachDB: Architecture of a Geo-Distributed SQL Database
CockroachDB: Architecture of a Geo-Distributed SQL Database
 
A Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinA Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with Brooklin
 

Dernier

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Dernier (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Scaling Marketplaces at Thumbtack

  • 1. Scaling Marketplaces at Thumbtack QCon SF 2017 Nate Kupp Technical Infrastructure Data Eng, Experimentation, Platform Infrastructure, Security, Dev Tools
  • 2. InfoQ.com: News & Community Site • Over 1,000,000 software developers, architects and CTOs read the site world- wide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ thumbtack-scalability
  • 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  • 5. “You see that?” my dad would say, “That won’t last more than a few years. They’re going to have to rip the whole thing out within a decade”.
  • 6. You have choices. A lot of them.
  • 8. ME: HARDWARE SOFTWARE Semiconductor manufacturing research Apple iOS battery life Data Infrastructure @ Thumbtack Technical Infrastructure @ Thumbtack Data, core platform, A/B testing & dev tools Yup, those are working transistors
  • 9.
  • 10.
  • 11. Growth Hacking with Team Philippines Customer Pro A Pro B Pro C I need a plumber! Is this a spam request? SPAM DETECTION Can we automate quoting? “QUOTING SERVICE” Which pros do we invite to quote? MATCHING
  • 12. Hacky (especially in the early days) is fine, even necessary
  • 13. Hacky is going to happen. With or without you.
  • 14. “It has long been an axiom of mine that the little things are infinitely the most important.” - Sherlock Holmes “Small data” CSV files Upload Web UI Analysts “Big data” Hive/Impala
  • 15. Change doesn’t happen in a vacuum. Give people a way to get things done while you build or migrate.
  • 16. Late 2015 Starting to hit “real” scale Growth @ Thumbtack 2009 Thumbtack founded 2014 Raise $130mm Dec. 2014 I join Thumbtack Now Continued growth
  • 17. Mid-2015 Early data infrastructure 2009 - 2014 The monolith 2015 Finally on AWS! Started using DynamoDB, Golang microservices The Evolution of Thumbtack’s Infrastructure Go servicesWebsite Website Sqoop ETL HDFS Event Logging
  • 18. Today Mixed clouds, fully-containerized infrastructure, terraform The Evolution of Thumbtack’s Infrastructure Production Serving Infrastructure Data Infrastructure Application Layer (Docker on ECS) PHP, Go, Scala Storage Layer PostgreSQL, DynamoDB, Elasticsearch Processing Scala/Spark on Dataproc Storage GCS SQL BigQuery
  • 19. pgbouncer Master + Hot standbys Production Serving Infrastructure ECS Go services services Website Latency Cluster Throughput Cluster ... Sqoop Cluster WAL-E Backups EBS + ZFS Sqoop ETL Application Layer Offline Data Processing DynamoDB ETL Indexing Pipelines Event Logging PG Logging Storage Layer
  • 20. Application Layer Storage Cloud Storage Ingest Cloud Pub/Sub Analytics BigQuery Data Infrastructure Spark ETL Cloud Dataproc Airflow Thrift/JSON events Search Indexing Pipelines Pipeline Orchestration Full table Sqoop ETL Batch ETL Cloud Dataproc Job-scoped clusters on Dataproc 1.1 / Spark 2.0.2, Scala Data Retention Data retained forever on GCS / BigQueryCustom DynamoDB ETL Internal DynamoDB ETL pipelines, scales up/down read capacity Sqoop Cloud Dataproc Pipelines ML Pipelines
  • 21. And it was great! - Costs comparable to monolithic cluster - Creating and destroying > 600 clusters per day - > 12,000 BigQuery queries per workday
  • 22. But, getting there took some real work HDFS Storage Cloud Storage Analytics BigQuery Code Changes - Add retries everywhere (many 500/503 error codes) - Rewrite HDFS utilities - Change all Parquet to Avro - Educate team (100s of users) on new SQL dialect and web interface
  • 23. And, still needed to build interfaces to managed services Ingest Cloud Pub/Sub Go services services Website Analytics BigQuery STREAMING PIPELINEDATA INGEST Spark Streaming Cloud Dataproc Spark Streaming Wrote custom Spark receiver for Cloud Pub/Sub, checkpointing WAL to local HDFS, no backpressure (yet) End-to-end Durability Built internal solutions; achieving end-to-end durability guarantees here is hard BigQuery Streaming APIs Streaming writes of O(100s of millions) of events / day into BigQuery tables
  • 24. + Uptime: GCP uptime has actually been great - Unexpected BigQuery API Changes: Google occasionally makes production changes which break us. When these happen, we can’t really do anything until fixed by Google or a workaround is identified. “Here be dragons” April 2017 BigQuery Avro Ingest API Changes Previously, a field marked as required by the Avro schema could be loaded into a table with the field marked nullable; this started failing. October 2017 BigQuery Sharded Export Changes Noticed many hung Dataproc clusters. Team identified workaround to disable BQ sharded export by setting mapred.bq.input.sharded.export.enable to false.
  • 25. Managed services make some things easier… But only some things. Go in with eyes wide open.
  • 26. Our first managed service: DynamoDB Go services Website
  • 28. GCP Migration & BigQuery Why? A few reasons... 1. Fully-managed, serverless, petabyte-scale data warehouse 2. Nested/complex data types 3. Security & access controls 4. Self-serve interface to Google Sheets
  • 29. Surprise dependencies Exception: BigQuery job failed. Final error was: { 'reason': 'invalid', 'message':"Could not parse '.' as int for field category_id (position 0) starting at location 340 ", 'location': '/gdrive/home/Category taxonomy - master file' }
  • 30. When you make it really easy to deploy infrastructure... ...you get A LOT of random infrastructure
  • 31. What does that mean for Thumbtack & Serverless?
  • 32. Hacky is going to happen. With or without you.
  • 33. Empower the team while limiting the wild west.
  • 35. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ thumbtack-scalability