SlideShare une entreprise Scribd logo
1  sur  38
Télécharger pour lire hors ligne
2017 Software Architecture Summit
Scaling100PBDataWarehousein Cloud
Changshu LIU
Software Engineer @ Pinterest
10/2017
Pinterest :
The World’s Catalog
of Ideas
2017 Software Architecture Summit
Scale@Pinterest
Biz Scale
• 200M+ MAUs
• 120B+ Pins
• 3B+ Boards
Data Scale
• 140+ PB @ S3
• 6000+ Hive/Hadoop nodes
• 400+ Presto nodes
2017 Software Architecture Summit
Data@Pinterest
Agenda
1 Data Ingestion
2 Streaming
3 Hive
4 Presto
5 Workflow
2017 Software Architecture Summit
Data Ingestion - Overview
2017 Software Architecture Summit
Data Ingestion - Logging
Singer
- Logging Agent for Kafka
- Decouple logging and kafka
- Failure Isolation
- Buffering, Pooling
Merced
- Log Persistence and Transportation Service
- Kafka -> Storage Destinations
- S3, HBase and SQS
- Auto rebalancing, failure handling
https://hortonworks.com/apache/kafka/
2017 Software Architecture Summit
Data Ingestion - Full DB Dump
Tracker
● Daily db dump: Hadoop Streaming with mysqldump
● Hadoop node <-> MySQL node traffic pains
○ Balancing & Throttling
● Performance pains
○ Remote mysqldump
○ Python streaming
2017 Software Architecture Summit
Data Ingestion - Inc DB Dump
WaterMill
● Maxwell tails MySQL bin logs and writes to Kafka
○ Tail bin logs and write to kafka as json
● Mini-batch job to apply updates on base version
○ Distribute records by db shard id
○ Small (update) table join Big (base) table
● Alias table points to the latest snapshot
2017 Software Architecture Summit
Data Ingestion - Stats
Kafka Stats
● ~150B Messages/Day
● ~400T Bytes/Day
● ~1000 Kafka Topics
● ~1400 Kafka Brokers
Doctor Kafka https://github.com/pinterest/doctorkafka
DB Dump Stats
● ~80 Table/Day
● ~100T Bytes/Day
2017 Software Architecture Summit
Data Processing Overview
Agenda
1 Data Ingestion
2 Streaming
3 Hive
4 Presto
5 Workflow
2017 Software Architecture Summit
Streaming - Overview
2017 Software Architecture Summit
Streaming - Voracity
● Schema Validation
○ Filter out malformed records
● Late Arrival Event Handling
○ Move late event to proper bucket
● Parquet Conversion
○ Json/Thrift -> Parquet
Agenda
1 Data Ingestion
2 Streaming
3 Hive
4 Presto
5 Workflow
2017 Software Architecture Summit
Hive - Overview
AdHoc Query
Scheduled Query
2017 Software Architecture Summit
Hive - Scheduled Query
INSERT OVERWRITE DIRECTORY 's3://hive_query_output/cluster_name/20171019/application_1508263025827_2301/0'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '001'
COLLECTION ITEMS TERMINATED BY '002'
MAP KEYS TERMINATED BY '003'
LINES TERMINATED BY 'n'
STORED AS TEXTFILE
SELECT country, count(*)
FROM db_users
GROUP BY country
SELECT country, count(*)
FROM db_users
GROUP BY country
Rewrite
● Query is submitted by System
● Based on Hive CLI
● Talking to MetaStore (MySQL)
● YARN REST API
○ DistShell: AppMaster + Worker
■ Hive CLI as pure shell command
○ Get stdout/stderr from NM REST API
○ Rewrite select query to insert to s3
■ Container log is not stable
2017 Software Architecture Summit
Hive - AdHoc Query
● Query is Submitted by Human
○ Through beeline/tabeau etc
● HiveServer2 is the Query Submission Service
○ via JDBC/ODBC
○ Zookeeper based HA
● Talking to Hive MetaStore Server
○ HiveServer2 is a shared resource
● LDAP for Authentication
○ Impersonation for 3rd party tool
● Hive hook to Audit all queries
2017 Software Architecture Summit
Hive - Cloud Integration
FileSinkOperator - Where S3 doesn't fit
● Kind of “2-phase commit” protocol
● Phase 1 - each task attempt will write to a unique tmp output path
hdfs://${staging_dir}/${query_id}/
_task_tmp.-ext-${stage_id}/_tmp.${task_id}_${attpt_id}
_tmp.-ext-${stage_id}/${task_id}_${attempt_id}
Other Hive Operator
FileSinkOperator
process()
closeOp()
………...
MapRed Task
Rename
Write
2017 Software Architecture Summit
Hive - Cloud Integration
FileSinkOperator - Where S3 doesn't fit
● Kind of “2-phase commit” protocol
● Phase 2 - stage controller (HiveDriver) rename query temp to final path
hdfs://${staging_dir}_${query_id}/
_task_tmp.-ext-${stage_id}/_tmp.${task_id}_${attpt_id}
_tmp.-ext-${stage_id}/
${task_id}_0
${task_id}_1
${task_id}_2
${task_id}
hdfs://${staging_dir}_${query_id}/_tmp.-ext-${stage_id}
hdfs://hive_db_name/tab_name/dt=20171026
Other Hive Stage
Output Stage
jobCloseOp()
mvFileToFinalPath
………...
HiveDriver
Invoke
2017 Software Architecture Summit
Hive - Cloud Integration
FileSinkOperator - Where S3 doesn't fit
● Fast & Atomic file/dir rename operation is the key
● Unfortunately, rename on s3 is copy & delete
2017 Software Architecture Summit
Hive - Direct Committer @ S3
Direct Output Committer
● Each task attempt writes to final output directly
○ {target folder}/{task_id}
● Consequences
○ Query Level Failure -> partial output in target folder
■ Cleanup target folder in the very beginning of each query
■ Disable INSERT INTO (using INSERT OVERWRITE Q1 UNION ALL Q2)
○ Task Level Failure -> FileAlreadyExistsException
■ File open mode CREATE -> OVERWRITE (ORC & Parquet)
○ Speculative Execution -> corrupted task output
■ Cause: outWriter.close() in killed task attempt
● S3A write to local disk first and then upload to s3 in close()
■ Solution: Disable speculative execution
2017 Software Architecture Summit
Hive - Speculative Exec @ S3
Direct Committer with Speculative Execution
● S3A Staging Committer (originated from Netflix)
○ Based on S3 multipart uploading 1
○ Decouple upload & commit, simulate write to tmp & rename to final
○ Being actively worked in Apache: HADOOP-13786
● S3A Hive Committer (originated from Pinterest, on-going)
○ Ignore the MapReduce commit protocol
○ Allow different task attempts to commit at the same time
○ Stage controller (HiveDriver) dedup multiple attempts for the same task
2017 Software Architecture Summit
Hive - Dynamic Partition @ S3
New Challenges
● Target folder path is only known at runtime
● Race condition among multiple FileSinkOperators
● Discover new dynamic partitions to update Metastore
Solutions
● Encode the ts into output file name: {target_part_folder}/{dpCtx.queryStartTs}_{task_id}
● New dynamic partition: folders that
○ Modification time > query start time
○ Contains files starting with query_start_ts
Agenda
1 Data Ingestion
2 Streaming
3 Hive
4 Presto
5 Workflow
2017 Software Architecture Summit
Presto - Overview
2017 Software Architecture Summit
Presto - Cluster & Stats
Cluster Management
● Instance type: r3.8xl
● Computing only, no local data
● Using AWS ASG
● Managed by Terraform
Presto Stats
● ~800 users
● ~10k queries daily
● 5X ~ 50X faster than Hive
● AdHoc (250 node): P90: 4min, P99: 30min
● App (150 node): P90: 2min, P99: 20min
2017 Software Architecture Summit
Presto - Feature Challenges
Concurrency Bug in ObjectInspector library
● Hive runtime is for single thread env
● Presto is highly concurrent and shared
Thrift Table Support
● Presto Hive MetaStore client doesn’t support “dynamic” table
● Deeply nested table will hang in planning phase
Schema Evolution Support
● Historical partition schema mismatch
● Small int -> big int
● Parquet table column rename
2017 Software Architecture Summit
Presto - Security Enhancement
Access Control
● Check LDAP group for proxy user for access rights
● Data Level Access is controlled by IAM role
Security Features
● Authentication
○ Integrated middle service with Impersonation
● Audit
○ Every Query is Logged
○ Plugin based on EventListener
2017 Software Architecture Summit
Presto - Cluster Operations
Separate AdHoc & Application Cluster
● Adhoc cluster is generally less stable
● App cluster only run well known queries
Bad Query Monitor
● Detected by bot based on runtime and splits #
● Kill the query and send slack message to end user
Bad Node Monitor
● Detected by bot based on query failure due to node issue stats
○ Bad node - 3 query failed on it in last 5 min
● Specially important when the cluster is scaled up & down on a daily basis
Agenda
1 Data Ingestion
2 Streaming
3 Hive
4 Presto
5 Workflow
2017 Software Architecture Summit
Workflow - Programming
Workflow Prgm Framework
● Skyline
● DataJob
● Pinflow
2017 Software Architecture Summit
Data Architecture - Manager
Workflow Manager
● Pinball
● https://github.com/pinterest/pinball
Workflow Stats
● 2000+ Workflows
● 60k+ Jobs
● ~80% Hive
● ~15% Cascading/Spark
● ~5% Python
2017 Software Architecture Summit
The World’s Catalog of Ideas
csliu@pinterest.com
Discover
Ideas
Singer Architecture
https://www.slideshare.net/DiscoverPinterest/singer-pinterests-logging-infrastructure
Merced Architecture
https://github.com/pinterest/secor
Pinball Architecture
https://github.com/pinterest/pinball

Contenu connexe

Tendances

cLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHousecLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHouseAltinity Ltd
 
Streaming ETL - from RDBMS to Dashboard with KSQL
Streaming ETL - from RDBMS to Dashboard with KSQLStreaming ETL - from RDBMS to Dashboard with KSQL
Streaming ETL - from RDBMS to Dashboard with KSQLBjoern Rost
 
20180503 kube con eu kubernetes metrics deep dive
20180503 kube con eu   kubernetes metrics deep dive20180503 kube con eu   kubernetes metrics deep dive
20180503 kube con eu kubernetes metrics deep diveBob Cotton
 
Spacecrafts Made Simple: How Loft Orbital Delivers Unparalleled Speed-to-Spac...
Spacecrafts Made Simple: How Loft Orbital Delivers Unparalleled Speed-to-Spac...Spacecrafts Made Simple: How Loft Orbital Delivers Unparalleled Speed-to-Spac...
Spacecrafts Made Simple: How Loft Orbital Delivers Unparalleled Speed-to-Spac...InfluxData
 
ClickHouse Paris Meetup. ClickHouse Analytical DBMS, Introduction. By Alexand...
ClickHouse Paris Meetup. ClickHouse Analytical DBMS, Introduction. By Alexand...ClickHouse Paris Meetup. ClickHouse Analytical DBMS, Introduction. By Alexand...
ClickHouse Paris Meetup. ClickHouse Analytical DBMS, Introduction. By Alexand...Altinity Ltd
 
stackconf 2020 | Ignite talk: Opensource in Advanced Research Computing, How ...
stackconf 2020 | Ignite talk: Opensource in Advanced Research Computing, How ...stackconf 2020 | Ignite talk: Opensource in Advanced Research Computing, How ...
stackconf 2020 | Ignite talk: Opensource in Advanced Research Computing, How ...NETWAYS
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...Altinity Ltd
 
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark WuVirtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark WuFlink Forward
 
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Flink Forward
 
Time series denver an introduction to prometheus
Time series denver   an introduction to prometheusTime series denver   an introduction to prometheus
Time series denver an introduction to prometheusBob Cotton
 
Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne...
Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne...Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne...
Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne...confluent
 
Sprint 38 review
Sprint 38 reviewSprint 38 review
Sprint 38 reviewManageIQ
 
Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017
Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017
Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017Bob Cotton
 
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Taro L. Saito
 
Maximilian Michels - Flink and Beam
Maximilian Michels - Flink and BeamMaximilian Michels - Flink and Beam
Maximilian Michels - Flink and BeamFlink Forward
 
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami...
Flink Forward SF 2017:  Cliff Resnick & Seth Wiesman -   From Zero to Streami...Flink Forward SF 2017:  Cliff Resnick & Seth Wiesman -   From Zero to Streami...
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami...Flink Forward
 
Flink Forward San Francisco 2018: David Reniz & Dahyr Vergara - "Real-time m...
Flink Forward San Francisco 2018:  David Reniz & Dahyr Vergara - "Real-time m...Flink Forward San Francisco 2018:  David Reniz & Dahyr Vergara - "Real-time m...
Flink Forward San Francisco 2018: David Reniz & Dahyr Vergara - "Real-time m...Flink Forward
 
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Flink Forward
 
Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann
Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann
Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann Danny Abukalam
 

Tendances (20)

cLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHousecLoki: Like Loki but for ClickHouse
cLoki: Like Loki but for ClickHouse
 
Norikra Recent Updates
Norikra Recent UpdatesNorikra Recent Updates
Norikra Recent Updates
 
Streaming ETL - from RDBMS to Dashboard with KSQL
Streaming ETL - from RDBMS to Dashboard with KSQLStreaming ETL - from RDBMS to Dashboard with KSQL
Streaming ETL - from RDBMS to Dashboard with KSQL
 
20180503 kube con eu kubernetes metrics deep dive
20180503 kube con eu   kubernetes metrics deep dive20180503 kube con eu   kubernetes metrics deep dive
20180503 kube con eu kubernetes metrics deep dive
 
Spacecrafts Made Simple: How Loft Orbital Delivers Unparalleled Speed-to-Spac...
Spacecrafts Made Simple: How Loft Orbital Delivers Unparalleled Speed-to-Spac...Spacecrafts Made Simple: How Loft Orbital Delivers Unparalleled Speed-to-Spac...
Spacecrafts Made Simple: How Loft Orbital Delivers Unparalleled Speed-to-Spac...
 
ClickHouse Paris Meetup. ClickHouse Analytical DBMS, Introduction. By Alexand...
ClickHouse Paris Meetup. ClickHouse Analytical DBMS, Introduction. By Alexand...ClickHouse Paris Meetup. ClickHouse Analytical DBMS, Introduction. By Alexand...
ClickHouse Paris Meetup. ClickHouse Analytical DBMS, Introduction. By Alexand...
 
stackconf 2020 | Ignite talk: Opensource in Advanced Research Computing, How ...
stackconf 2020 | Ignite talk: Opensource in Advanced Research Computing, How ...stackconf 2020 | Ignite talk: Opensource in Advanced Research Computing, How ...
stackconf 2020 | Ignite talk: Opensource in Advanced Research Computing, How ...
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark WuVirtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu
 
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
 
Time series denver an introduction to prometheus
Time series denver   an introduction to prometheusTime series denver   an introduction to prometheus
Time series denver an introduction to prometheus
 
Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne...
Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne...Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne...
Kafka Connect: Operational Lessons Learned from the Trenches (Elizabeth Benne...
 
Sprint 38 review
Sprint 38 reviewSprint 38 review
Sprint 38 review
 
Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017
Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017
Kubernetes Colorado - Kubernetes metrics deep dive 10/25/2017
 
Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015Presto @ Treasure Data - Presto Meetup Boston 2015
Presto @ Treasure Data - Presto Meetup Boston 2015
 
Maximilian Michels - Flink and Beam
Maximilian Michels - Flink and BeamMaximilian Michels - Flink and Beam
Maximilian Michels - Flink and Beam
 
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami...
Flink Forward SF 2017:  Cliff Resnick & Seth Wiesman -   From Zero to Streami...Flink Forward SF 2017:  Cliff Resnick & Seth Wiesman -   From Zero to Streami...
Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streami...
 
Flink Forward San Francisco 2018: David Reniz & Dahyr Vergara - "Real-time m...
Flink Forward San Francisco 2018:  David Reniz & Dahyr Vergara - "Real-time m...Flink Forward San Francisco 2018:  David Reniz & Dahyr Vergara - "Real-time m...
Flink Forward San Francisco 2018: David Reniz & Dahyr Vergara - "Real-time m...
 
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
Time to-live: How to Perform Automatic State Cleanup in Apache Flink - Andrey...
 
Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann
Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann
Matt Jarvis - Unravelling Logs: Log Processing with Logstash and Riemann
 

Similaire à Scaling 100PB Data Warehouse in Cloud

Dsdt meetup 2017 11-21
Dsdt meetup 2017 11-21Dsdt meetup 2017 11-21
Dsdt meetup 2017 11-21JDA Labs MTL
 
DSDT Meetup Nov 2017
DSDT Meetup Nov 2017DSDT Meetup Nov 2017
DSDT Meetup Nov 2017DSDT_MTL
 
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...Flink Forward
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapItai Yaffe
 
Massively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHPMassively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHPDemin Yin
 
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd についてKubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd についてLINE Corporation
 
Heroku to Kubernetes & Gihub to Gitlab success story
Heroku to Kubernetes & Gihub to Gitlab success storyHeroku to Kubernetes & Gihub to Gitlab success story
Heroku to Kubernetes & Gihub to Gitlab success storyJérémy Wimsingues
 
PyConIE 2017 Writing and deploying serverless python applications
PyConIE 2017 Writing and deploying serverless python applicationsPyConIE 2017 Writing and deploying serverless python applications
PyConIE 2017 Writing and deploying serverless python applicationsCesar Cardenas Desales
 
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with KubernetesKubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with KubernetesSeungYong Oh
 
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...Brittany Ingram
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPRCamille Salas
 
Writing and deploying serverless python applications
Writing and deploying serverless python applicationsWriting and deploying serverless python applications
Writing and deploying serverless python applicationsCesar Cardenas Desales
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016aspyker
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Sharma Podila
 
Improving Apache Spark Downscaling
 Improving Apache Spark Downscaling Improving Apache Spark Downscaling
Improving Apache Spark DownscalingDatabricks
 
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...NETWAYS
 

Similaire à Scaling 100PB Data Warehouse in Cloud (20)

Dsdt meetup 2017 11-21
Dsdt meetup 2017 11-21Dsdt meetup 2017 11-21
Dsdt meetup 2017 11-21
 
DSDT Meetup Nov 2017
DSDT Meetup Nov 2017DSDT Meetup Nov 2017
DSDT Meetup Nov 2017
 
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...
Flink Forward Berlin 2017: Mihail Vieru - A Materialization Engine for Data I...
 
Sprint 78
Sprint 78Sprint 78
Sprint 78
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's Roadmap
 
Sprint 17
Sprint 17Sprint 17
Sprint 17
 
Massively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHPMassively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHP
 
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd についてKubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
Kubernetes上で動作する機械学習モジュールの配信&管理基盤Rekcurd について
 
Heroku to Kubernetes & Gihub to Gitlab success story
Heroku to Kubernetes & Gihub to Gitlab success storyHeroku to Kubernetes & Gihub to Gitlab success story
Heroku to Kubernetes & Gihub to Gitlab success story
 
PyConIE 2017 Writing and deploying serverless python applications
PyConIE 2017 Writing and deploying serverless python applicationsPyConIE 2017 Writing and deploying serverless python applications
PyConIE 2017 Writing and deploying serverless python applications
 
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with KubernetesKubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes
Kubernetes Forum Seoul 2019: Re-architecting Data Platform with Kubernetes
 
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
How Docker Accelerates Continuous Development at ironSource: Containers #101 ...
 
Sprint 62
Sprint 62Sprint 62
Sprint 62
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR
 
Writing and deploying serverless python applications
Writing and deploying serverless python applicationsWriting and deploying serverless python applications
Writing and deploying serverless python applications
 
Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016Netflix Container Scheduling and Execution - QCon New York 2016
Netflix Container Scheduling and Execution - QCon New York 2016
 
Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016Scheduling a fuller house - Talk at QCon NY 2016
Scheduling a fuller house - Talk at QCon NY 2016
 
Sprint 66
Sprint 66Sprint 66
Sprint 66
 
Improving Apache Spark Downscaling
 Improving Apache Spark Downscaling Improving Apache Spark Downscaling
Improving Apache Spark Downscaling
 
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
OSDC 2018 | Hardware-level data-center monitoring with Prometheus by Conrad H...
 

Dernier

Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 

Dernier (20)

Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 

Scaling 100PB Data Warehouse in Cloud

  • 1. 2017 Software Architecture Summit Scaling100PBDataWarehousein Cloud Changshu LIU Software Engineer @ Pinterest 10/2017
  • 2. Pinterest : The World’s Catalog of Ideas
  • 3. 2017 Software Architecture Summit Scale@Pinterest Biz Scale • 200M+ MAUs • 120B+ Pins • 3B+ Boards Data Scale • 140+ PB @ S3 • 6000+ Hive/Hadoop nodes • 400+ Presto nodes
  • 4. 2017 Software Architecture Summit Data@Pinterest
  • 5. Agenda 1 Data Ingestion 2 Streaming 3 Hive 4 Presto 5 Workflow
  • 6. 2017 Software Architecture Summit Data Ingestion - Overview
  • 7. 2017 Software Architecture Summit Data Ingestion - Logging Singer - Logging Agent for Kafka - Decouple logging and kafka - Failure Isolation - Buffering, Pooling Merced - Log Persistence and Transportation Service - Kafka -> Storage Destinations - S3, HBase and SQS - Auto rebalancing, failure handling https://hortonworks.com/apache/kafka/
  • 8. 2017 Software Architecture Summit Data Ingestion - Full DB Dump Tracker ● Daily db dump: Hadoop Streaming with mysqldump ● Hadoop node <-> MySQL node traffic pains ○ Balancing & Throttling ● Performance pains ○ Remote mysqldump ○ Python streaming
  • 9. 2017 Software Architecture Summit Data Ingestion - Inc DB Dump WaterMill ● Maxwell tails MySQL bin logs and writes to Kafka ○ Tail bin logs and write to kafka as json ● Mini-batch job to apply updates on base version ○ Distribute records by db shard id ○ Small (update) table join Big (base) table ● Alias table points to the latest snapshot
  • 10. 2017 Software Architecture Summit Data Ingestion - Stats Kafka Stats ● ~150B Messages/Day ● ~400T Bytes/Day ● ~1000 Kafka Topics ● ~1400 Kafka Brokers Doctor Kafka https://github.com/pinterest/doctorkafka DB Dump Stats ● ~80 Table/Day ● ~100T Bytes/Day
  • 11. 2017 Software Architecture Summit Data Processing Overview
  • 12. Agenda 1 Data Ingestion 2 Streaming 3 Hive 4 Presto 5 Workflow
  • 13. 2017 Software Architecture Summit Streaming - Overview
  • 14. 2017 Software Architecture Summit Streaming - Voracity ● Schema Validation ○ Filter out malformed records ● Late Arrival Event Handling ○ Move late event to proper bucket ● Parquet Conversion ○ Json/Thrift -> Parquet
  • 15. Agenda 1 Data Ingestion 2 Streaming 3 Hive 4 Presto 5 Workflow
  • 16. 2017 Software Architecture Summit Hive - Overview AdHoc Query Scheduled Query
  • 17. 2017 Software Architecture Summit Hive - Scheduled Query INSERT OVERWRITE DIRECTORY 's3://hive_query_output/cluster_name/20171019/application_1508263025827_2301/0' ROW FORMAT DELIMITED FIELDS TERMINATED BY '001' COLLECTION ITEMS TERMINATED BY '002' MAP KEYS TERMINATED BY '003' LINES TERMINATED BY 'n' STORED AS TEXTFILE SELECT country, count(*) FROM db_users GROUP BY country SELECT country, count(*) FROM db_users GROUP BY country Rewrite ● Query is submitted by System ● Based on Hive CLI ● Talking to MetaStore (MySQL) ● YARN REST API ○ DistShell: AppMaster + Worker ■ Hive CLI as pure shell command ○ Get stdout/stderr from NM REST API ○ Rewrite select query to insert to s3 ■ Container log is not stable
  • 18. 2017 Software Architecture Summit Hive - AdHoc Query ● Query is Submitted by Human ○ Through beeline/tabeau etc ● HiveServer2 is the Query Submission Service ○ via JDBC/ODBC ○ Zookeeper based HA ● Talking to Hive MetaStore Server ○ HiveServer2 is a shared resource ● LDAP for Authentication ○ Impersonation for 3rd party tool ● Hive hook to Audit all queries
  • 19. 2017 Software Architecture Summit Hive - Cloud Integration FileSinkOperator - Where S3 doesn't fit ● Kind of “2-phase commit” protocol ● Phase 1 - each task attempt will write to a unique tmp output path hdfs://${staging_dir}/${query_id}/ _task_tmp.-ext-${stage_id}/_tmp.${task_id}_${attpt_id} _tmp.-ext-${stage_id}/${task_id}_${attempt_id} Other Hive Operator FileSinkOperator process() closeOp() ………... MapRed Task Rename Write
  • 20. 2017 Software Architecture Summit Hive - Cloud Integration FileSinkOperator - Where S3 doesn't fit ● Kind of “2-phase commit” protocol ● Phase 2 - stage controller (HiveDriver) rename query temp to final path hdfs://${staging_dir}_${query_id}/ _task_tmp.-ext-${stage_id}/_tmp.${task_id}_${attpt_id} _tmp.-ext-${stage_id}/ ${task_id}_0 ${task_id}_1 ${task_id}_2 ${task_id} hdfs://${staging_dir}_${query_id}/_tmp.-ext-${stage_id} hdfs://hive_db_name/tab_name/dt=20171026 Other Hive Stage Output Stage jobCloseOp() mvFileToFinalPath ………... HiveDriver Invoke
  • 21. 2017 Software Architecture Summit Hive - Cloud Integration FileSinkOperator - Where S3 doesn't fit ● Fast & Atomic file/dir rename operation is the key ● Unfortunately, rename on s3 is copy & delete
  • 22. 2017 Software Architecture Summit Hive - Direct Committer @ S3 Direct Output Committer ● Each task attempt writes to final output directly ○ {target folder}/{task_id} ● Consequences ○ Query Level Failure -> partial output in target folder ■ Cleanup target folder in the very beginning of each query ■ Disable INSERT INTO (using INSERT OVERWRITE Q1 UNION ALL Q2) ○ Task Level Failure -> FileAlreadyExistsException ■ File open mode CREATE -> OVERWRITE (ORC & Parquet) ○ Speculative Execution -> corrupted task output ■ Cause: outWriter.close() in killed task attempt ● S3A write to local disk first and then upload to s3 in close() ■ Solution: Disable speculative execution
  • 23. 2017 Software Architecture Summit Hive - Speculative Exec @ S3 Direct Committer with Speculative Execution ● S3A Staging Committer (originated from Netflix) ○ Based on S3 multipart uploading 1 ○ Decouple upload & commit, simulate write to tmp & rename to final ○ Being actively worked in Apache: HADOOP-13786 ● S3A Hive Committer (originated from Pinterest, on-going) ○ Ignore the MapReduce commit protocol ○ Allow different task attempts to commit at the same time ○ Stage controller (HiveDriver) dedup multiple attempts for the same task
  • 24. 2017 Software Architecture Summit Hive - Dynamic Partition @ S3 New Challenges ● Target folder path is only known at runtime ● Race condition among multiple FileSinkOperators ● Discover new dynamic partitions to update Metastore Solutions ● Encode the ts into output file name: {target_part_folder}/{dpCtx.queryStartTs}_{task_id} ● New dynamic partition: folders that ○ Modification time > query start time ○ Contains files starting with query_start_ts
  • 25. Agenda 1 Data Ingestion 2 Streaming 3 Hive 4 Presto 5 Workflow
  • 26. 2017 Software Architecture Summit Presto - Overview
  • 27. 2017 Software Architecture Summit Presto - Cluster & Stats Cluster Management ● Instance type: r3.8xl ● Computing only, no local data ● Using AWS ASG ● Managed by Terraform Presto Stats ● ~800 users ● ~10k queries daily ● 5X ~ 50X faster than Hive ● AdHoc (250 node): P90: 4min, P99: 30min ● App (150 node): P90: 2min, P99: 20min
  • 28. 2017 Software Architecture Summit Presto - Feature Challenges Concurrency Bug in ObjectInspector library ● Hive runtime is for single thread env ● Presto is highly concurrent and shared Thrift Table Support ● Presto Hive MetaStore client doesn’t support “dynamic” table ● Deeply nested table will hang in planning phase Schema Evolution Support ● Historical partition schema mismatch ● Small int -> big int ● Parquet table column rename
  • 29. 2017 Software Architecture Summit Presto - Security Enhancement Access Control ● Check LDAP group for proxy user for access rights ● Data Level Access is controlled by IAM role Security Features ● Authentication ○ Integrated middle service with Impersonation ● Audit ○ Every Query is Logged ○ Plugin based on EventListener
  • 30. 2017 Software Architecture Summit Presto - Cluster Operations Separate AdHoc & Application Cluster ● Adhoc cluster is generally less stable ● App cluster only run well known queries Bad Query Monitor ● Detected by bot based on runtime and splits # ● Kill the query and send slack message to end user Bad Node Monitor ● Detected by bot based on query failure due to node issue stats ○ Bad node - 3 query failed on it in last 5 min ● Specially important when the cluster is scaled up & down on a daily basis
  • 31. Agenda 1 Data Ingestion 2 Streaming 3 Hive 4 Presto 5 Workflow
  • 32. 2017 Software Architecture Summit Workflow - Programming Workflow Prgm Framework ● Skyline ● DataJob ● Pinflow
  • 33. 2017 Software Architecture Summit Data Architecture - Manager Workflow Manager ● Pinball ● https://github.com/pinterest/pinball Workflow Stats ● 2000+ Workflows ● 60k+ Jobs ● ~80% Hive ● ~15% Cascading/Spark ● ~5% Python
  • 34. 2017 Software Architecture Summit The World’s Catalog of Ideas csliu@pinterest.com