Scaling Apache Spark at
Facebook
Sameer Agarwal & Ankit Agarwal
Spark Summit | San Francisco | 24th April 2019
Sameer Agarwal
- Software Engineer at Facebook (Data Warehouse Team)
- Apache Spark Committer (Spark Core/SQL)
- Previously at Databricks and UC Berkeley
Ankit Agarwal
- Production Engineering Manager at Facebook (Data Warehouse Team)
- Data Infrastructure Team at Facebook since 2012
- Previously worked on the search team at Yahoo!
About Us
1. Spark at Facebook
2. Hardware Trends: A tale of two bottlenecks
3. Evolving the Core Engine
- History Based Tuning
- Join Optimizations
4. Our Users and their Use-cases
5. The Road Ahead
Agenda
Data at Facebook
2.7 Billion MAU
2 Billion DAU
Source: Facebook Q4 2018 earnings call transcript
2015: Small-scale experiments
2016: A few pipelines in production
2017: Running 60TB+ shuffle pipelines
2018: Full-production deployment; successor to Apache Hive at Facebook
2019: Scaling Spark; largest compute engine at Facebook by CPU
The Journey
Hardware Trends
CPU, DRAM, and Disk
1. The industry is optimizing for
throughput by adding more cores
2. To optimize performance/watt,
next generation processors will have
more cores that run at lower
frequency
Hardware Trends
CPU, DRAM, and Disk
1. The price of DRAM continued to rise
throughout 2016-2018 and has
started fluctuating this year
2. Need to reduce our over-dependence on DRAM
Hardware Trends
CPU, DRAM, and Disk
1. Disk sizes continue to increase,
but the number of random
accesses per second isn't
increasing
2. IOPS becomes a bottleneck
What does this mean for Spark?
1. Optimize Spark for increasing core-memory ratio
2. Run Spark on disaggregated compute/storage clusters
- Use server types optimized for compute and storage
- Scale/upgrade clusters independently over time, depending
on whether CPU or IOPS is the bottleneck
3. Scale extremely diverse workloads (SQL, ML, etc.) on Spark
over clusters of tens of thousands of heterogeneous
machines
Spark Architecture at Facebook
[Diagram: a Compute Cluster (Executor #1, Executor #2) and a Storage Cluster (Distributed FS instances #1, #2, #3); other labeled elements: Spill, Cache, Shuffle; Tangram Scheduler; Heterogeneous Hardware (purchased over 0-5 years)]
Brian Cho and Dmitry Borovsky, Cosco: An Efficient Facebook-Scale Shuffle Service
Today at 4:30PM (Developer Track)
Rui Jian and Hao Lin, Tangram: Distributed Scheduling for Spark at Facebook
Tomorrow at 11:50AM (Developer Track)
Contributed 100+
patches upstream
History-Based Tuning: Motivation
[Chart: cluster memory utilization over 1 week; max 80-100%, p95 55-70%, p50 10-60%]
One-size-fits-all configs result in under-utilization of resources
History-Based Tuning: Motivation
[Chart: percentage of Spark tasks (CDF) vs. peak execution memory bytes]
75% of Spark tasks use less than 600 MB of peak execution memory
Resource requirements vary widely across individual Spark tasks
History-Based Tuning
1. Need to tune Spark on a per-job or a per-stage basis
2. Leverage historical characteristics of the job to tune resources:
• Peak executor memory and spill sizes to tune executor off-heap
memory
• Shuffle size to optionally not insert partial aggregates in the query plan
• Predicting the number of shuffle partitions (job level and stage level)
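A minimal sketch of how historical run metrics might be turned into per-job settings. This is not Facebook's implementation: the function names, the 20% headroom, the 512 MB floor, and the 256 MiB target partition size are all assumptions chosen for illustration.

```python
def suggest_memory_mb(peak_memory_history_mb, headroom_pct=20, floor_mb=512):
    """Suggest a memory setting from past peaks (hypothetical rule).

    Takes the largest observed peak and adds a safety margin.
    Returns None when there is no history, so the caller can fall
    back to its defaults.
    """
    if not peak_memory_history_mb:
        return None
    peak = max(peak_memory_history_mb)
    # Integer ceiling of peak * (100 + headroom_pct) / 100.
    with_headroom = -(-peak * (100 + headroom_pct) // 100)
    return max(floor_mb, with_headroom)


def suggest_shuffle_partitions(shuffle_bytes_history,
                               target_partition_bytes=256 * 1024**2):
    """Suggest a shuffle partition count from past shuffle sizes (hypothetical rule)."""
    if not shuffle_bytes_history:
        return None
    largest = max(shuffle_bytes_history)
    # Integer ceiling division, with at least one partition.
    return max(1, -(-largest // target_partition_bytes))


memory_mb = suggest_memory_mb([400, 600, 550])
partitions = suggest_shuffle_partitions([10 * 1024**3])
```

Sizing from the largest observed value is deliberately conservative; a percentile-based rule would trade some reliability for tighter packing.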
History-Based Tuning
New
Query
Query Plan
Template
InsertIntoHiveTable [partitions: ds,country]
+- *Project [cast(key as int) AS key, value]
   +- *HiveTableScan (db.test) [col: key,value] [part: ds]
History-Based Tuning
[Flow diagram] New Query -> Query Plan Template -> Historical Job Runs -> Config Override Rules
- No regressions/failures in the past N days: apply config overrides
- Regressions/failures in the past N days: apply conservative defaults
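The decision flow on this slide could be sketched roughly as follows. The normalization rules, the hashing step, and the choice to treat a query with no history as "use conservative defaults" are assumptions made for illustration; the slides do not say how the template key is actually computed.

```python
import hashlib
import re


def plan_template(plan_text):
    """Reduce a query plan string to a key shared by recurring runs.

    Hypothetical normalization: quoted literals and bare integers are
    replaced with placeholders, so daily runs that differ only in a
    date or a limit map to the same template.
    """
    normalized = re.sub(r"'[^']*'", "'?'", plan_text)
    normalized = re.sub(r"\b\d+\b", "?", normalized)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def choose_config(recent_runs_failed, overrides, defaults):
    """Pick the config for a new run of a known template.

    recent_runs_failed holds one boolean per run in the past N days,
    True meaning the run regressed or failed.
    """
    if recent_runs_failed and not any(recent_runs_failed):
        # Clean history: apply the tuned overrides on top of the defaults.
        return {**defaults, **overrides}
    # Any regression or failure, or no history at all: stay conservative.
    return dict(defaults)


key_monday = plan_template("Filter ds = '2019-04-22' Limit 10")
key_tuesday = plan_template("Filter ds = '2019-04-23' Limit 20")
config = choose_config([False, False], {"partitions": 400}, {"partitions": 200})
```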
1. Broadcast Join: Broadcast small table to all nodes, stream
the larger table; skew resistant
2. Shuffle-Hash Join: Shuffle both tables, create a hashmap
with smaller table and stream the larger table
3. Sort-Merge Join: Shuffle and sort both tables, buffer one
side and stream the other side
Joins in Spark
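To make the difference between the hash-based and sort-based strategies concrete, here is a single-process sketch of the per-task work for an inner join. It is plain Python over lists of (key, value) pairs, not Spark code, and it leaves out the shuffle or broadcast step that moves matching keys to the same task.

```python
from collections import defaultdict


def hash_join(small, large):
    """Build a hash map on the smaller side, then stream the larger side."""
    table = defaultdict(list)
    for key, value in small:
        table[key].append(value)
    for key, value in large:
        for match in table.get(key, ()):
            yield key, match, value


def sort_merge_join(left, right):
    """Sort both sides, buffer one run of equal keys from the right, stream the left."""
    left = sorted(left, key=lambda kv: kv[0])
    right = sorted(right, key=lambda kv: kv[0])
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the end of the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Stream every left row with this key against the buffered run.
            while i < len(left) and left[i][0] == left_key:
                for _, right_value in right[j:j_end]:
                    yield left_key, left[i][1], right_value
                i += 1
            j = j_end


small = [(1, "a"), (2, "b")]
large = [(1, "x"), (1, "y"), (3, "z")]
hash_rows = list(hash_join(small, large))
merge_rows = list(sort_merge_join(small, large))
```

The hash variant needs the whole smaller side in memory, while the sort-merge variant only buffers one run of equal keys at a time, which is why it is the safer choice for large inputs.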
1. Bucketing is a way to shuffle (and optionally sort) output data
based on certain columns of a table
2. Ideal for write-once, read-many datasets
3. Variant of Sort Merge Join in Spark; overrides
outputPartitioning and outputOrdering for
HiveTableScanExec and stitches partitioning/ordering
metadata throughout the query plan
Sort-Merge-Bucket (SMB) Join
SPARK-19256
A hybrid join algorithm wherein each task starts off by
executing a shuffle-hash join. If, during execution,
the hash table exceeds a certain size (and risks an OOM),
the task automatically reconstructs/sorts the iterators and falls
back to a sort-merge join
Dynamic Join
SPARK-21505
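The fallback behavior described for the Dynamic Join can be illustrated with a small single-process sketch. It is not the SPARK-21505 patch: the row-count limit stands in for a memory-size check, and the function names are hypothetical.

```python
def dynamic_join(build, stream, max_build_rows):
    """Inner-join two lists of (key, value) pairs, returning (strategy, rows).

    Starts as a hash join on the build side; if the hash table would
    grow past max_build_rows, it is discarded and the join is redone
    as a sort-merge join.
    """
    table = {}
    for count, (key, value) in enumerate(build, start=1):
        if count > max_build_rows:
            return "sort-merge", _sort_merge(build, stream)
        table.setdefault(key, []).append(value)
    rows = [(k, b, s) for k, s in stream for b in table.get(k, [])]
    return "shuffle-hash", rows


def _sort_merge(left, right):
    """Sort both sides and merge runs of equal keys."""
    left = sorted(left, key=lambda kv: kv[0])
    right = sorted(right, key=lambda kv: kv[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            key = left[i][0]
            j_end = j
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            while i < len(left) and left[i][0] == key:
                out.extend((key, left[i][1], rv) for _, rv in right[j:j_end])
                i += 1
            j = j_end
    return out


build = [(1, "a"), (2, "b")]
stream = [(1, "x"), (2, "y"), (2, "z")]
fits = dynamic_join(build, stream, max_build_rows=10)
falls_back = dynamic_join(build, stream, max_build_rows=1)
```

Both paths produce the same joined rows; only the strategy label and the memory profile differ.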
Skew Join
A hybrid join algorithm that processes skewed keys via a
broadcast join and non-skewed keys via a shuffle-hash
or sort-merge join
SELECT /*+ SKEWED_ON(a.userid='10001') */ a.userid
FROM table_A a INNER JOIN table_B b
ON a.userid = b.userid
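A rough single-process sketch of the split that the SKEWED_ON hint asks for: rows with the listed keys take a broadcast-style path and all other rows take a regular hash join. The function name and data are hypothetical, and real execution would distribute each path across tasks.

```python
def skew_join(left, right, skewed_keys):
    """Inner-join two lists of (key, value) pairs, handling skewed keys separately."""
    skewed_keys = set(skewed_keys)

    # Broadcast path: the right-side rows for skewed keys form a small
    # table that every task holding skewed left rows could receive in full.
    broadcast = {}
    # Regular path: everything else, joined by key as usual.
    regular = {}
    for key, value in right:
        target = broadcast if key in skewed_keys else regular
        target.setdefault(key, []).append(value)

    skewed_rows = [(k, lv, rv) for k, lv in left if k in skewed_keys
                   for rv in broadcast.get(k, [])]
    regular_rows = [(k, lv, rv) for k, lv in left if k not in skewed_keys
                    for rv in regular.get(k, [])]
    return skewed_rows + regular_rows


table_a = [("10001", "a1"), ("10001", "a2"), ("7", "a3")]
table_b = [("10001", "b1"), ("7", "b2"), ("8", "b3")]
joined = skew_join(table_a, table_b, skewed_keys=["10001"])
```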
Data Scientists (10%)
Data Engineers (15%)
Software Engineers (60%)
Others (15%)
Who uses Spark?
Error Classification
• System vs. User
• Retryability
• Root Cause
Showing actionable error messages
Automatic Error Classification
aka Failure Attribution
How is Spark used?
Diversity of Workload
[Chart: share of workload by Query Count and by CPU, values in that order]
Pure SQL: 54%, 72%
UDF & Transforms: 45%, 20%
DataFrames: 1%, 8%
Data Driven Decisions
Standardized Testing
Change X -> Standardized Tests -> Log Metrics -> Evaluate Results
Data Driven Decisions
Shadow Testing
Change X -> Create a tag -> Shadow Testing (tag-based selection) -> Log Metrics -> Evaluate Results
• New Features
• Regular Releases
• Configuration Updates
• Hardware Testing
Where do we use it?
Workload Prioritization
[Diagram: a Spark Cluster shared by Team 1 and Team 2 with a 60%/40% split; queues: Backfill, Pipelines, Fastlane, Interactive; scheduling policies shown: DRF, FIFO, User Fair Share]
• Hard limits on config values
• Capacity Quotas (Storage and Compute)
• Strict resource limits (containerization)
Defensive Deployment
Guardrails for us (and users)
Resource Limits
Cgroup v2
Spark Executor
/cgroup2/task_container/exec1
Memory Oversubscription
Finding the balance
/cgroup2/task_container/  memory.max = 40 GB
  exec1  memory.max = 12 GB
  exec2  memory.max = 12 GB
  exec3  memory.max = 12 GB
  exec4  memory.max = 12 GB
A tale of two resources
[Chart: resource utilization (%) for CPU and Memory, Mar 2nd – Mar 10th 2019]
A tale of THREE resources
Or my love-hate relationship with cgroups
[Chart: resource utilization (%) for CPU, Memory, and Disk IO, Mar 2nd – Mar 10th 2019]
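Memory Oversubscription
The full picture
/cgroup2/task_container/  memory.max = 40 GB
  exec1  memory.max = 12 GB, memory.high = 10 GB
  exec2  memory.max = 12 GB, memory.high = 10 GB
  exec3  memory.max = 12 GB, memory.high = 10 GB
  exec4  memory.max = 12 GB, memory.high = 10 GB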
Memory.what?
memory.high is the memory usage throttle limit. This is the main mechanism to control
a cgroup’s memory use. If a cgroup's memory use goes over the high boundary specified
here, the cgroup’s processes are throttled and put under heavy reclaim pressure. The
default is max, meaning there is no limit.
memory.max is the memory usage hard limit, acting as the final protection mechanism:
If a cgroup's memory usage reaches this limit and can't be reduced, the system OOM
killer is invoked on the cgroup.
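The two limits can be pictured with a toy model, using the numbers from the oversubscription slides (four executors at 12 GB each under a 40 GB parent). This only mirrors the definitions quoted above; it does not read or write real cgroup files, and the function names and state labels are made up for illustration.

```python
def memory_state(usage_gb, high_gb, max_gb):
    """Classify a cgroup's usage against its memory.high and memory.max values."""
    if usage_gb >= max_gb:
        # Hard limit reached: if usage cannot be reduced, the OOM killer is invoked.
        return "at hard limit"
    if usage_gb > high_gb:
        # Over the throttle limit: processes are throttled and put under reclaim pressure.
        return "throttled"
    return "ok"


def oversubscription_ratio(child_max_gb, parent_max_gb):
    """Sum of the children's memory.max divided by the parent's memory.max."""
    return sum(child_max_gb) / parent_max_gb


ratio = oversubscription_ratio([12, 12, 12, 12], 40)
state = memory_state(usage_gb=11, high_gb=10, max_gb=12)
```

A ratio above 1.0 means the executors cannot all reach their own hard limit at once, which is the bet that oversubscription makes.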
Memory Pressure?
[Chart: memory pressure, with the memory.max and memory.high levels marked]
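Memory Oversubscription
The full picture
/cgroup2/task_container/  memory.max = 40 GB
  exec1  memory.max = 12 GB, memory.high = 4 GB
  exec2  memory.max = 12 GB, memory.high = 4 GB
  exec3  memory.max = 12 GB, memory.high = 4 GB
  exec4  memory.max = 12 GB, memory.high = 4 GB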
Memory Pressure?
[Chart: memory pressure, with the memory.max and memory.high levels marked; annotated "Thrashing"]
• Our cgroup configuration was wrong
• History Based scheduling
So… What happened?
• Cgroups configuration can be tricky
• Find the right balance between efficiency and reliability
• Bonus: Better resource control on IO
Takeaways
• Scaling Spark 10X
• Redefining “Warehouse”
• Beyond SQL
The Road Ahead
INFRASTRUCTURE
Sameer Agarwal: sag@fb.com
Ankit Agarwal: ankitag@fb.com

 
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 

Scaling Apache Spark at Facebook

  • 1. Scaling Apache Spark at Facebook Sameer Agarwal & Ankit Agarwal Spark Summit | San Francisco | 24th April 2019
  • 2. About Us: Sameer Agarwal: Software Engineer at Facebook (Data Warehouse Team); Apache Spark Committer (Spark Core/SQL); previously at Databricks and UC Berkeley. Ankit Agarwal: Production Engineering Manager at Facebook (Data Warehouse Team); Data Infrastructure Team at Facebook since 2012; previously worked on the search team at Yahoo!
  • 3. Agenda: 1. Spark at Facebook; 2. Hardware Trends: A tale of two bottlenecks; 3. Evolving the Core Engine (History Based Tuning, Join Optimizations); 4. Our Users and their Use-cases; 5. The Road Ahead
  • 6. Data at Facebook: 2.7 Billion MAU, 2 Billion DAU (Source: Facebook Q4 2018 earnings call transcript)
  • 7. The Journey: 2015: Small Scale Experiments; 2016: Few Pipelines in Production; 2017: Running 60TB+ shuffle pipelines; 2018: Full-production deployment, Successor to Apache Hive at Facebook; 2019: Scaling Spark, Largest Compute Engine at Facebook by CPU
  • 10. Hardware Trends (CPU, DRAM, and Disk): 1. The industry is optimizing for throughput by adding more cores. 2. To optimize performance/watt, next-generation processors will have more cores that run at a lower frequency.
  • 11. Hardware Trends (CPU, DRAM, and Disk): 1. The price of DRAM continued to rise throughout 2016-2018 and has started fluctuating this year. 2. We need to reduce our over-dependence on DRAM.
  • 12. Hardware Trends (CPU, DRAM, and Disk): 1. Disk sizes continue to increase, but the number of random accesses per second isn't increasing. 2. IOPS becomes a bottleneck.
  • 13. What does this mean for Spark? 1. Optimize Spark for an increasing core-to-memory ratio. 2. Run Spark on disaggregated compute/storage clusters: use server types optimized for compute and storage; scale/upgrade clusters independently over time depending on whether CPU or IOPS is the bottleneck. 3. Scale extremely diverse workloads (SQL, ML, etc.) on Spark over clusters of tens of thousands of heterogeneous machines.
  • 17. Spark Architecture at Facebook. Diagram labels: Compute Cluster; Storage Cluster; Executor #1; Executor #2; Distributed FS instance #1, #2, #3; Spill, Cache, Shuffle; Tangram Scheduler; Heterogeneous Hardware (purchased over 0-5 years). Related talks: Brian Cho and Dmitry Borovsky, "Cosco: An Efficient Facebook-Scale Shuffle Service", today at 4:30PM (Developer Track); Rui Jian and Hao Lin, "Tangram: Distributed Scheduling for Spark at Facebook", tomorrow at 11:50AM (Developer Track)
  • 18. Agenda (repeated). Contributed 100+ patches upstream
  • 20. History-Based Tuning: Motivation. Chart: cluster memory utilization over 1 week; max (80-100%), p95 (55-70%), p50 (10-60%). One-size-fits-all configs result in under-utilization of resources.
  • 22. History-Based Tuning: Motivation. Chart: percentage of Spark tasks (CDF) against peak execution memory in bytes. 75% of Spark tasks use less than 600 MB of peak execution memory. The resource requirements of individual Spark tasks have a huge variance.
  • 23. History-Based Tuning: 1. Need to tune Spark on a per-job or per-stage basis. 2. Leverage historical characteristics of the job to tune resources: (a) peak executor memory and spill sizes, to tune executor off-heap memory; (b) shuffle size, to optionally not insert partial aggregates in the query plan; (c) predicting the number of shuffle partitions (job level and stage level).
  • 24. History-Based Tuning: New Query, Query Plan Template:
      InsertIntoHiveTable [partitions: ds,country]
      +- *Project [cast(key as int) AS key, value]
         +- *HiveTableScan (db.test) [col: key,value] [part: ds]
  • 25. History-Based Tuning. Flow: New Query, Query Plan Template, Historical Job Runs, Config Override Rules. No regressions/failures in the past N days: Apply Config Overrides. Regressions/failures in the past N days: Apply Conservative Defaults.
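
The decision flow on slides 23-25 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation described in the talk: the `JobRun` record, the literal-masking in `plan_template`, the 20% memory headroom, the 128 MiB target partition size, and the default values are all hypothetical. `spark.sql.shuffle.partitions` and `spark.memory.offHeap.size` are real Spark configuration keys, used here only as example override targets.

```python
import math
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical conservative defaults; the values are illustrative only.
CONSERVATIVE_DEFAULTS: Dict[str, str] = {
    "spark.sql.shuffle.partitions": "2000",
    "spark.memory.offHeap.size": "4g",
}


@dataclass(frozen=True)
class JobRun:
    """One historical run of a query plan template (hypothetical record)."""
    days_ago: int
    succeeded: bool
    regressed: bool
    peak_executor_memory_mb: int
    shuffle_bytes: int


def plan_template(plan: str) -> str:
    """Mask literals so recurring runs of the same query share one key.

    Assumption: only quoted strings and numbers change between runs.
    """
    masked = re.sub(r"'[^']*'", "?", plan)
    masked = re.sub(r"\b\d+\b", "?", masked)
    return " ".join(masked.split())


def choose_config(history: List[JobRun], window_days: int = 7,
                  target_partition_bytes: int = 128 * 1024 * 1024) -> Dict[str, str]:
    """Apply overrides only if the last N days had no failures or regressions."""
    recent = [r for r in history if r.days_ago <= window_days]
    if not recent or any((not r.succeeded) or r.regressed for r in recent):
        return dict(CONSERVATIVE_DEFAULTS)
    peak_mb = max(r.peak_executor_memory_mb for r in recent)
    shuffle = max(r.shuffle_bytes for r in recent)
    partitions = max(1, math.ceil(shuffle / target_partition_bytes))
    return {
        "spark.sql.shuffle.partitions": str(partitions),
        # 20% headroom over the observed peak (hypothetical rule).
        "spark.memory.offHeap.size": f"{math.ceil(peak_mb * 120 / 100)}m",
    }


def tune(plan: str, history_store: Dict[str, List[JobRun]]) -> Dict[str, str]:
    """Look up history by plan template and pick the config for a new query."""
    return choose_config(history_store.get(plan_template(plan), []))
```

Keying the history on a masked plan rather than on the raw query text lets daily runs that differ only in a partition value such as `ds` share one history.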
  • 26. Joins in Spark: 1. Broadcast Join: broadcast the small table to all nodes and stream the larger table; skew resistant. 2. Shuffle-Hash Join: shuffle both tables, build a hash map from the smaller table, and stream the larger table. 3. Sort-Merge Join: shuffle and sort both tables, buffer one side, and stream the other side.
  • 27. Sort-Merge-Bucket (SMB) Join (SPARK-19256): 1. Bucketing is a way to shuffle (and optionally sort) output data based on certain columns of a table. 2. Ideal for write-once, read-many datasets. 3. A variant of Sort-Merge Join in Spark; it overrides outputPartitioning and outputOrdering for HiveTableScanExec and stitches partitioning/ordering metadata throughout the query plan.
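
The idea behind an SMB join can be illustrated with a small in-memory sketch: if both tables were written pre-bucketed and pre-sorted on the join key, the read side can merge matching buckets directly, with no shuffle or sort at query time. The sketch assumes integer keys, bucket assignment by `key % num_buckets`, and equal bucket counts on both sides; Spark uses its own hash function and operates on files, so this shows only the shape of the technique.

```python
from itertools import groupby
from operator import itemgetter
from typing import List, Tuple

Row = Tuple[int, str]  # (join key, payload)


def bucketize(rows: List[Row], num_buckets: int) -> List[List[Row]]:
    """Write-time step: assign rows to buckets by key and sort each bucket."""
    buckets: List[List[Row]] = [[] for _ in range(num_buckets)]
    for key, value in rows:
        buckets[key % num_buckets].append((key, value))
    for bucket in buckets:
        bucket.sort(key=itemgetter(0))
    return buckets


def _grouped(rows: List[Row]) -> List[Tuple[int, List[str]]]:
    """Collapse a key-sorted bucket into (key, [payloads]) runs."""
    return [(k, [v for _, v in g]) for k, g in groupby(rows, key=itemgetter(0))]


def merge_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    """Inner merge join of two buckets that are already sorted by key."""
    lg, rg = _grouped(left), _grouped(right)
    out: List[Tuple[int, str, str]] = []
    i = j = 0
    while i < len(lg) and j < len(rg):
        (lk, lvals), (rk, rvals) = lg[i], rg[j]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            out.extend((lk, lv, rv) for lv in lvals for rv in rvals)
            i += 1
            j += 1
    return out


def smb_join(left_buckets: List[List[Row]],
             right_buckets: List[List[Row]]) -> List[Tuple[int, str, str]]:
    """Read-time step: join bucket i with bucket i; no shuffle, no sort."""
    if len(left_buckets) != len(right_buckets):
        raise ValueError("this sketch requires equal bucket counts")
    out: List[Tuple[int, str, str]] = []
    for lb, rb in zip(left_buckets, right_buckets):
        out.extend(merge_join(lb, rb))
    return out
```

The cost of hashing and sorting is paid once in `bucketize`, which is why the slide calls the approach ideal for write-once, read-many datasets.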
  • 28. Dynamic Join (SPARK-21505): A hybrid join algorithm in which each task starts off by executing a shuffle-hash join. If, during execution, the hash table exceeds a certain size (and would OOM), the task automatically reconstructs/sorts the iterators and falls back to a sort-merge join.
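
A single-task version of this fallback can be sketched as below. It is an illustration only: a row-count limit (`max_build_rows`, a hypothetical parameter) stands in for the hash table's memory size, and plain Python lists stand in for the task's iterators.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[int, str]             # (join key, payload)
Joined = Tuple[int, str, str]     # (key, stream payload, build payload)


def sort_merge_join(left: List[Row], right: List[Row]) -> List[Joined]:
    """Inner join by sorting both sides and merging runs of equal keys."""
    left = sorted(left, key=lambda r: r[0])
    right = sorted(right, key=lambda r: r[0])
    out: List[Joined] = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side, emit the cross product.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out


def dynamic_join(build: Iterable[Row], stream: Iterable[Row],
                 max_build_rows: int) -> Tuple[str, List[Joined]]:
    """Start as a hash join; fall back to sort-merge if the build side grows too big.

    Returns the strategy that was actually used together with the joined rows.
    """
    table: Dict[int, List[str]] = {}
    consumed: List[Row] = []
    build_iter = iter(build)
    for row in build_iter:
        consumed.append(row)
        if len(consumed) > max_build_rows:
            # Hash table over budget: rebuild the full build side from what was
            # already consumed plus the rest, then sort-merge instead.
            full_build = consumed + list(build_iter)
            return "sort-merge", sort_merge_join(list(stream), full_build)
        table.setdefault(row[0], []).append(row[1])
    out = [(k, sv, bv) for k, sv in stream for bv in table.get(k, [])]
    return "shuffle-hash", out
```

Both paths return the same set of rows; only the strategy and the output order differ.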
  • 29. Skew Join: A hybrid join algorithm that processes skewed keys via a broadcast join and non-skewed keys via a shuffle-hash or sort-merge join. Example:
      SELECT /*+ SKEWED_ON(a.userid='10001') */ a.userid
      FROM table_A a INNER JOIN table_B b ON a.userid = b.userid
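
The split between skewed and non-skewed keys can be sketched in plain Python. This is a single-process illustration under assumptions: the set of skewed keys is given up front (as the `SKEWED_ON` hint on the slide does), a simple hash join stands in for both the broadcast path and the per-partition path, and `num_partitions` is a hypothetical parameter.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Row = Tuple[str, str]             # (userid, payload)
Joined = Tuple[str, str, str]     # (userid, payload from A, payload from B)


def hash_join(stream: List[Row], build: List[Row]) -> List[Joined]:
    """Inner join: build a hash map from one side, stream the other."""
    table: Dict[str, List[str]] = defaultdict(list)
    for key, value in build:
        table[key].append(value)
    return [(k, sv, bv) for k, sv in stream for bv in table.get(k, ())]


def skew_join(a: List[Row], b: List[Row], skewed_keys: Set[str],
              num_partitions: int = 4) -> List[Joined]:
    # Skewed keys: the matching slice of B is small, so it is "broadcast" and
    # A's skewed rows are joined where they are, without being shuffled into
    # one overloaded partition.
    a_skewed = [r for r in a if r[0] in skewed_keys]
    b_skewed = [r for r in b if r[0] in skewed_keys]
    result = hash_join(a_skewed, b_skewed)

    # Non-skewed keys: hash-partition both sides and join partition by partition.
    # Python's str hash is stable within one process, which is all this needs.
    a_parts: Dict[int, List[Row]] = defaultdict(list)
    b_parts: Dict[int, List[Row]] = defaultdict(list)
    for row in a:
        if row[0] not in skewed_keys:
            a_parts[hash(row[0]) % num_partitions].append(row)
    for row in b:
        if row[0] not in skewed_keys:
            b_parts[hash(row[0]) % num_partitions].append(row)
    for p in range(num_partitions):
        result.extend(hash_join(a_parts[p], b_parts[p]))
    return result
```

The union of the two partial results equals an ordinary inner join of `a` and `b`; the benefit in a distributed setting is that no single shuffle partition has to absorb all rows of the hot key.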
  • 31. Who uses Spark? Data Scientists (10%), Data Engineers (15%), Software Engineers (60%), Others (15%)
  • 32. Automatic Error Classification aka Failure Attribution: Error Classification (System vs. User; Retriability; Root Cause). Showing actionable error messages.
  • 33. How is Spark used? Diversity of Workload. Chart values, listed as share of query count and share of CPU: Pure SQL 54% and 72%; UDF & Transforms 45% and 20%; DataFrames 1% and 8%.
  • 34. Data Driven Decisions, Standardized Testing: Change X -> Standardized Tests -> Log Metrics -> Evaluate Results
  • 35. Data Driven Decisions, Shadow Testing: Change X -> Create a tag -> Shadow Testing -> Log Metrics -> Evaluate Results. Tag-based selection.
  • 36. Where do we use it? New Features; Regular Releases; Configuration Updates; Hardware Testing
  • 37. Workload Prioritization. Diagram labels: Spark Cluster; Team 1; Team 2; queues: Backfill, Pipelines, Fastlane, Interactive; scheduling policies shown: FIFO, User Fair Share, DRF; shares: 60% and 40%.
  • 38. Defensive Deployment, Guardrails for us (and users): Hard limits on config values; Capacity Quotas (Storage and Compute); Strict resource limits (containerization)
  • 39. Resource Limits Cgroup v2 Spark Executor /cgroup2/task_container/exec1
  • 40. Memory Oversubscription, Finding the balance: /cgroup2/task_container/ (40 GB) with exec1, exec2, exec3, exec4; memory.max = 12 GB each
  • 41. A tale of two resources. Chart: resource utilization % for CPU and Memory, Mar 2nd to Mar 10th 2019
  • 42. A tale of THREE resources (Or my love-hate relationship with cgroups). Chart: resource utilization % for CPU, Memory, and Disk IO, Mar 2nd to Mar 10th 2019
  • 44. Memory Oversubscription, The full picture: /cgroup2/task_container/ (40 GB) with exec1, exec2, exec3, exec4; memory.max = 12 GB each; memory.high = 10 GB each
  • 45. Memory.what? memory.high is the memory usage throttle limit. This is the main mechanism to control a cgroup’s memory use. If a cgroup's memory use goes over the high boundary specified here, the cgroup’s processes are throttled and put under heavy reclaim pressure. The default is max, meaning there is no limit. memory.max is the memory usage hard limit, acting as the final protection mechanism: If a cgroup's memory usage reaches this limit and can't be reduced, the system OOM killer is invoked on the cgroup.
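
The arithmetic behind the oversubscription slides, and the two thresholds defined above, can be sketched as follows. This is a simplified model, not kernel behavior: it reads the diagram as a 40 GB container holding four executors (an interpretation of the slide), ignores reclaim, and the function and class names are hypothetical. `memory.high` and `memory.max` are the real cgroup v2 interface files.

```python
from dataclasses import dataclass
from typing import List

GB = 1024 ** 3


@dataclass(frozen=True)
class CgroupLimits:
    """Per-executor limits, mirroring the cgroup v2 files of the same names."""
    memory_high: int  # throttle limit, in bytes
    memory_max: int   # hard limit, in bytes


def pressure_state(usage: int, limits: CgroupLimits) -> str:
    """Classify usage against the two limits (simplified: reclaim is ignored)."""
    if usage >= limits.memory_max:
        return "oom"        # hard limit reached; the OOM killer may be invoked
    if usage > limits.memory_high:
        return "throttled"  # over the high boundary; heavy reclaim pressure
    return "ok"


def oversubscription_ratio(parent_bytes: int, child_max_bytes: List[int]) -> float:
    """How much memory the children may claim relative to what the parent has."""
    return sum(child_max_bytes) / parent_bytes


# Values from the slides: a 40 GB container and four executors at 12 GB each.
executor = CgroupLimits(memory_high=10 * GB, memory_max=12 * GB)
ratio = oversubscription_ratio(40 * GB, [executor.memory_max] * 4)
```

With these numbers the executors may claim 48 GB of a 40 GB container in total, which is the oversubscription the slides refer to; `memory.high` is the knob that decides how early each executor starts being throttled before it reaches `memory.max`.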
  • 47. Memory Oversubscription, The full picture: /cgroup2/task_container/ (40 GB) with exec1, exec2, exec3, exec4; memory.max = 12 GB each; memory.high = 4 GB each
  • 49. So… What happened? Our cgroup configuration was wrong; History Based scheduling
  • 50. Takeaways: Cgroups configuration can be tricky; Find the right balance between efficiency and reliability; Bonus: Better resource control on IO
  • 52. The Road Ahead: Scaling Spark 10X; Redefining “Warehouse”; Beyond SQL