SlideShare une entreprise Scribd logo
1  sur  38
Télécharger pour lire hors ligne
Edwina Lu and Ye Zhou,
Metrics-Driven Tuning of
Apache Spark at Scale
#Exp7SAIS
Hadoop Infra @ LinkedIn
• 10+ clusters
• 10,000+ nodes
• 1000+ users
#Exp7SAIS 2
Number of daily Spark apps for one cluster: close to
3K, a 2.4x increase in last 3 quarters
Spark applications consume 25% of resources,
average daily Spark resource consumption: 1.6 PBHr
#Exp7SAIS 3
Spark @ LinkedIn
0
500
1000
1500
2000
2500
3000
Septem
ber
O
ctoberN
ovem
berD
ecem
ber
January
Febuary
M
arch
April
M
ay
Number of Applications per Day
0.00
1.00
2.00
3.00
4.00
5.00
6.00
7.00
8.00
N
ovem
ber
D
ecem
ber
January
Feburary
M
arch
April
M
ay
Average Daily Resource Usage
Spark Non-Spark
What We Discovered About Spark Usage
Only ~34% of allocated memory was
actually used.
Example application:
§ 200 executors
§ spark.driver.memory: 16GB
§ spark.executor.memory: 16GB
§ Max executor JVM used memory: 6.6GB
§ Max driver JVM used memory: 5.4GB
§ Total wasted memory: 1.8TB
§ Time: 1h
#Exp7SAIS 4
34%
61%
5%
Executor Memory
Peak Used JVM
Memory
Unused Executor
Memory
Reserved Memory
Memory Tuning: Motivation
• Memory and CPUs cost money
• These are limited resources, so must be used efficiently
• With 34% of allocated memory used, if memory usage is more efficient,
we can run 2-3 times as many Spark applications on the same
hardware
#Exp7SAIS 5
Memory Tuning: What and How to Tune?
• Spark tuning can be
complicated, with many
metrics and
configuration
parameters
• Many users have
limited knowledge
about how to tune
Spark applications
#Exp7SAIS 6
Memory Tuning: Scaling
• Data scientist and engineer time cost even more money
• Analyzing applications and giving tuning advice in person does not
scale for the Spark team or users who must wait for help
• Infrastructure efficiency vs. developer productivity
– Do we have to choose between these two?
#Exp7SAIS 7
Dr. Elephant
• Performance monitoring and tuning service
• Identify badly tuned applications and causes
• Provide actionable advice for fixing issues
• Compare performance changes over time
#Exp7SAIS 8
Dr. Elephant: How does it Work?
#Exp7SAIS 9
Metrics
Fetcher
History
Server
Application
Fetcher
Resource
Manager
Run
Rule 1
Run
Rule 2
Run
Rule 3
Database
Dr. Elephant UI
Challenges for Dr. Elephant to Support Spark
• Spark tuning heuristics
– What are the necessary metrics to enable effective tuning?
• Fetch Spark history
– Spark components are not equally scalable
#Exp7SAIS 10
Spark Memory Overview
#Exp7SAIS 11
Executor Memory
spark.executor.memory
Overhead (off-heap memory)
spark.yarn.executor.memoryOverhead
max(executorMemory * 0.1, 384MB)
Execution Memory Storage Memory
spark.memory.storageFraction
Reserved Memory
300 MB
User Memory
1 – spark.memory.fraction = 0.4
Executor Container
UNIFIED MEMORY
spark.memory.fraction = 0.6
JVMUSEDMEMORY
EXECUTORMEMORY
Executor JVM Used Memory Heuristic
Spark
Executor
Memory
Peak JVM
Used Memory
Reserved
Memory
16GB
275.9
MB
300
MB
WastedMemory
Executor JVM Used Memory
Severity: Severe
The configured executor memory is much higher than the
maximum amount of JVM used by executors. Please set
spark.executor.memory to a lower value.
spark.executor.memory: 16 GB
Max executor peak JVM used memory: 6.6 GB
Suggested spark.executor.memory: 7 GB
#Exp7SAIS 12
Executor Unified Memory Heuristic
Unified
Memory
Peak
Unified
Memory
8.36
GB
474.42KBWastedMemory
Executor Peak Unified Memory
Severity: Critical
The allocated unified memory is much higher than the maximum
amount of unified memory used by executors. Please lower
spark.memory.fraction.
spark.executor.memory: 10 GB
spark.memory.fraction: 0.6
Allocated unified memory: 6 GB
Max peak JVM used memory: 7.2 GB
Max peak unified memory: 1.2 GB
Suggested spark.memory.fraction: 0.2
#Exp7SAIS 13
Execution Memory Spill Heuristic
Disk
Executor
Memory Unified
Memory
Execution Memory Spill
Severity: Severe
Execution memory spill has been detected in stage 3. Shuffle read
bytes and spill are evenly distributed.There are 200 tasks for this stage.
Please increase spark.sql.shuffle.partitions, or modify the code to use
more partitions, or reduce the number of executor cores.
spark.executor.memory 10 GB
spark.executor.cores 3
spark.executor.instances 300
Stage 3:
Median shuffle read bytes: 954 MB
Max shuffle read bytes: 955 MB
Median shuffle write bytes: 359 MB
Max shuffle write bytes: 388 MB
Median memoryBytesSpilled: 1.2 GB
Max memoryBytesSpilled: 1.2 GB
Num tasks: 200
#Exp7SAIS 14
Executor GC Heuristic
13 Seconds
2 Minutes
Executor Runtime
GCTime
Executor GC
Severity: Moderate
Executors are spending too much time in GC. Please increase
spark.executor.memory.
Spark.executor.memory: 4 GB
GC time to executor run time ratio: 0.164
Total executor run time: 1 Hour 15 Minutes
Total GC time: 12 Minutes
#Exp7SAIS 15
Automating Spark Tuning with Dr. Elephant @ LinkedIn
Well Tuned? Ship It!
Production
Tune It!
Yes
No
Development
#Exp7SAIS 16
Architecture
Executor
Task Task
Cache Driver
Task Scheduler
Listener Bus
Executor
Task Task
Cache
Executor
Task Task
Cache
HDFS
Spark
History Logs
Spark
History
Server
DAG Scheduler
EventLogging
Listener
AppState
Listener
Task
Heartbeats
Task
Task
Heartbeats
Heartbeats REST
API
Web
UI
#Exp7SAIS 17
Upstream Ticket
SPARK-23206: Additional Memory Tuning Metrics
• New executor level memory metrics:
– JVM used memory
– Execution memory
– Storage memory
– Unified memory
• Metrics sent from executors to driver via Heartbeat
• Peak values for executor metrics logged at stage end
• Metrics exposed via web UI and REST API
#Exp7SAIS 18
Overview of our Solution
Scalable
application
metrics provider
Spark History
Server (SHS)
Enhancements
on SHS
Benefits brought by
enhanced SHS
Scalable
application history
provider
Dr Elephant
Performance
analysis at scale
Debug
Easy investigation
of past applications
#Exp7SAIS 19
Spark History Server (SHS) at LinkedIn
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
Log Parsing
Web UI Rest APIs
#Exp7SAIS 20
How does SHS work?
Apps DBsListing DB
Queued
Thread Pool
Update
Jetty Handlers
Thread Pool
Createhttp://www.yoursite.com
http://www.yoursite.com
http://www.yoursite.com
SHS
SPARK-18085
#Exp7SAIS 21
Not Happy
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
Log Parsing
Web UI
Rest APIs
#Exp7SAIS 22
SHS Issues
• Missing applications
– Users cannot find their applications on the home page
• Extended loading time
– Application details page take a very long time (up to 0.5 hour) to load
• Handling large history files
– SHS gets completely stalled
• Handling high-volume concurrent requests
– SHS doesn’t return expected JSON response
23#Exp7SAIS
Missing Applications
1
2
3
4Submit Job
Start running
Job Failed
Check it out on
SHS
#Exp7SAIS 24
5
6
7
8Wait SHS to catch
up
Finally it shows
up
Check out the
details
Keep loading…
No response
Extended Loading Time
#Exp7SAIS 25
Extended Listing Delay
Listing DB
History Files
Update
1. Replay same file multiple times
2. Limited threads for the replay
3. Processing time proportional to file size
#Exp7SAIS 26
How to Decrease the Listing Delay
Listing DB
Read from
extended
attributes
Spark
Driver
Write log file content
Write log file extended
attributes key/value
Read from log content
when fail to read from
extended attributes
1
2
NameNode
• Use HDFS Extended Attributes
#Exp7SAIS 27
Extended Loading Delay
Apps DBs
Request
Response
SHS
Replaying all the events takes a long time for large log file
Replay
#Exp7SAIS 28
How to Decrease the Loading Delay
• DB creation time is unavoidable
• Start DB creation prior to User’s request for every application log file
Apps DBs
Request
SHS
Replay
Request
Response
#Exp7SAIS 29
Results of Improvement
• SHS can get the completed/running application information into home
page within 1 minute.
• Start to create DBs in 5 minutes for 90% applications right after they finish
#Exp7SAIS 30
Scalability Issues
• Increasing number of
Spark applications
• Increasing Spark users
#Exp7SAIS 31
Severe Garbage Collection (GC)
Full GC Full GC Full GCFull GC
#Exp7SAIS 32
What Caused GC?
• Unnecessary events used too
much memory while replaying
• SHS got completely stalled
• SHS needs to ignore those
unnecessary events
#Exp7SAIS 33
23GB
High-Volume Concurrent Requests
• When REST call frequency
goes beyond certain threshold,
SHS is likely to return non-
JSON response to users
• Home page shows empty list
#Exp7SAIS 34
Upstream Tickets
SPARK-23607: Use HDFS extended attributes to store application
summary
SPARK-21961: Filter out BlockStatusUpdates in History Server
when analyzing logs
SPARK-23608: Synchronize when attaching and detaching
SparkUI in History Server
#Exp7SAIS 35
Results
• User can always find their applications on
SHS home page within 1 minute
• For 90% of applications DBs, SHS will start
creating them within 5 minutes after they
complete
• Stable and reliable service
• Handle high-volume concurrent requests
#Exp7SAIS 36
Future Work
• More memory metrics:
– Netty memory
– Total memory
• More Tuning:
– Skew in assignment of tasks to executors
– Size/time skew in tasks for a stage
– DAG analysis
• Incremental Replay for History Logs
• Horizontal Scalable History Server
#Exp7SAIS 37
Q&A
#Exp7SAIS 38

Contenu connexe

Tendances

Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Databricks
 
Apache Spark Performance is too hard. Let's make it easier
Apache Spark Performance is too hard. Let's make it easierApache Spark Performance is too hard. Let's make it easier
Apache Spark Performance is too hard. Let's make it easierDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Databricks
 
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan ZhangExperiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan ZhangDatabricks
 
Building Operational Data Lake using Spark and SequoiaDB with Yang Peng
Building Operational Data Lake using Spark and SequoiaDB with Yang PengBuilding Operational Data Lake using Spark and SequoiaDB with Yang Peng
Building Operational Data Lake using Spark and SequoiaDB with Yang PengDatabricks
 
Spark Summit EU talk by Kaarthik Sivashanmugam
Spark Summit EU talk by Kaarthik SivashanmugamSpark Summit EU talk by Kaarthik Sivashanmugam
Spark Summit EU talk by Kaarthik SivashanmugamSpark Summit
 
Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi
 Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi
Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng ShiDatabricks
 
Building a Unified Data Pipeline with Apache Spark and XGBoost with Nan Zhu
Building a Unified Data Pipeline with Apache Spark and XGBoost with Nan ZhuBuilding a Unified Data Pipeline with Apache Spark and XGBoost with Nan Zhu
Building a Unified Data Pipeline with Apache Spark and XGBoost with Nan ZhuDatabricks
 
Apache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudApache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudDatabricks
 
Sputnik: Airbnb’s Apache Spark Framework for Data Engineering
Sputnik: Airbnb’s Apache Spark Framework for Data EngineeringSputnik: Airbnb’s Apache Spark Framework for Data Engineering
Sputnik: Airbnb’s Apache Spark Framework for Data EngineeringDatabricks
 
SSR: Structured Streaming for R and Machine Learning
SSR: Structured Streaming for R and Machine LearningSSR: Structured Streaming for R and Machine Learning
SSR: Structured Streaming for R and Machine Learningfelixcss
 
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Databricks
 
Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan BlueDatabricks
 
Overview of Apache Spark 2.3: What’s New? with Sameer Agarwal
 Overview of Apache Spark 2.3: What’s New? with Sameer Agarwal Overview of Apache Spark 2.3: What’s New? with Sameer Agarwal
Overview of Apache Spark 2.3: What’s New? with Sameer AgarwalDatabricks
 
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...Databricks
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Databricks
 
Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Databricks
 
Performant Streaming in Production: Preventing Common Pitfalls when Productio...
Performant Streaming in Production: Preventing Common Pitfalls when Productio...Performant Streaming in Production: Preventing Common Pitfalls when Productio...
Performant Streaming in Production: Preventing Common Pitfalls when Productio...Databricks
 
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...Databricks
 

Tendances (20)

Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
Interoperating a Zoo of Data Processing Platforms Using with Rheem Sebastian ...
 
Apache Spark Performance is too hard. Let's make it easier
Apache Spark Performance is too hard. Let's make it easierApache Spark Performance is too hard. Let's make it easier
Apache Spark Performance is too hard. Let's make it easier
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
 
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan ZhangExperiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
Experiences Migrating Hive Workload to SparkSQL with Jie Xiong and Zhan Zhang
 
Building Operational Data Lake using Spark and SequoiaDB with Yang Peng
Building Operational Data Lake using Spark and SequoiaDB with Yang PengBuilding Operational Data Lake using Spark and SequoiaDB with Yang Peng
Building Operational Data Lake using Spark and SequoiaDB with Yang Peng
 
Spark Summit EU talk by Kaarthik Sivashanmugam
Spark Summit EU talk by Kaarthik SivashanmugamSpark Summit EU talk by Kaarthik Sivashanmugam
Spark Summit EU talk by Kaarthik Sivashanmugam
 
Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi
 Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi
Apache Kylin: Speed Up Cubing with Apache Spark with Luke Han and Shaofeng Shi
 
Building a Unified Data Pipeline with Apache Spark and XGBoost with Nan Zhu
Building a Unified Data Pipeline with Apache Spark and XGBoost with Nan ZhuBuilding a Unified Data Pipeline with Apache Spark and XGBoost with Nan Zhu
Building a Unified Data Pipeline with Apache Spark and XGBoost with Nan Zhu
 
Apache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudApache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the Cloud
 
Sputnik: Airbnb’s Apache Spark Framework for Data Engineering
Sputnik: Airbnb’s Apache Spark Framework for Data EngineeringSputnik: Airbnb’s Apache Spark Framework for Data Engineering
Sputnik: Airbnb’s Apache Spark Framework for Data Engineering
 
SSR: Structured Streaming for R and Machine Learning
SSR: Structured Streaming for R and Machine LearningSSR: Structured Streaming for R and Machine Learning
SSR: Structured Streaming for R and Machine Learning
 
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
 
Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan Blue
 
Overview of Apache Spark 2.3: What’s New? with Sameer Agarwal
 Overview of Apache Spark 2.3: What’s New? with Sameer Agarwal Overview of Apache Spark 2.3: What’s New? with Sameer Agarwal
Overview of Apache Spark 2.3: What’s New? with Sameer Agarwal
 
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...
 
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
Dr. Elephant for Monitoring and Tuning Apache Spark Jobs on Hadoop with Carl ...
 
Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks
 
Performant Streaming in Production: Preventing Common Pitfalls when Productio...
Performant Streaming in Production: Preventing Common Pitfalls when Productio...Performant Streaming in Production: Preventing Common Pitfalls when Productio...
Performant Streaming in Production: Preventing Common Pitfalls when Productio...
 
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun with Ru...
 

Similaire à Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou

Metrics-driven tuning of Apache Spark at scale
Metrics-driven tuning of Apache Spark at scaleMetrics-driven tuning of Apache Spark at scale
Metrics-driven tuning of Apache Spark at scaleDataWorks Summit
 
Conquering Hadoop and Apache Spark with Operational Intelligence with Akshay Rai
Conquering Hadoop and Apache Spark with Operational Intelligence with Akshay RaiConquering Hadoop and Apache Spark with Operational Intelligence with Akshay Rai
Conquering Hadoop and Apache Spark with Operational Intelligence with Akshay RaiDatabricks
 
Migrating Complex Data Aggregation from Hadoop to Spark-(Ashish Singh andPune...
Migrating Complex Data Aggregation from Hadoop to Spark-(Ashish Singh andPune...Migrating Complex Data Aggregation from Hadoop to Spark-(Ashish Singh andPune...
Migrating Complex Data Aggregation from Hadoop to Spark-(Ashish Singh andPune...Spark Summit
 
Spark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit
 
Dr. Elephant – Achieving Quicker, Easier, and Cost-Effective Big Data Analyti...
Dr. Elephant – Achieving Quicker, Easier, and Cost-Effective Big Data Analyti...Dr. Elephant – Achieving Quicker, Easier, and Cost-Effective Big Data Analyti...
Dr. Elephant – Achieving Quicker, Easier, and Cost-Effective Big Data Analyti...Akshay Rai
 
Scaling Apache Spark at Facebook
Scaling Apache Spark at FacebookScaling Apache Spark at Facebook
Scaling Apache Spark at FacebookDatabricks
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaDatabricks
 
SenchaCon 2016: Turbocharge your Ext JS App - Per Minborg, Anselm McClain, Jo...
SenchaCon 2016: Turbocharge your Ext JS App - Per Minborg, Anselm McClain, Jo...SenchaCon 2016: Turbocharge your Ext JS App - Per Minborg, Anselm McClain, Jo...
SenchaCon 2016: Turbocharge your Ext JS App - Per Minborg, Anselm McClain, Jo...Sencha
 
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Databricks
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesRose Toomey
 
Leveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesLeveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesRose Toomey
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsDatabricks
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Landon Robinson
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity PlanningMongoDB
 
Monitor Apache Spark 3 on Kubernetes using Metrics and Plugins
Monitor Apache Spark 3 on Kubernetes using Metrics and PluginsMonitor Apache Spark 3 on Kubernetes using Metrics and Plugins
Monitor Apache Spark 3 on Kubernetes using Metrics and PluginsDatabricks
 
Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiApache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiDatabricks
 
Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Wee Hyong Tok
 
Spark on Mesos
Spark on MesosSpark on Mesos
Spark on MesosJen Aman
 

Similaire à Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou (20)

Metrics-driven tuning of Apache Spark at scale
Metrics-driven tuning of Apache Spark at scaleMetrics-driven tuning of Apache Spark at scale
Metrics-driven tuning of Apache Spark at scale
 
Conquering Hadoop and Apache Spark with Operational Intelligence with Akshay Rai
Conquering Hadoop and Apache Spark with Operational Intelligence with Akshay RaiConquering Hadoop and Apache Spark with Operational Intelligence with Akshay Rai
Conquering Hadoop and Apache Spark with Operational Intelligence with Akshay Rai
 
Migrating Complex Data Aggregation from Hadoop to Spark-(Ashish Singh andPune...
Migrating Complex Data Aggregation from Hadoop to Spark-(Ashish Singh andPune...Migrating Complex Data Aggregation from Hadoop to Spark-(Ashish Singh andPune...
Migrating Complex Data Aggregation from Hadoop to Spark-(Ashish Singh andPune...
 
Spark Uber Development Kit
Spark Uber Development KitSpark Uber Development Kit
Spark Uber Development Kit
 
Spark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca Canali
 
Dr. Elephant – Achieving Quicker, Easier, and Cost-Effective Big Data Analyti...
Dr. Elephant – Achieving Quicker, Easier, and Cost-Effective Big Data Analyti...Dr. Elephant – Achieving Quicker, Easier, and Cost-Effective Big Data Analyti...
Dr. Elephant – Achieving Quicker, Easier, and Cost-Effective Big Data Analyti...
 
Scaling Apache Spark at Facebook
Scaling Apache Spark at FacebookScaling Apache Spark at Facebook
Scaling Apache Spark at Facebook
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
 
SenchaCon 2016: Turbocharge your Ext JS App - Per Minborg, Anselm McClain, Jo...
SenchaCon 2016: Turbocharge your Ext JS App - Per Minborg, Anselm McClain, Jo...SenchaCon 2016: Turbocharge your Ext JS App - Per Minborg, Anselm McClain, Jo...
SenchaCon 2016: Turbocharge your Ext JS App - Per Minborg, Anselm McClain, Jo...
 
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark Pipelines
 
Leveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesLeveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelines
 
Headaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous ApplicationsHeadaches and Breakthroughs in Building Continuous Applications
Headaches and Breakthroughs in Building Continuous Applications
 
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
Spark + AI Summit 2019: Headaches and Breakthroughs in Building Continuous Ap...
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
 
Monitor Apache Spark 3 on Kubernetes using Metrics and Plugins
Monitor Apache Spark 3 on Kubernetes using Metrics and PluginsMonitor Apache Spark 3 on Kubernetes using Metrics and Plugins
Monitor Apache Spark 3 on Kubernetes using Metrics and Plugins
 
Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiApache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
 
Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425Spark summit 2019 infrastructure for deep learning in apache spark 0425
Spark summit 2019 infrastructure for deep learning in apache spark 0425
 
Spark on Mesos
Spark on MesosSpark on Mesos
Spark on Mesos
 
Typesafe spark- Zalando meetup
Typesafe spark- Zalando meetupTypesafe spark- Zalando meetup
Typesafe spark- Zalando meetup
 

Plus de Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionDatabricks
 

Plus de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 
Machine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack DetectionMachine Learning CI/CD for Email Attack Detection
Machine Learning CI/CD for Email Attack Detection
 

Dernier

Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制vexqp
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样wsppdmt
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样wsppdmt
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxVivek487417
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制vexqp
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制vexqp
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...Health
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxchadhar227
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制vexqp
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...nirzagarg
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1ranjankumarbehera14
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...gajnagarg
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 

Dernier (20)

Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 

Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou

  • 1. Edwina Lu and Ye Zhou, Metrics-Driven Tuning of Apache Spark at Scale #Exp7SAIS
  • 2. Hadoop Infra @ LinkedIn • 10+ clusters • 10,000+ nodes • 1000+ users #Exp7SAIS 2
  • 3. Number of daily Spark apps for one cluster: close to 3K, a 2.4x increase in last 3 quarters Spark applications consume 25% of resources, average daily Spark resource consumption: 1.6 PBHr #Exp7SAIS 3 Spark @ LinkedIn 0 500 1000 1500 2000 2500 3000 Septem ber O ctoberN ovem berD ecem ber January Febuary M arch April M ay Number of Applications per Day 0.00 1.00 2.00 3.00 4.00 5.00 6.00 7.00 8.00 N ovem ber D ecem ber January Feburary M arch April M ay Average Daily Resource Usage Spark Non-Spark
  • 4. What We Discovered About Spark Usage Only ~34% of allocated memory was actually used. Example application: § 200 executors § spark.driver.memory: 16GB § spark.executor.memory: 16GB § Max executor JVM used memory: 6.6GB § Max driver JVM used memory: 5.4GB § Total wasted memory: 1.8TB § Time: 1h #Exp7SAIS 4 34% 61% 5% Executor Memory Peak Used JVM Memory Unused Executor Memory Reserved Memory
  • 5. Memory Tuning: Motivation • Memory and CPUs cost money • These are limited resources, so must be used efficiently • With 34% of allocated memory used, if memory usage is more efficient, we can run 2-3 times as many Spark applications on the same hardware #Exp7SAIS 5
  • 6. Memory Tuning: What and How to Tune? • Spark tuning can be complicated, with many metrics and configuration parameters • Many users have limited knowledge about how to tune Spark applications #Exp7SAIS 6
  • 7. Memory Tuning: Scaling • Data scientist and engineer time cost even more money • Analyzing applications and giving tuning advice in person does not scale for the Spark team or users who must wait for help • Infrastructure efficiency vs. developer productivity – Do we have to choose between these two? #Exp7SAIS 7
  • 8. Dr. Elephant • Performance monitoring and tuning service • Identify badly tuned applications and causes • Provide actionable advice for fixing issues • Compare performance changes over time #Exp7SAIS 8
  • 9. Dr. Elephant: How does it Work? #Exp7SAIS 9 Metrics Fetcher History Server Application Fetcher Resource Manager Run Rule 1 Run Rule 2 Run Rule 3 Database Dr. Elephant UI
  • 10. Challenges for Dr. Elephant to Support Spark • Spark tuning heuristics – What are the necessary metrics to enable effective tuning? • Fetch Spark history – Spark components are not equally scalable #Exp7SAIS 10
  • 11. Spark Memory Overview #Exp7SAIS 11 Executor Memory spark.executor.memory Overhead (off-heap memory) spark.yarn.executor.memoryOverhead max(executorMemory * 0.1, 384MB) Execution Memory Storage Memory spark.memory.storageFraction Reserved Memory 300 MB User Memory 1 – spark.memory.fraction = 0.4 Executor Container UNIFIED MEMORY spark.memory.fraction = 0.6 JVMUSEDMEMORY EXECUTORMEMORY
  • 12. Executor JVM Used Memory Heuristic Spark Executor Memory Peak JVM Used Memory Reserved Memory 16GB 275.9 MB 300 MB WastedMemory Executor JVM Used Memory Severity: Severe The configured executor memory is much higher than the maximum amount of JVM used by executors. Please set spark.executor.memory to a lower value. spark.executor.memory: 16 GB Max executor peak JVM used memory: 6.6 GB Suggested spark.executor.memory: 7 GB #Exp7SAIS 12
  • 13. Executor Unified Memory Heuristic Unified Memory Peak Unified Memory 8.36 GB 474.42KBWastedMemory Executor Peak Unified Memory Severity: Critical The allocated unified memory is much higher than the maximum amount of unified memory used by executors. Please lower spark.memory.fraction. spark.executor.memory: 10 GB spark.memory.fraction: 0.6 Allocated unified memory: 6 GB Max peak JVM used memory: 7.2 GB Max peak unified memory: 1.2 GB Suggested spark.memory.fraction: 0.2 #Exp7SAIS 13
  • 14. Execution Memory Spill Heuristic Disk Executor Memory Unified Memory Execution Memory Spill Severity: Severe Execution memory spill has been detected in stage 3. Shuffle read bytes and spill are evenly distributed.There are 200 tasks for this stage. Please increase spark.sql.shuffle.partitions, or modify the code to use more partitions, or reduce the number of executor cores. spark.executor.memory 10 GB spark.executor.cores 3 spark.executor.instances 300 Stage 3: Median shuffle read bytes: 954 MB Max shuffle read bytes: 955 MB Median shuffle write bytes: 359 MB Max shuffle write bytes: 388 MB Median memoryBytesSpilled: 1.2 GB Max memoryBytesSpilled: 1.2 GB Num tasks: 200 #Exp7SAIS 14
  • 15. Executor GC Heuristic 13 Seconds 2 Minutes Executor Runtime GCTime Executor GC Severity: Moderate Executors are spending too much time in GC. Please increase spark.executor.memory. Spark.executor.memory: 4 GB GC time to executor run time ratio: 0.164 Total executor run time: 1 Hour 15 Minutes Total GC time: 12 Minutes #Exp7SAIS 15
  • 16. Automating Spark Tuning with Dr. Elephant @ LinkedIn Well Tuned? Ship It! Production Tune It! Yes No Development #Exp7SAIS 16
  • 17. Architecture Executor Task Task Cache Driver Task Scheduler Listener Bus Executor Task Task Cache Executor Task Task Cache HDFS Spark History Logs Spark History Server DAG Scheduler EventLogging Listener AppState Listener Task Heartbeats Task Task Heartbeats Heartbeats REST API Web UI #Exp7SAIS 17
  • 18. Upstream Ticket SPARK-23206: Additional Memory Tuning Metrics • New executor level memory metrics: – JVM used memory – Execution memory – Storage memory – Unified memory • Metrics sent from executors to driver via Heartbeat • Peak values for executor metrics logged at stage end • Metrics exposed via web UI and REST API #Exp7SAIS 18
  • 19. Overview of our Solution Scalable application metrics provider Spark History Server (SHS) Enhancements on SHS Benefits brought by enhanced SHS Scalable application history provider Dr Elephant Performance analysis at scale Debug Easy investigation of past applications #Exp7SAIS 19
  • 20. Spark History Server (SHS) at LinkedIn >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Log Parsing Web UI Rest APIs #Exp7SAIS 20
  • 21. How does SHS work? Apps DBsListing DB Queued Thread Pool Update Jetty Handlers Thread Pool Createhttp://www.yoursite.com http://www.yoursite.com http://www.yoursite.com SHS SPARK-18085 #Exp7SAIS 21
  • 23. SHS Issues • Missing applications – Users cannot find their applications on the home page • Extended loading time – Application details page take a very long time (up to 0.5 hour) to load • Handling large history files – SHS gets completely stalled • Handling high-volume concurrent requests – SHS doesn’t return expected JSON response 23#Exp7SAIS
  • 24. Missing Applications 1 2 3 4Submit Job Start running Job Failed Check it out on SHS #Exp7SAIS 24
  • 25. 5 6 7 8Wait SHS to catch up Finally it shows up Check out the details Keep loading… No response Extended Loading Time #Exp7SAIS 25
  • 26. Extended Listing Delay Listing DB History Files Update 1. Replay same file multiple times 2. Limited threads for the replay 3. Processing time proportional to file size #Exp7SAIS 26
  • 27. How to Decrease the Listing Delay Listing DB Read from extended attributes Spark Driver Write log file content Write log file extended attributes key/value Read from log content when fail to read from extended attributes 1 2 NameNode • Use HDFS Extended Attributes #Exp7SAIS 27
  • 28. Extended Loading Delay Apps DBs Request Response SHS Replaying all the events takes a long time for large log file Replay #Exp7SAIS 28
  • 29. How to Decrease the Loading Delay • DB creation time is unavoidable • Start DB creation prior to User’s request for every application log file Apps DBs Request SHS Replay Request Response #Exp7SAIS 29
  • 30. Results of Improvement • SHS can get the completed/running application information into home page within 1 minute. • Start to create DBs in 5 minutes for 90% applications right after they finish #Exp7SAIS 30
  • 31. Scalability Issues • Increasing number of Spark applications • Increasing Spark users #Exp7SAIS 31
  • 32. Severe Garbage Collection (GC) Full GC Full GC Full GCFull GC #Exp7SAIS 32
  • 33. What Caused GC? • Unnecessary events used too much memory while replaying • SHS got completely stalled • SHS needs to ignore those unnecessary events #Exp7SAIS 33 23GB
  • 34. High-Volume Concurrent Requests • When REST call frequency goes beyond certain threshold, SHS is likely to return non- JSON response to users • Home page shows empty list #Exp7SAIS 34
  • 35. Upstream Tickets SPARK-23607: Use HDFS extended attributes to store application summary SPARK-21961: Filter out BlockStatusUpdates in History Server when analyzing logs SPARK-23608: Synchronize when attaching and detaching SparkUI in History Server #Exp7SAIS 35
  • 36. Results • User can always find their applications on SHS home page within 1 minute • For 90% of applications DBs, SHS will start creating them within 5 minutes after they complete • Stable and reliable service • Handle high-volume concurrent requests #Exp7SAIS 36
  • 37. Future Work • More memory metrics: – Netty memory – Total memory • More Tuning: – Skew in assignment of tasks to executors – Size/time skew in tasks for a stage – DAG analysis • Incremental Replay for History Logs • Horizontal Scalable History Server #Exp7SAIS 37