Tracing the breadcrumbs:
Spark workload diagnostics
Kris Mok @rednaxelafx
Cheng Lian @liancheng
About us
Kris Mok
▪ Sr. software engineer at Databricks
▪ Worked on OpenJDK HotSpot VM &
Zing VM implementations
Cheng Lian
▪ Sr. software engineer at Databricks
▪ Apache Spark PMC member
▪ Apache Parquet committer
Unified data analytics platform for accelerating innovation across
data science, data engineering, and business analytics
Original creators of popular data and machine learning open source projects
Global company with 5,000 customers and 450+ partners
Distributed applications are hard
▪ Have you ever hit the following categories of issues in your distributed
Spark applications?
▪ Mysterious performance regression
▪ Mysterious job hang
Distributed applications are hard
▪ Distributed applications are inherently hard to
▪ Develop
▪ Tune
▪ Diagnose
▪ Due to
▪ Longer iteration cycle
▪ Variety of input data (volume, quality, distribution, etc.)
▪ Broader range of infra/external dependencies (network, cloud services, etc.)
▪ Fragmented crime scenes (incomplete and scattered logs)
What this talk is about
▪ Demonstrate tools and methodologies for diagnosing distributed Spark
applications with real world use cases.
Performance Regression
Symptom
A query runs slower in a newer version of Spark than in an older version.
Check the Spark SQL query plan (see the EXPLAIN sketch after this list):
▪ If the query plan has changed
▪ the regression likely comes from a Catalyst optimizer change
▪ If the query plan is the same
▪ the regression could be coming from various sources
▪ More time spent in query optimization / compilation?
▪ More time spent in scheduling?
▪ More time spent in network operations?
▪ More time spent on task execution?
▪ More time spent on GC?
▪ ...
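As a minimal sketch of this first step (assuming a SparkSession named spark and the query text in a string q; both names are placeholders):

// Prints the parsed, analyzed, optimized, and physical plans;
// diff this output between the old and the new Spark version.
val df = spark.sql(q)
df.explain(true)

// Or capture the physical plan as a string for programmatic diffing.
val planText = df.queryExecution.executedPlan.toString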
Tools and methodologies demonstrated
▪ Build a benchmark to reproduce the performance regression
▪ Gather performance data using a profiler
▪ Conventional JVM profilers (JProfiler, YourKit, jvisualvm, etc.)
▪ Java Flight Recorder
▪ async-profiler and Flame Graph
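For example, async-profiler can attach to a running executor JVM and emit a flame graph directly. A sketch following its README (the duration, output path, and pid are placeholders):

Run command: ./profiler.sh -d 30 -f /tmp/executor-flame.svg <executor-pid>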
Case study
▪ A performance regression in Spark 2.4 development
Symptom
Databricks’ Spark Benchmarking team’s performance sign-off for DBR
5.0-beta found a significant performance regression vs. DBR 4.3:
multiple TPC-DS queries were slower, e.g. q67.
FlameGraph: DBR 4.3 (Spark 2.3-based)
FlameGraph: DBR 5.0-beta (Spark 2.4-SNAPSHOT-based)
Zoom in on the difference in hot spot
DBR 4.3
DBR 5.0-beta
Zoom in on the difference in hot spot
DBR 4.3
hot loop calling monomorphic function;
also avoided extra buffer copy
DBR 5.0-beta
hot loop calling polymorphic function;
extra buffer copy
Job hangs
Symptoms
▪ Spark job hangs
Tools and methodologies demonstrated
▪ Thread dump from Spark UI
▪ Network debugging
▪ JVM debugging
▪ Log exploration and visualization
Case study 1
▪ Shuffle fetch from a dead executor caused the entire cluster to hang
The customer found a large number of exceptions in the Spark logs:
org.apache.spark.rpc.RpcTimeoutException:
Cannot receive any reply from null in 120 seconds.
“During about the same time (16:35 - 16:45) when the exceptions happened, the entire
cluster hung (extremely low cluster usage).”
Early triaging questions
▪ Did anything special happen before the cluster hang?
▪ That might be the cause of the hang.
▪ Was anything happening while the cluster hung?
▪ Was it completely silent, or
▪ was it busy doing something?
Tools
▪ Spark History Server
▪ Executor, job, stage, and task events visualization
▪ Spark logs in Delta Lake
▪ Within Databricks, with the consent of our customers, we ETL Spark logs into
Delta tables for fast exploration
▪ Interactive notebook environment for log exploration and visualization
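A hypothetical sketch of such a notebook query, as used on the following slides (the table name spark_logs and the columns timestamp, source, and message are illustrative, not an actual Databricks schema):

import org.apache.spark.sql.functions._
import spark.implicits._

// Hypothetical Delta table of ETL'd Spark logs.
val logs = spark.table("spark_logs")

// Per-minute driver-side log message counts.
logs.filter($"source" === "driver")
  .groupBy(window($"timestamp", "1 minute"))
  .count()
  .orderBy("window")
  .show(false)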
Spark History Server - Historical Spark UI
▪ Executor 29 was removed around
16:36, right before the cluster
hung
▪ Executor 103 was added around
16:43, right before the cluster
went back active
Executor and job events
Spark logs exploration and visualization
▪ Checking per minute driver side
log message counts
▪ The driver turned out to be quiet
during the cluster hang
Checking driver activities
Spark logs exploration and visualization
▪ Checking per minute executor
side log message counts
▪ Mild executor side activities
during the cluster hang
Checking executor activities
Spark logs exploration and visualization
▪ Incrementally filter out logs of various retry and timeout events (see the sketch below)
▪ Empty result set, so no new tasks scheduled during this time
▪ But already scheduled tasks could be running quietly
Zoom into the cluster hang period
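A hypothetical continuation of the earlier log-exploration sketch (the timestamps and message patterns here are illustrative):

// Restrict to the hang window, then peel away known retry/timeout noise.
val hangWindow = logs.filter($"timestamp".between("2019-08-01 16:35:00", "2019-08-01 16:45:00"))
val noise = Seq("Retrying connect", "timed out", "RpcTimeoutException")
val remaining = noise.foldLeft(hangWindow)((df, p) => df.filter(!$"message".contains(p)))
remaining.count()   // an empty result here meant no new tasks were scheduled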
Spark logs exploration and visualization
▪ Checking per minute executor
side log message counts
▪ A clear pattern repeated 3 times
every 120 seconds
▪ Turned out to be executor side
shuffle connection timeout events
Zoom into the cluster hang period
Conclusion
▪ The behavior is expected
▪ Later investigation revealed that
▪ A large stage consisting of 2,000 tasks happened to occupy all CPU cores
▪ These tasks were waiting for shuffle map output from the failed executor until timeout
▪ So the cluster appeared to be “hanging”
▪ After executor 29 got lost, other active executors retried connecting to it 3
times, once every 120 seconds, which conforms to the default values of the following
two Spark configurations:
▪ spark.shuffle.io.connectionTimeout
▪ spark.shuffle.io.maxRetries
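For reference, the relevant defaults (per the Spark configuration docs) line up with the observed pattern:

spark.shuffle.io.maxRetries = 3 (default)
spark.shuffle.io.connectionTimeout = 120s (default; falls back to spark.network.timeout)

3 retries at a 120-second connection timeout each matches the pattern repeated 3 times every 120 seconds.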
Case study 2
A customer reported that a Spark SQL query had been stuck for multiple
hours, and gave us permission to do live debugging.
Through the Spark UI, we determined that the query was almost done, and
the only tasks still running were all in the final stage, writing out results.
Get thread dump from executor via Spark UI
...
Relevant threads’ stack traces
▪ Find the executor thread via its thread name
▪ Obviously stuck on a socket read
Are there any zombie connections?
Run command: netstat -o
tcp6 0 0 <src_ip>:36332 <dest_ip>:https ESTABLISHED off (0.00/0/0)
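The -o flag shows TCP timers: “off (0.00/0/0)” means no retransmission or keepalive timer is running on this established connection, which is consistent with a zombie connection whose peer is already gone.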
Is the thread related to the zombie connection?
Introspect Java stack and objects via CLHSDB
(Command-Line HotSpot Debugger)
java -cp .:$JAVA_HOME/lib/sa-jdi.jar sun.jvm.hotspot.CLHSDB
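(On JDK 9 and later, the Serviceability Agent ships as the officially supported jhsdb tool; the equivalent launch is: jhsdb clhsdb --pid <pid>)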
CLHSDB session example
Inspect threads list (can also use jstack or pstack in CLHSDB)
hsdb> jseval "jvm.threads"
{Thread (address=0x00007f06648fae08, name=Attach Listener),
Thread (address=0x00007f065fd538b8, name=pool-339-thread-7),
Thread (address=0x00007f065fcb9cc8, name=pool-339-thread-6),
Thread (address=0x00007efc9249fef0, name=pool-339-thread-5),
Thread (address=0x00007efc9249e8b8, name=pool-339-thread-4),
...}
CLHSDB session example
Inspect stack frames of a given thread
hsdb> jseval "jvm.threads[4].frames"
{Frame
(method=java.net.SocketInputStream.socketRead0(java.io.FileDescriptor,
byte[], int, int, int), bci=0, line=0),
Frame
(method=java.net.SocketInputStream.socketRead(java.io.FileDescriptor,
byte[], int, int, int), bci=8, line=116),
...}
CLHSDB session example
Inspect the receiver object of a specific frame
hsdb> jseval "sa.threads.first().next().next().next().next().getLastJavaVFrameDbg().javaSender().javaSender().locals.get(0).print()"
<0x00007f06648fb090>
hsdb> inspect 0x00007f06648fb090
instance of Oop for java/net/SocketInputStream @ 0x00007f06648fb090 (size = 88)
_mark: 29
_metadata._klass: InstanceKlass for java/net/SocketInputStream
fd: Oop for java/io/FileDescriptor @ 0x00007f06648fb068 Oop for java/io/FileDescriptor
path: null
channel: null
closeLock: Oop for java/lang/Object @ 0x00007f0668176858
closed: false
eof: false
impl: Oop for java/net/SocksSocketImpl @ 0x00007f0666569b08
...
CLHSDB session example
Inspect the SocksSocketImpl object that we care about
hsdb> inspect 0x00007f0666569b08
instance of Oop for java/net/SocksSocketImpl @ 0x00007f0666569b08 (size = 168)
_mark: 139618986514461
_metadata._klass: InstanceKlass for java/net/SocksSocketImpl
socket: Oop for sun/security/ssl/SSLSocketImpl @ 0x00007f06648fb0e8
serverSocket: null
fd: Oop for java/io/FileDescriptor @ 0x00007f06648fb068
address: Oop for java/net/Inet4Address @ 0x00007efc928a4e00
port: 443
localport: 36332
timeout: 0
...
Found the same local port as the zombie connection,
and found “timeout = 0”, i.e., no read timeout (SO_TIMEOUT) was set, so the
blocking socket read could wait forever.
GDB example
Run GDB and attach to the Java process.
Run t a a bt (short for thread apply all backtrace)
...
Thread 100 (Thread 0x7efb1926e700 (LWP 3166)):
#0 0x00007f07417252bf in __libc_recv (fd=fd@entry=636, buf=buf@entry=0x7efb1925ca70, n=n@entry=5,
flags=flags@entry=0)
at ../sysdeps/unix/sysv/linux/x86_64/recv.c:28
#1 0x00007efb9944b25d in NET_Read (__flags=0, __n=5, __buf=0x7efb1925ca70, __fd=636) at
/usr/include/x86_64-linux-gnu/bits/socket2.h:44
#2 0x00007efb9944b25d in NET_Read (s=s@entry=636, buf=buf@entry=0x7efb1925ca70, len=len@entry=5)
at
/build/openjdk-8-lTwZJE/openjdk-8-8u181-b13/src/jdk/src/solaris/native/java/net/linux_close.c:273
#3 0x00007efb9944ab8e in Java_java_net_SocketInputStream_socketRead0 (env=0x7efb941859e0,
this=<optimized out>, fdObj=<optimized out>, data=0x7efb1926cad8, off=0, len=5, timeout=0)
...
Conclusion
▪ Use the Spark UI to identify which stage a query is in and which tasks
are still running (or have gotten stuck)
▪ Use the Spark UI to get a thread dump from the executor running the stuck
task to get an idea of what it’s doing
▪ When a task seems to be stuck on network I/O, use netstat to check if
there are any connections in a bad state
▪ It’s possible to introspect JVM state (thread stacks and heap) via
CLHSDB (or jhsdb, officially supported since JDK 9)
▪ Native stack frame state can be introspected via GDB
P.S. This particular bug was caused by JDK-8238579
Q&A