SlideShare une entreprise Scribd logo
1  sur  28
Télécharger pour lire hors ligne
Zhen Fan, JD.com
Yue Li, MemVerge
Optimizing Computing Cluster
Resource Utilization
with
Disaggregated Persistent Memory
#UnifiedAnalytics #SparkAISummit
2#UnifiedAnalytics #SparkAISummit
Agenda
Motivation
• Production Environment — tens of thousands of servers
• Uneven computing resource utilization in production data center
• Independent scaling of compute and storage
Extending Spark with external storage
• Current: JD’s remote shuffle service (RSS)
• The next generation: disaggregated persistent memory
Performance evaluation
Conclusion
3#UnifiedAnalytics #SparkAISummit
Computing Resource Utilization of a Production Cluster
100
20100
40100
60100
80100
100100
120100
140100
160100
8:30
8:53
9:15
9:38
10:00
10:23
10:45
11:08
11:30
11:53
12:15
CPU(Cores)
Time
Allocated VCores Total VCores
100.00
200.00
300.00
400.00
500.00
600.00
700.00
8:30
8:53
9:15
9:38
10:00
10:23
10:45
11:08
11:30
11:53
12:15
Memory(TB)
Time
Allocated Memory Total Memory
• 3,700 servers from 8:30 AM to 12:30 PM
• Memory is at high level all time
4#UnifiedAnalytics #SparkAISummit
Uneven Computing Resource Utilization
-10.00%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
MemoryOverhead(%)
Day
Memory overhead of a production computing cluster with 3,700 servers
- MemoryOverhead = MemoryUtilization(%) – CPUUtilization(%)
• Uneven computing resource utilization between CPU and memory
• Thousands of servers – impossible to change hardware
• Spark applications centric data center – need more memory
5#UnifiedAnalytics #SparkAISummit
High Memory Demand from Spark Tasks
1000
2000
3000
4000
5000
6000
0 200 400 600 800 1000 1200 1400
ExecutionTime(s)
Memory Utilization (GB)
• Spark performance highly depends on the capacity of available memory
• Machine learning and iterative jobs
• Cache and execution memory
6#UnifiedAnalytics #SparkAISummit
Too Much Shuffle Data to Store
• Shuffle-heavy applications are very common in production
• Local HDD
− slow under high pressure
− Intolerable random I/O when shuffle data heavy
− easily broken in production and no replicas
7#UnifiedAnalytics #SparkAISummit
Too Costly to Fail
• Stage-recompute is a disaster to SLA
• We need shuffle storage that is manageable and highly available
Current Solution
• Related discussion
– Spark JIRA-25299
• Separate storage from compute
– Shuffle to external storage
• The JD remote shuffle service (RSS)
− More effective and more stable for shuffle-heavy applications
8#UnifiedAnalytics #SparkAISummit
9#UnifiedAnalytics #SparkAISummit
Workflow of JD Remote Shuffle Service
• RemoteShuffleManager
10#UnifiedAnalytics #SparkAISummit
Writer & RSS Implementation
Shuffle Writer:
• Add metadata header and use protobuf to encode the sending blocks
• Partition group for effective merge in RSS
• Ack to guarantee blocks saved in HDFS
• The data path can introduce dirty data — deduplication at reducer side
RSS:
• Merge blocks of the same partition group in memory
• trade-off between merge buffer size and the lingered time
• Flush the merged buffer in synchronization
• Many other details about controlling the data flow
11#UnifiedAnalytics #SparkAISummit
Reducer Side Implementation
• Reduce task fetches related HDFS file(s)
- Files: if one crashes, other RSS make a new file for the this partition group
• Extract the key info from driver mapStatuses
- For deduplication
• Skip the partitions that is not relevant
• Deduplication
- with metadata in blocks and mapStatus
• Throttle the input streams …
12#UnifiedAnalytics #SparkAISummit
Use case – JD Data Warehouse Workload
13#UnifiedAnalytics #SparkAISummit
Use case – JD Data Warehouse Workload
To Improve Performance …
• Storage
– HDFS might not be performant enough?
• Network
– Netty might introduce bottleneck?
• We need better tools that help us explore
– Different storage backend
– Different network transports
14#UnifiedAnalytics #SparkAISummit
15#UnifiedAnalytics #SparkAISummit
The Splash Shuffle Manager
• A flexible shuffle manager
• Supports user-defined storage backend
and network transport for shuffle
• Works with vanilla Spark
• JD-RSS uses HDFS-based backend
• Open source
• https://github.com/MemVerge/splash
• Sending shuffle states to external storage
• Shuffle index
• Data partition
• Spill
The Next Generation Architecture
• JD remote shuffle service (RSS)
– Only supports shuffle workload
– Uses external storage that is relatively slow (HDFS)
• A more general framework that separates compute and storage
– Shuffle
– RDD caching
– Goal: faster, reliable and elastic
• Our solution
• Memory extension via external memory pool
16#UnifiedAnalytics #SparkAISummit
17#UnifiedAnalytics #SparkAISummit
Extending Spark Memory via Remote Memory
RDD caching
Shuffle
Spark Memory Management Model
Image source: https://0x0fff.com/spark-memory-management/
External Memory Pool
• High capacity
• High performance
• High endurance
• Affordable
18#UnifiedAnalytics #SparkAISummit
Intel Optane DC Persistent Memory: An Excellent Candidate
• Jointly developed by Intel and Micron
• High density (up to 6TB per 2-socket server)
• High endurance
• Low latency: ns level
• Byte-addressable, can be used as main memory
• Non-volatile, can be used as primary storage
• Half the price of DRAM
19#UnifiedAnalytics #SparkAISummit
Spark with Disaggregated Persistent Memory
External Extended
Memory Node
DDR-4
pmem
Node #1
DDR-4
Persistence,
Shuffling & Spill
Data
Persistence,
Shuffling &Spill
Data
RDMA
Spark Executor
Spark Executor
Node #N
DDR-4
Persistence,
Shuffling & Spill
Data
Spark Executor
TOR
RDMA
Spark Compute Node Spark Compute Node
Fast network
20#UnifiedAnalytics #SparkAISummit
Summary
Spark with disaggregated persistent memory
• A dedicated remote PMEM pool
• Easier to manage
• Affordable
• Minimal changes
• No change for existing computing nodes
• No change for user application
• Minimal at Spark level
• Highly elastic
• Computing nodes become stateless
• Highly performant
21#UnifiedAnalytics #SparkAISummit
Workloads
TeraSort
• A synthetic shuffle intensive benchmark well suited to evaluate the I/O performance
Core service of data warehouse application
• Spark-SQL
• A core data warehouse task at JD.com, supporting business decision
• I/O-intensive, based on select, insert, and full outer join operations
Anti-Price-Crawling Service
• A core analytic task of JD.com that defends against coordinated price crawling
• It is both I/O intensive and computing intensive
22#UnifiedAnalytics #SparkAISummit
Workload Characteristics
TeraSort
Data Warehouse Anti-Price-
Crawling
Input Data size 600 GB 200 GB 726.9 GB
Cached RDD size N/A N/A 349 GB
Shuffle size 334.2 GB 234.7 GB 57.6 GB
Experimental Setup
• 10 Spark computing nodes
• 1 external persistent memory node
• Spark 2.3 with standalone mode
• HDFS 2.7
23#UnifiedAnalytics #SparkAISummit
Ethernet Switch
PMEM Pool
24#UnifiedAnalytics #SparkAISummit
Performance – TeraSort
Low executor memory scenario:
• Execution time ⇓300-400s (31%)
Medium executor memory scenario:
• Execution time ⇓300-500s (38%)
Large executor memory scenario:
• Execution time ⇓450s (35%) 700
850
1000
1150
1300
1450
1600
1750
400 600 800 1000 1200 1400 1600
ExecutionTime(s)
Memory Utilization (GB)
Spark
Optimal Spark
Spark w remote pmem
Low Medium Large
25#UnifiedAnalytics #SparkAISummit
Performance – Data Warehouse
Low executor memory scenario:
• Execution time ⇓700-800s (58%)
Medium executor memory scenario:
• Execution time ⇓550-650s (54%)
Large executor memory scenario:
• Execution time ⇓350-400s (42%)
400
600
800
1000
1200
1400
1600
100 300 500 700 900 1100 1300 1500 1700
ExecutionTime(s)
Memory Utilization (GB)
Spark
Optimal Spark
Spark w remote pmem
Low Medium Large
26#UnifiedAnalytics #SparkAISummit
Performance – Anti-Price-Crawling
Low executor memory scenario:
• Execution time ⇓500s (17%)
Medium executor memory scenario:
• Execution time ⇓1200s (40%)
Large executor memory scenario:
• Execution time ⇑0-60s (4%)
• Baseline → process local caching data
• Spark w remote PMEM→ remote PMEM
caching data
1000
2000
3000
4000
5000
6000
0 200 400 600 800 1000 1200 1400 1600 1800
ExecutionTime(s)
Memory Utilization (GB)
Spark
Optimal Spark
Spark w remote pmem
Low Medium Large
Conclusion
• The separation of compute and storage
– JD-RSS
• Shuffle to external storage pool based on HDFS
– Disaggregated persistent memory pool
• Storage memory extension for shuffle and RDD caching
– Unified under MemVerge Splash Shuffle Manager
• Better elasticity and reliability
• High capacity, high performance and affordable
• Persistent memory will bring fundamental changes to Big Data
infrastructure
27#UnifiedAnalytics #SparkAISummit
DON’T FORGET TO RATE
AND REVIEW THE SESSIONS
SEARCH SPARK + AI SUMMIT

Contenu connexe

Tendances

Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDBMongoDB
 
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
 Improving Apache Spark by Taking Advantage of Disaggregated Architecture Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Improving Apache Spark by Taking Advantage of Disaggregated ArchitectureDatabricks
 
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...MongoDB
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDBMongoDB
 
MongoDB WiredTiger Internals
MongoDB WiredTiger InternalsMongoDB WiredTiger Internals
MongoDB WiredTiger InternalsNorberto Leite
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introductioncolorant
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark InternalsPietro Michiardi
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)MongoDB
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaDatabricks
 
MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101MongoDB
 
Apache Sentry for Hadoop security
Apache Sentry for Hadoop securityApache Sentry for Hadoop security
Apache Sentry for Hadoop securitybigdatagurus_meetup
 
MyRocks introduction and production deployment
MyRocks introduction and production deploymentMyRocks introduction and production deployment
MyRocks introduction and production deploymentYoshinori Matsunobu
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About ShardingMongoDB
 
8a. How To Setup HBase with Docker
8a. How To Setup HBase with Docker8a. How To Setup HBase with Docker
8a. How To Setup HBase with DockerFabio Fumarola
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaDatabricks
 

Tendances (20)

Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDB
 
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
 Improving Apache Spark by Taking Advantage of Disaggregated Architecture Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
 
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
 
Sharding Methods for MongoDB
Sharding Methods for MongoDBSharding Methods for MongoDB
Sharding Methods for MongoDB
 
MongoDB WiredTiger Internals
MongoDB WiredTiger InternalsMongoDB WiredTiger Internals
MongoDB WiredTiger Internals
 
Apache spark
Apache sparkApache spark
Apache spark
 
Spark shuffle introduction
Spark shuffle introductionSpark shuffle introduction
Spark shuffle introduction
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
Introduction to DPDK RIB library
Introduction to DPDK RIB libraryIntroduction to DPDK RIB library
Introduction to DPDK RIB library
 
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital KediaTuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
Tuning Apache Spark for Large-Scale Workloads Gaoxiang Liu and Sital Kedia
 
MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101
 
Spark on yarn
Spark on yarnSpark on yarn
Spark on yarn
 
Sqoop
SqoopSqoop
Sqoop
 
Apache Sentry for Hadoop security
Apache Sentry for Hadoop securityApache Sentry for Hadoop security
Apache Sentry for Hadoop security
 
MyRocks introduction and production deployment
MyRocks introduction and production deploymentMyRocks introduction and production deployment
MyRocks introduction and production deployment
 
Everything You Need to Know About Sharding
Everything You Need to Know About ShardingEverything You Need to Know About Sharding
Everything You Need to Know About Sharding
 
8a. How To Setup HBase with Docker
8a. How To Setup HBase with Docker8a. How To Setup HBase with Docker
8a. How To Setup HBase with Docker
 
When Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu MaWhen Apache Spark Meets TiDB with Xiaoyu Ma
When Apache Spark Meets TiDB with Xiaoyu Ma
 

Similaire à Optimizing Performance and Computing Resource Efficiency of In-Memory Big Data Analytics with Disaggregated Persistent Memory

Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseTackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseDatabricks
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudRose Toomey
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudDatabricks
 
Elastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryElastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryDatabricks
 
Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark with Yuval...
Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark with Yuval...Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark with Yuval...
Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark with Yuval...Spark Summit
 
Spectrum Scale Memory Usage
Spectrum Scale Memory UsageSpectrum Scale Memory Usage
Spectrum Scale Memory UsageTomer Perry
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionSplunk
 
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...Databricks
 
In-memory Data Management Trends & Techniques
In-memory Data Management Trends & TechniquesIn-memory Data Management Trends & Techniques
In-memory Data Management Trends & TechniquesHazelcast
 
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...Databricks
 
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Databricks
 
DevOps for ETL processing at scale with MongoDB, Solr, AWS and Chef
DevOps for ETL processing at scale with MongoDB, Solr, AWS and ChefDevOps for ETL processing at scale with MongoDB, Solr, AWS and Chef
DevOps for ETL processing at scale with MongoDB, Solr, AWS and ChefGaurav "GP" Pal
 
stackArmor presentation for DevOpsDC ver 4
stackArmor presentation for DevOpsDC ver 4stackArmor presentation for DevOpsDC ver 4
stackArmor presentation for DevOpsDC ver 4Gaurav "GP" Pal
 
Caching Methodology & Strategies
Caching Methodology & StrategiesCaching Methodology & Strategies
Caching Methodology & StrategiesTiệp Vũ
 
Caching methodology and strategies
Caching methodology and strategiesCaching methodology and strategies
Caching methodology and strategiesTiep Vu
 
Architecture at Scale
Architecture at ScaleArchitecture at Scale
Architecture at ScaleElasticsearch
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisMike Pittaro
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis PyData
 
Deploying ssd in the data center 2014
Deploying ssd in the data center 2014Deploying ssd in the data center 2014
Deploying ssd in the data center 2014Howard Marks
 

Similaire à Optimizing Performance and Computing Resource Efficiency of In-Memory Big Data Analytics with Disaggregated Persistent Memory (20)

Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseTackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
 
Apache Spark At Scale in the Cloud
Apache Spark At Scale in the CloudApache Spark At Scale in the Cloud
Apache Spark At Scale in the Cloud
 
Elastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryElastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent Memory
 
Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark with Yuval...
Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark with Yuval...Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark with Yuval...
Accelerating Shuffle: A Tailor-Made RDMA Solution for Apache Spark with Yuval...
 
Spectrum Scale Memory Usage
Spectrum Scale Memory UsageSpectrum Scale Memory Usage
Spectrum Scale Memory Usage
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction wi...
 
In-memory Data Management Trends & Techniques
In-memory Data Management Trends & TechniquesIn-memory Data Management Trends & Techniques
In-memory Data Management Trends & Techniques
 
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote...
 
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
 
DevOps for ETL processing at scale with MongoDB, Solr, AWS and Chef
DevOps for ETL processing at scale with MongoDB, Solr, AWS and ChefDevOps for ETL processing at scale with MongoDB, Solr, AWS and Chef
DevOps for ETL processing at scale with MongoDB, Solr, AWS and Chef
 
stackArmor presentation for DevOpsDC ver 4
stackArmor presentation for DevOpsDC ver 4stackArmor presentation for DevOpsDC ver 4
stackArmor presentation for DevOpsDC ver 4
 
Coa presentation3
Coa presentation3Coa presentation3
Coa presentation3
 
Caching Methodology & Strategies
Caching Methodology & StrategiesCaching Methodology & Strategies
Caching Methodology & Strategies
 
Caching methodology and strategies
Caching methodology and strategiesCaching methodology and strategies
Caching methodology and strategies
 
Architecture at Scale
Architecture at ScaleArchitecture at Scale
Architecture at Scale
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis
 
Deploying ssd in the data center 2014
Deploying ssd in the data center 2014Deploying ssd in the data center 2014
Deploying ssd in the data center 2014
 

Plus de Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

Plus de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Dernier

CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlkumarajju5765
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 

Dernier (20)

CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 

Optimizing Performance and Computing Resource Efficiency of In-Memory Big Data Analytics with Disaggregated Persistent Memory

  • 1. Zhen Fan, JD.com Yue Li, MemVerge Optimizing Computing Cluster Resource Utilization with Disaggregated Persistent Memory #UnifiedAnalytics #SparkAISummit
  • 2. 2#UnifiedAnalytics #SparkAISummit Agenda Motivation • Production Environment — tens of thousands of servers • Uneven computing resource utilization in production data center • Independent scaling of compute and storage Extending Spark with external storage • Current: JD’s remote shuffle service (RSS) • The next generation: disaggregated persistent memory Performance evaluation Conclusion
  • 3. 3#UnifiedAnalytics #SparkAISummit Computing Resource Utilization of a Production Cluster 100 20100 40100 60100 80100 100100 120100 140100 160100 8:30 8:53 9:15 9:38 10:00 10:23 10:45 11:08 11:30 11:53 12:15 CPU(Cores) Time Allocated VCores Total VCores 100.00 200.00 300.00 400.00 500.00 600.00 700.00 8:30 8:53 9:15 9:38 10:00 10:23 10:45 11:08 11:30 11:53 12:15 Memory(TB) Time Allocated Memory Total Memory • 3,700 servers from 8:30 AM to 12:30 PM • Memory is at high level all time
  • 4. 4#UnifiedAnalytics #SparkAISummit Uneven Computing Resource Utilization -10.00% 0.00% 10.00% 20.00% 30.00% 40.00% 50.00% 60.00% 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 MemoryOverhead(%) Day Memory overhead of a production computing cluster with 3,700 servers - MemoryOverhead = MemoryUtilization(%) – CPUUtilization(%) • Uneven computing resource utilization between CPU and memory • Thousands of servers – impossible to change hardware • Spark applications centric data center – need more memory
  • 5. 5#UnifiedAnalytics #SparkAISummit High Memory Demand from Spark Tasks 1000 2000 3000 4000 5000 6000 0 200 400 600 800 1000 1200 1400 ExecutionTime(s) Memory Utilization (GB) • Spark performance highly depends on the capacity of available memory • Machine learning and iterative jobs • Cache and execution memory
  • 6. 6#UnifiedAnalytics #SparkAISummit Too Much Shuffle Data to Store • Shuffle-heavy applications are very common in production • Local HDD − slow under high pressure − Intolerable random I/O when shuffle data heavy − easily broken in production and no replicas
  • 7. 7#UnifiedAnalytics #SparkAISummit Too Costly to Fail • Stage-recompute is a disaster to SLA • We need shuffle storage that is manageable and highly available
  • 8. Current Solution • Related discussion – Spark JIRA-25299 • Separate storage from compute – Shuffle to external storage • The JD remote shuffle service (RSS) − More effective and more stable for shuffle-heavy applications 8#UnifiedAnalytics #SparkAISummit
  • 9. 9#UnifiedAnalytics #SparkAISummit Workflow of JD Remote Shuffle Service • RemoteShuffleManager
  • 10. 10#UnifiedAnalytics #SparkAISummit Writer & RSS Implementation Shuffle Writer: • Add metadata header and use protobuf to encode the sending blocks • Partition group for effective merge in RSS • Ack to guarantee blocks saved in HDFS • The data path can introduce dirty data — deduplication at reducer side RSS: • Merge blocks of the same partition group in memory • trade-off between merge buffer size and the lingered time • Flush the merged buffer in synchronization • Many other details about controlling the data flow
  • 11. 11#UnifiedAnalytics #SparkAISummit Reducer Side Implementation • Reduce task fetches related HDFS file(s) - Files: if one crashes, other RSS make a new file for the this partition group • Extract the key info from driver mapStatuses - For deduplication • Skip the partitions that is not relevant • Deduplication - with metadata in blocks and mapStatus • Throttle the input streams …
  • 12. 12#UnifiedAnalytics #SparkAISummit Use case – JD Data Warehouse Workload
  • 13. 13#UnifiedAnalytics #SparkAISummit Use case – JD Data Warehouse Workload
  • 14. To Improve Performance … • Storage – HDFS might not be performant enough? • Network – Netty might introduce bottleneck? • We need better tools that help us explore – Different storage backend – Different network transports 14#UnifiedAnalytics #SparkAISummit
  • 15. 15#UnifiedAnalytics #SparkAISummit The Splash Shuffle Manager • A flexible shuffle manager • Supports user-defined storage backend and network transport for shuffle • Works with vanilla Spark • JD-RSS uses HDFS-based backend • Open source • https://github.com/MemVerge/splash • Sending shuffle states to external storage • Shuffle index • Data partition • Spill
  • 16. The Next Generation Architecture • JD remote shuffle service (RSS) – Only supports shuffle workload – Uses external storage that is relatively slow (HDFS) • A more general framework that separates compute and storage – Shuffle – RDD caching – Goal: faster, reliable and elastic • Our solution • Memory extension via external memory pool 16#UnifiedAnalytics #SparkAISummit
  • 17. 17#UnifiedAnalytics #SparkAISummit Extending Spark Memory via Remote Memory RDD caching Shuffle Spark Memory Management Model Image source: https://0x0fff.com/spark-memory-management/ External Memory Pool • High capacity • High performance • High endurance • Affordable
  • 18. 18#UnifiedAnalytics #SparkAISummit Intel Optane DC Persistent Memory: An Excellent Candidate • Jointly developed by Intel and Micron • High density (up to 6TB per 2-socket server) • High endurance • Low latency: ns level • Byte-addressable, can be used as main memory • Non-volatile, can be used as primary storage • Half the price of DRAM
  • 19. 19#UnifiedAnalytics #SparkAISummit Spark with Disaggregated Persistent Memory External Extended Memory Node DDR-4 pmem Node #1 DDR-4 Persistence, Shuffling & Spill Data Persistence, Shuffling &Spill Data RDMA Spark Executor Spark Executor Node #N DDR-4 Persistence, Shuffling & Spill Data Spark Executor TOR RDMA Spark Compute Node Spark Compute Node Fast network
  • 20. 20#UnifiedAnalytics #SparkAISummit Summary Spark with disaggregated persistent memory • A dedicated remote PMEM pool • Easier to manage • Affordable • Minimal changes • No change for existing computing nodes • No change for user application • Minimal at Spark level • Highly elastic • Computing nodes become stateless • Highly performant
  • 21. 21#UnifiedAnalytics #SparkAISummit Workloads TeraSort • A synthetic shuffle intensive benchmark well suited to evaluate the I/O performance Core service of data warehouse application • Spark-SQL • A core data warehouse task at JD.com, supporting business decision • I/O-intensive, based on select, insert, and full outer join operations Anti-Price-Crawling Service • A core analytic task of JD.com that defends against coordinated price crawling • It is both I/O intensive and computing intensive
  • 22. 22#UnifiedAnalytics #SparkAISummit Workload Characteristics TeraSort Data Warehouse Anti-Price- Crawling Input Data size 600 GB 200 GB 726.9 GB Cached RDD size N/A N/A 349 GB Shuffle size 334.2 GB 234.7 GB 57.6 GB
  • 23. Experimental Setup • 10 Spark computing nodes • 1 external persistent memory node • Spark 2.3 with standalone mode • HDFS 2.7 23#UnifiedAnalytics #SparkAISummit Ethernet Switch PMEM Pool
  • 24. 24#UnifiedAnalytics #SparkAISummit Performance – TeraSort Low executor memory scenario: • Execution time ⇓300-400s (31%) Medium executor memory scenario: • Execution time ⇓300-500s (38%) Large executor memory scenario: • Execution time ⇓450s (35%) 700 850 1000 1150 1300 1450 1600 1750 400 600 800 1000 1200 1400 1600 ExecutionTime(s) Memory Utilization (GB) Spark Optimal Spark Spark w remote pmem Low Medium Large
  • 25. 25#UnifiedAnalytics #SparkAISummit Performance – Data Warehouse Low executor memory scenario: • Execution time ⇓700-800s (58%) Medium executor memory scenario: • Execution time ⇓550-650s (54%) Large executor memory scenario: • Execution time ⇓350-400s (42%) 400 600 800 1000 1200 1400 1600 100 300 500 700 900 1100 1300 1500 1700 ExecutionTime(s) Memory Utilization (GB) Spark Optimal Spark Spark w remote pmem Low Medium Large
  • 26. 26#UnifiedAnalytics #SparkAISummit Performance – Anti-Price-Crawling Low executor memory scenario: • Execution time ⇓500s (17%) Medium executor memory scenario: • Execution time ⇓1200s (40%) Large executor memory scenario: • Execution time ⇑0-60s (4%) • Baseline → process local caching data • Spark w remote PMEM→ remote PMEM caching data 1000 2000 3000 4000 5000 6000 0 200 400 600 800 1000 1200 1400 1600 1800 ExecutionTime(s) Memory Utilization (GB) Spark Optimal Spark Spark w remote pmem Low Medium Large
  • 27. Conclusion • The separation of compute and storage – JD-RSS • Shuffle to external storage pool based on HDFS – Disaggregated persistent memory pool • Storage memory extension for shuffle and RDD caching – Unified under MemVerge Splash Shuffle Manager • Better elasticity and reliability • High capacity, high performance and affordable • Persistent memory will bring fundamental changes to Big Data infrastructure 27#UnifiedAnalytics #SparkAISummit
  • 28. DON’T FORGET TO RATE AND REVIEW THE SESSIONS SEARCH SPARK + AI SUMMIT