SlideShare une entreprise Scribd logo
1  sur  25
Télécharger pour lire hors ligne
Kai Rothauge, Alex Gittens, Michael W. Mahoney
An Apache Spark ⇔ MPI Interface
#Res1SAIS
Thanks to NERSC and Cray Inc. for help and support!
2#Res1SAIS
• MPI = Message Passing Interface
• A specification for the developers and users of message passing libraries
• Message-Passing Parallel Programming Model:
• cooperative operations between processes
• data moved from address space of one process to that of another
• Dominant model in high-performance computing
• Requires installation of MPI implementation on system
• Popular implementations: MPICH, Open MPI
What is MPI?
More on MPI
3#Res1SAIS
• Generally regarded as “low-level” for purposes of distributed computing
• Efficient implementations of collective operations
• Works with distributed memory, shared memory, GPUs
• Communication between MPI processes:
• via TCP/IP sockets, or
• optimized for underlying interconnects (InfiniBand, Cray Aries, Intel Omni-Path,
etc.)
• Communicator objects connect groups of MPI processors
• Con: No fault tolerance or elasticity
• Numerical linear algebra (NLA) using Spark vs. MPI
• Why do linear algebra in Spark?
Spark for data-centric workloads and scientific
analysis
Characterization of linear algebra in Spark
Customers demand Spark; want to understand
performance concerns
4#Res1SAIS
Case Study: Spark vs. MPI
5#Res1SAIS
Case Study: Spark vs. MPI
• Numerical linear algebra (NLA) using Spark vs. MPI
• Why do linear algebra in Spark?
Pros:
Faster development, easier reuse
Simple dataset abstractions (RDDs, DataFrames, DataSets)
An entire ecosystem that can be used before and after NLA computations
Spark can take advantage of available local linear algebra codes
Automatic fault-tolerance, elasticity, out-of-core support
Con:
Classical MPI-based linear algebra implementations will be faster and
more efficient
6#Res1SAIS
Case Study: Spark vs. MPI
• Numerical linear algebra (NLA) using Spark vs. MPI
• Computations performed on NERSC supercomputer Cori Phase 1, a
Cray XC40
• 2,388 compute nodes
• 128 GB RAM/node, 32 2.3GHz Haswell cores/node
• Lustre storage system, Cray Aries interconnect
A. Gittens et al. “Matrix factorizations at scale: A comparison of scientific data analytics in
Spark and C+MPI using three case studies”, 2016 IEEE International Conference on Big Data
(Big Data), pages 204–213, Dec 2016.
7#Res1SAIS
A. Gittens et al. “Matrix factorizations at scale: A comparison of scientific data analytics in
Spark and C+MPI using three case studies”, 2016 IEEE International Conference on Big Data
(Big Data), pages 204–213, Dec 2016.
Case Study: Spark vs. MPI
• Numerical linear algebra (NLA) using Spark vs. MPI
• Matrix factorizations considered include truncated Singular Value
Decomposition (SVD)
• Data sets include
• Ocean temperature data - 2.2 TB
• Atmospheric data - 16 TB
8#Res1SAIS
Case Study: Spark vs. MPI
Rank 20 SVD of
2.2TB ocean
temperature data
9#Res1SAIS
Rank 20 SVD of
16TB atmospheric
data using 48K+
cores
Case Study: Spark vs. MPI
10#Res1SAIS
Case Study: Spark vs. MPI
Lessons learned:
• With favorable data (tall and skinny) and well-adapted algorithms, linear
algebra in Spark is 2x-26x slower than MPI when I/O is included
• Spark’s overheads:
• Can be order of magnitude higher than the actual computation times
• Anti-scale
• The gaps in performance suggest it may be better to interface with
MPI-based codes from Spark
11#Res1SAIS
• Alchemist interfaces between Apache Spark and existing or custom
MPI-based libraries for large-scale linear algebra, machine learning, etc.
• Idea:
• Use Spark for regular data analysis workflow
• When computationally intensive calculations are required, call relevant
MPI-based codes from Spark using Alchemist, send results to Spark
• Combine high productivity of Spark with high performance of MPI
12#Res1SAIS
Target users:
• Scientific community: Use Spark for analysis of large scientific datasets
by calling existing MPI-based libraries where appropriate
• Machine learning practitioners and data analysts:
• Better performance of a wide range of large-scale,
computationally intensive ML and data analysis algorithms
• For instance, SVD for principal component analysis,
recommender systems, leverage scores, etc.
13#Res1SAIS
• Alchemist: Acts as bridge between Spark and MPI-based libraries
• Alchemist-Client Interface: API for user, communicates with Alchemist
via TCP/IP sockets
• Alchemist-Library Interface: Shared object, imports MPI library,
provides generic interface for Alchemist to communicate with library
Basic Framework
14#Res1SAIS
Basic workflow:
• Spark application:
• Sends distributed dataset from RDD (IndexedRowMatrix) to Alchemist
• Tells Alchemist what MPI-based code should be called
• Alchemist loads relevant MPI-based library, calls function, sends results
to Spark
Basic Framework
15#Res1SAIS
• Alchemist can also load data from file
• Alchemist needs to store distributed data in appropriate format that can
be used by MPI-based libraries:
• Candidates: ScaLAPACK, Elemental, PLAPACK
• Alchemist currently uses Elemental, support for ScaLAPACK under
development
•
Basic Framework
16#Res1SAIS
Alchemist Architecture
17#Res1SAIS
Sample API
import alchemist.{Alchemist, AlMatrix}
import alchemist.libA.QRDecomposition // libA is sample MPI lib
// other code here ...
// sc is instance of SparkContext
val ac = new Alchemist.AlchemistContext(sc, numWorkers)
ac.registerLibrary("libA", ALIlibALocation)
// maybe other code here ...
val alA = AlMatrix(A) // A is IndexedRowMatrix
// routine returns QR factors of A as AlMatrix objects
val (alQ, alR) = QRDecomposition(alA)
// send data from Alchemist to Spark once ready
val Q = alQ.toIndexedRowMatrix() // convert AlMatrix alQ to RDD
val R = alR.toIndexedRowMatrix() // convert AlMatrix alR to RDD
// maybe other code here ...
ac.stop() // release resources once no longer required
18#Res1SAIS
Example: Matrix Multiplication
• Requires expensive shuffles in Spark, which is impractical:
• Matrices/RDDs are row-partitioned
• One matrix must be converted to be column-partitioned
• Requires an all-to-all shuffle that often fails once the matrix is distributed
19#Res1SAIS
Example: Matrix Multiplication
GB/nodes Spark+Alchemist Spark
Send (s) Multiplication (s) Receive (s) Total (s) Total (s)
0.8/1 5.90±2.17 6.60±0.07 2.19±0.58 14.68±2.69 160.31±8.89
12/1 16.66±0.88 75.69±0.42 19.43±0.45 111.78±1.26 809.31±51.90
56/2 32.50±2.88 178.68±24.8 55.83±0.37 267.02±27.38 -Failed-
144/4 69.40±1.85 171.73±0.81 66.80±3.46 307.94±4.57 -Failed-
• Generated random matrices and used same number of Spark and
Alchemist nodes
• Take-away: Spark’s matrix multiply is slow even on one executor, and
unreliable once there are more
20#Res1SAIS
Example: Truncated SVD
Use Alchemist and MLlib to get rank 20
truncated SVD
• Spark: 22 nodes; Alchemist: 8 nodes
• A: m-by-10K, where m = 5M, 2.5M,
1.25M, 625K, 312.5K
• Ran jobs for at most 30 minutes
(1800 s)
Experiment Setup
21#Res1SAIS
Example: Truncated SVD
• 2.2TB (6,177,583-by-46,752) ocean
temperature data read in from HDF5 file
Experiment Setup
• Data replicated column-wise
22#Res1SAIS
Upcoming Features
• PySpark, SparkR ⇔ MPI Interface
• Interface for Python => PySpark support
• Future work: Interface for R
• More Functionality
• Support for sparse matrices
• Support for MPI-based libraries built on ScaLAPACK
• Alchemist and Containers
• Alchemist running in Docker and Kubernetes
• Will enable Alchemist on clusters and the cloud
23#Res1SAIS
Limitations and Constraints
• Two copies of data in memory, more during a relayout during
computation
• Data transfer overhead between Spark and Alchemist when data on
different nodes
• Subject to network disruptions and overload
• MPI is not fault tolerant or elastic
• Lack of MPI-based libraries for machine learning
• No equivalent to MLlib currently available, closest is MaTEx
• On Cori, need to run Alchemist and Spark on separate nodes -> more
data transfer over interconnects -> larger overheads
24#Res1SAIS
Future Work
• Apache Spark ⇔ X Interface
• Interest in connecting Spark with other libraries for distributed computing
(e.g. Cray Chapel, Apache REEF)
• Reduce communication costs
• Exploit locality
• Reduce number of messages
• Use communication protocols designed for underlying network
infrastructure
• Run as network service
• MPI computations with (basic) fault tolerance and elasticity
25#Res1SAIS
github.com/project-alchemist/
Thanks to Cray Inc., DARPA and NSF for financial support
References
● A. Gittens, K. Rothauge, M. W. Mahoney, et al., “Alchemist: Accelerating Large-Scale Data
Analysis by offloading to High-Performance Computing Libraries”, 2018, Proceedings of the
24th ACM SIGKDD International Conference, Aug 2018, to appear
● A. Gittens, K. Rothauge, M. W. Mahoney, et al., “Alchemist: An Apache Spark ⇔ MPI
Interface”, 2018, to appear in CCPE Special Issue on Cray User Group Conference 2018

Contenu connexe

Tendances

Building a Business Logic Translation Engine with Spark Streaming for Communi...
Building a Business Logic Translation Engine with Spark Streaming for Communi...Building a Business Logic Translation Engine with Spark Streaming for Communi...
Building a Business Logic Translation Engine with Spark Streaming for Communi...
Spark Summit
 
MLconf 2022 NYC Event-Driven Machine Learning at Scale.pdf
MLconf 2022 NYC Event-Driven Machine Learning at Scale.pdfMLconf 2022 NYC Event-Driven Machine Learning at Scale.pdf
MLconf 2022 NYC Event-Driven Machine Learning at Scale.pdf
Timothy Spann
 

Tendances (20)

Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache spark
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
 
Building a Business Logic Translation Engine with Spark Streaming for Communi...
Building a Business Logic Translation Engine with Spark Streaming for Communi...Building a Business Logic Translation Engine with Spark Streaming for Communi...
Building a Business Logic Translation Engine with Spark Streaming for Communi...
 
Rethinking Streaming Analytics For Scale
Rethinking Streaming Analytics For ScaleRethinking Streaming Analytics For Scale
Rethinking Streaming Analytics For Scale
 
Kafka: Journey from Just Another Software to Being a Critical Part of PayPal ...
Kafka: Journey from Just Another Software to Being a Critical Part of PayPal ...Kafka: Journey from Just Another Software to Being a Critical Part of PayPal ...
Kafka: Journey from Just Another Software to Being a Critical Part of PayPal ...
 
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, VectorizedData Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized
Data Policies for the Kafka-API with WebAssembly | Alexander Gallego, Vectorized
 
Data Integration
Data IntegrationData Integration
Data Integration
 
Pulsar in the Lakehouse: Apache Pulsar™ with Apache Spark™ and Delta Lake - P...
Pulsar in the Lakehouse: Apache Pulsar™ with Apache Spark™ and Delta Lake - P...Pulsar in the Lakehouse: Apache Pulsar™ with Apache Spark™ and Delta Lake - P...
Pulsar in the Lakehouse: Apache Pulsar™ with Apache Spark™ and Delta Lake - P...
 
Streaming Big Data & Analytics For Scale
Streaming Big Data & Analytics For ScaleStreaming Big Data & Analytics For Scale
Streaming Big Data & Analytics For Scale
 
Modern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data CaptureModern ETL Pipelines with Change Data Capture
Modern ETL Pipelines with Change Data Capture
 
Building end to end streaming application on Spark
Building end to end streaming application on SparkBuilding end to end streaming application on Spark
Building end to end streaming application on Spark
 
Building real time Data Pipeline using Spark Streaming
Building real time Data Pipeline using Spark StreamingBuilding real time Data Pipeline using Spark Streaming
Building real time Data Pipeline using Spark Streaming
 
Cooperative Data Exploration with iPython Notebook
Cooperative Data Exploration with iPython NotebookCooperative Data Exploration with iPython Notebook
Cooperative Data Exploration with iPython Notebook
 
Real time ETL processing using Spark streaming
Real time ETL processing using Spark streamingReal time ETL processing using Spark streaming
Real time ETL processing using Spark streaming
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaStreaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and Akka
 
Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
 
Reactive app using actor model & apache spark
Reactive app using actor model & apache sparkReactive app using actor model & apache spark
Reactive app using actor model & apache spark
 
Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016
 
MLconf 2022 NYC Event-Driven Machine Learning at Scale.pdf
MLconf 2022 NYC Event-Driven Machine Learning at Scale.pdfMLconf 2022 NYC Event-Driven Machine Learning at Scale.pdf
MLconf 2022 NYC Event-Driven Machine Learning at Scale.pdf
 

Similaire à Alchemist: An Apache Spark <=> MPI Interface with Michael Mahoney and Kai Rothauge

Similaire à Alchemist: An Apache Spark <=> MPI Interface with Michael Mahoney and Kai Rothauge (20)

Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
 
Fighting Fraud with Apache Spark
Fighting Fraud with Apache SparkFighting Fraud with Apache Spark
Fighting Fraud with Apache Spark
 
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...
 
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life ExampleKafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
 
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and ...
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeek
 
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...
Extending the R API for Spark with sparklyr and Microsoft R Server with Ali Z...
 
Apache Spark Core
Apache Spark CoreApache Spark Core
Apache Spark Core
 
Lessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesLessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and Microservices
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Big Data_Architecture.pptx
Big Data_Architecture.pptxBig Data_Architecture.pptx
Big Data_Architecture.pptx
 
2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.com
 
Hadoop and Spark
Hadoop and SparkHadoop and Spark
Hadoop and Spark
 
Apache Spark - A High Level overview
Apache Spark - A High Level overviewApache Spark - A High Level overview
Apache Spark - A High Level overview
 
A Collaborative Data Science Development Workflow
A Collaborative Data Science Development WorkflowA Collaborative Data Science Development Workflow
A Collaborative Data Science Development Workflow
 
04 open source_tools
04 open source_tools04 open source_tools
04 open source_tools
 
Spark streaming high level overview
Spark streaming high level overviewSpark streaming high level overview
Spark streaming high level overview
 
Apache Spark
Apache SparkApache Spark
Apache Spark
 
Apache Spark Tutorial
Apache Spark TutorialApache Spark Tutorial
Apache Spark Tutorial
 

Plus de Databricks

Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 

Plus de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Dernier

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
amitlee9823
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
gajnagarg
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
gajnagarg
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
gajnagarg
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 

Dernier (20)

Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 

Alchemist: An Apache Spark <=> MPI Interface with Michael Mahoney and Kai Rothauge

  • 1. Kai Rothauge, Alex Gittens, Michael W. Mahoney An Apache Spark ⇔ MPI Interface #Res1SAIS Thanks to NERSC and Cray Inc. for help and support!
  • 2. 2#Res1SAIS • MPI = Message Passing Interface • A specification for the developers and users of message passing libraries • Message-Passing Parallel Programming Model: • cooperative operations between processes • data moved from address space of one process to that of another • Dominant model in high-performance computing • Requires installation of MPI implementation on system • Popular implementations: MPICH, Open MPI What is MPI?
  • 3. More on MPI 3#Res1SAIS • Generally regarded as “low-level” for purposes of distributed computing • Efficient implementations of collective operations • Works with distributed memory, shared memory, GPUs • Communication between MPI processes: • via TCP/IP sockets, or • optimized for underlying interconnects (InfiniBand, Cray Aries, Intel Omni-Path, etc.) • Communicator objects connect groups of MPI processors • Con: No fault tolerance or elasticity
  • 4. • Numerical linear algebra (NLA) using Spark vs. MPI • Why do linear algebra in Spark? Spark for data-centric workloads and scientific analysis Characterization of linear algebra in Spark Customers demand Spark; want to understand performance concerns 4#Res1SAIS Case Study: Spark vs. MPI
  • 5. 5#Res1SAIS Case Study: Spark vs. MPI • Numerical linear algebra (NLA) using Spark vs. MPI • Why do linear algebra in Spark? Pros: Faster development, easier reuse Simple dataset abstractions (RDDs, DataFrames, DataSets) An entire ecosystem that can be used before and after NLA computations Spark can take advantage of available local linear algebra codes Automatic fault-tolerance, elasticity, out-of-core support Con: Classical MPI-based linear algebra implementations will be faster and more efficient
  • 6. 6#Res1SAIS Case Study: Spark vs. MPI • Numerical linear algebra (NLA) using Spark vs. MPI • Computations performed on NERSC supercomputer Cori Phase 1, a Cray XC40 • 2,388 compute nodes • 128 GB RAM/node, 32 2.3GHz Haswell cores/node • Lustre storage system, Cray Aries interconnect A. Gittens et al. “Matrix factorizations at scale: A comparison of scientific data analytics in Spark and C+MPI using three case studies”, 2016 IEEE International Conference on Big Data (Big Data), pages 204–213, Dec 2016.
  • 7. 7#Res1SAIS A. Gittens et al. “Matrix factorizations at scale: A comparison of scientific data analytics in Spark and C+MPI using three case studies”, 2016 IEEE International Conference on Big Data (Big Data), pages 204–213, Dec 2016. Case Study: Spark vs. MPI • Numerical linear algebra (NLA) using Spark vs. MPI • Matrix factorizations considered include truncated Singular Value Decomposition (SVD) • Data sets include • Ocean temperature data - 2.2 TB • Atmospheric data - 16 TB
  • 8. 8#Res1SAIS Case Study: Spark vs. MPI Rank 20 SVD of 2.2TB ocean temperature data
  • 9. 9#Res1SAIS Rank 20 SVD of 16TB atmospheric data using 48K+ cores Case Study: Spark vs. MPI
  • 10. 10#Res1SAIS Case Study: Spark vs. MPI Lessons learned: • With favorable data (tall and skinny) and well-adapted algorithms, linear algebra in Spark is 2x-26x slower than MPI when I/O is included • Spark’s overheads: • Can be order of magnitude higher than the actual computation times • Anti-scale • The gaps in performance suggest it may be better to interface with MPI-based codes from Spark
  • 11. 11#Res1SAIS • Alchemist interfaces between Apache Spark and existing or custom MPI-based libraries for large-scale linear algebra, machine learning, etc. • Idea: • Use Spark for regular data analysis workflow • When computationally intensive calculations are required, call relevant MPI-based codes from Spark using Alchemist, send results to Spark • Combine high productivity of Spark with high performance of MPI
  • 12. 12#Res1SAIS Target users: • Scientific community: Use Spark for analysis of large scientific datasets by calling existing MPI-based libraries where appropriate • Machine learning practitioners and data analysts: • Better performance of a wide range of large-scale, computationally intensive ML and data analysis algorithms • For instance, SVD for principal component analysis, recommender systems, leverage scores, etc.
  • 13. 13#Res1SAIS • Alchemist: Acts as bridge between Spark and MPI-based libraries • Alchemist-Client Interface: API for user, communicates with Alchemist via TCP/IP sockets • Alchemist-Library Interface: Shared object, imports MPI library, provides generic interface for Alchemist to communicate with library Basic Framework
  • 14. 14#Res1SAIS Basic workflow: • Spark application: • Sends distributed dataset from RDD (IndexedRowMatrix) to Alchemist • Tells Alchemist what MPI-based code should be called • Alchemist loads relevant MPI-based library, calls function, sends results to Spark Basic Framework
  • 15. 15#Res1SAIS • Alchemist can also load data from file • Alchemist needs to store distributed data in appropriate format that can be used by MPI-based libraries: • Candidates: ScaLAPACK, Elemental, PLAPACK • Alchemist currently uses Elemental, support for ScaLAPACK under development • Basic Framework
  • 17. 17#Res1SAIS Sample API import alchemist.{Alchemist, AlMatrix} import alchemist.libA.QRDecomposition // libA is sample MPI lib // other code here ... // sc is instance of SparkContext val ac = new Alchemist.AlchemistContext(sc, numWorkers) ac.registerLibrary("libA", ALIlibALocation) // maybe other code here ... val alA = AlMatrix(A) // A is IndexedRowMatrix // routine returns QR factors of A as AlMatrix objects val (alQ, alR) = QRDecomposition(alA) // send data from Alchemist to Spark once ready val Q = alQ.toIndexedRowMatrix() // convert AlMatrix alQ to RDD val R = alR.toIndexedRowMatrix() // convert AlMatrix alR to RDD // maybe other code here ... ac.stop() // release resources once no longer required
  • 18. 18#Res1SAIS Example: Matrix Multiplication • Requires expensive shuffles in Spark, which is impractical: • Matrices/RDDs are row-partitioned • One matrix must be converted to be column-partitioned • Requires an all-to-all shuffle that often fails once the matrix is distributed
  • 19. 19#Res1SAIS Example: Matrix Multiplication GB/nodes Spark+Alchemist Spark Send (s) Multiplication (s) Receive (s) Total (s) Total (s) 0.8/1 5.90±2.17 6.60±0.07 2.19±0.58 14.68±2.69 160.31±8.89 12/1 16.66±0.88 75.69±0.42 19.43±0.45 111.78±1.26 809.31±51.90 56/2 32.50±2.88 178.68±24.8 55.83±0.37 267.02±27.38 -Failed- 144/4 69.40±1.85 171.73±0.81 66.80±3.46 307.94±4.57 -Failed- • Generated random matrices and used same number of Spark and Alchemist nodes • Take-away: Spark’s matrix multiply is slow even on one executor, and unreliable once there are more
  • 20. 20#Res1SAIS Example: Truncated SVD Use Alchemist and MLlib to get rank 20 truncated SVD • Spark: 22 nodes; Alchemist: 8 nodes • A: m-by-10K, where m = 5M, 2.5M, 1.25M, 625K, 312.5K • Ran jobs for at most 30 minutes (1800 s) Experiment Setup
  • 21. 21#Res1SAIS Example: Truncated SVD • 2.2TB (6,177,583-by-46,752) ocean temperature data read in from HDF5 file Experiment Setup • Data replicated column-wise
  • 22. 22#Res1SAIS Upcoming Features • PySpark, SparkR ⇔ MPI Interface • Interface for Python => PySpark support • Future work: Interface for R • More Functionality • Support for sparse matrices • Support for MPI-based libraries built on ScaLAPACK • Alchemist and Containers • Alchemist running in Docker and Kubernetes • Will enable Alchemist on clusters and the cloud
  • 23. 23#Res1SAIS Limitations and Constraints • Two copies of data in memory, more during a relayout during computation • Data transfer overhead between Spark and Alchemist when data on different nodes • Subject to network disruptions and overload • MPI is not fault tolerant or elastic • Lack of MPI-based libraries for machine learning • No equivalent to MLlib currently available, closest is MaTEx • On Cori, need to run Alchemist and Spark on separate nodes -> more data transfer over interconnects -> larger overheads
  • 24. 24#Res1SAIS Future Work • Apache Spark ⇔ X Interface • Interest in connecting Spark with other libraries for distributed computing (e.g. Cray Chapel, Apache REEF) • Reduce communication costs • Exploit locality • Reduce number of messages • Use communication protocols designed for underlying network infrastructure • Run as network service • MPI computations with (basic) fault tolerance and elasticity
  • 25. 25#Res1SAIS github.com/project-alchemist/ Thanks to Cray Inc., DARPA and NSF for financial support References ● A. Gittens, K. Rothauge, M. W. Mahoney, et al., “Alchemist: Accelerating Large-Scale Data Analysis by offloading to High-Performance Computing Libraries”, 2018, Proceedings of the 24th ACM SIGKDD International Conference, Aug 2018, to appear ● A. Gittens, K. Rothauge, M. W. Mahoney, et al., “Alchemist: An Apache Spark ⇔ MPI Interface”, 2018, to appear in CCPE Special Issue on Cray User Group Conference 2018