© 2014 MapR Technologies
Distributed Deep Learning on Spark
Mathieu Dumoulin - Data Engineer
MapR Professional Services APAC
Tonight’s Presentation FAQ-Style
• Short intro on machine learning
• What’s deep learning?
• Why distributed? Why do we need a computer cluster?
• Why run it on Spark?
• How does it work?
– Case study of SparkNet: Training Deep Networks in Spark
– Case study of CaffeOnSpark
• Can I see a demo?
– Installation Process
– Caffe demo
– CaffeOnSpark demo
Machine Learning is all around us!
• Internet search with Google and Bing
• Contextual ads (Adsense)
• Apple iOS 9&10 (interesting link with details!)
• Google GMail/Inbox (Priority Inbox, Spam filtering)
• Fraud Detection
• Recommendations (Amazon)
• Image recognition (I can see… cats!)
• Language Modeling & Speech Recognition (Siri, Google Now, Google Translate)
Classification of images
Why Deep Learning?
• Because it works really, really well!
• Deep learning is the state of the art in applied machine learning
– Wins in major machine learning competitions
• Kaggle
• ImageNet
• Especially well suited for:
– Images (classification, object detection, etc)
– Sounds (speech, music)
– Text (translation)
• Deep learning is very compute intensive (CPU/GPU)
– More processing for better models
– More processing for faster training
MNIST digits task
• Classify handwritten digits as the correct number (60,000 training images, 10,000 test images)
Taken from Wikipedia (https://en.wikipedia.org/wiki/MNIST_database)
More deep learning results: (http://yann.lecun.com/exdb/mnist/)
Type                            Error rate (%)
K-Nearest Neighbors             0.52 [14]
Support vector machine          0.56 [16]
Deep neural network             0.35 [18]
Convolutional neural network    0.23 [8]
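To make these percentages concrete, the short sketch below converts each error rate into a count of misclassified test images. It assumes the rates were measured on the standard 10,000-image MNIST test set; that size comes from the usual MNIST split and is not stated on the slide.

```python
# Error rates (%) copied from the table above. The test set size is the
# standard MNIST split and is an assumption, not something the slide states.
TEST_SET_SIZE = 10_000

error_rates_percent = {
    "K-Nearest Neighbors": 0.52,
    "Support vector machine": 0.56,
    "Deep neural network": 0.35,
    "Convolutional neural network": 0.23,
}


def misclassified(rate_percent, n_images=TEST_SET_SIZE):
    """Number of test images a classifier with this error rate gets wrong."""
    return round(rate_percent * n_images / 100)


errors = {name: misclassified(rate) for name, rate in error_rates_percent.items()}

for name, count in errors.items():
    print(f"{name}: {count} of {TEST_SET_SIZE} digits misclassified")
```

Under that assumption, the gap between the best and worst entries in the table is only a few dozen images.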
Results are now competitive with humans!
Why Distributed?
“training can be time consuming, often requiring multiple days on a single GPU using [SGD]” - Moritz et al. - SparkNet
• A single physical node typically holds at most 3-4 GPUs
• A cluster can spread the CPU/GPU load at the cost of increased complexity
• Google built such software from scratch in the early 2010s.
How to Distribute: Parameter Server
• Li et al. proposed the “Parameter Server” approach in 2014
– https://www.cs.cmu.edu/~dga/papers/osdi14-paper-li_mu.pdf
From Arimo’s Distributed TensorFlow blog post (link)
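The parameter server idea can be sketched in a few lines of plain Python. This is a toy, single-process illustration and not the system from the Li et al. paper: the `ParameterServer` class, the `pull`/`push` method names, the linear model and the toy data are all assumptions made for the example. Workers pull the current weights, compute a gradient on their own data shard, and push it back to the server, which applies the update.

```python
class ParameterServer:
    """Holds the shared model parameters and applies pushed gradients."""

    def __init__(self, n_params, learning_rate):
        self.weights = [0.0] * n_params
        self.learning_rate = learning_rate

    def pull(self):
        # Workers fetch a copy of the current parameters.
        return list(self.weights)

    def push(self, gradient):
        # Workers send a gradient; the server applies one SGD step.
        for i, g in enumerate(gradient):
            self.weights[i] -= self.learning_rate * g


def worker_gradient(weights, shard):
    """Mean squared-error gradient for y = w0 + w1 * x on one data shard."""
    g0 = g1 = 0.0
    for x, y in shard:
        err = (weights[0] + weights[1] * x) - y
        g0 += 2 * err
        g1 += 2 * err * x
    n = len(shard)
    return [g0 / n, g1 / n]


# Toy data generated from y = 1 + 2x, split across two hypothetical workers.
data = [(k / 10, 1 + 2 * (k / 10)) for k in range(20)]
shards = [data[0::2], data[1::2]]

server = ParameterServer(n_params=2, learning_rate=0.1)
for step in range(2000):
    for shard in shards:  # on a real cluster the workers would run in parallel
        weights = server.pull()
        server.push(worker_gradient(weights, shard))

print(server.weights)  # should approach [1.0, 2.0]
```

In a real deployment the pushes would arrive asynchronously over the network, which is where most of the engineering complexity lies.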
Why Spark?
• Integrates well with existing “big data” batch processing frameworks (Hadoop/MapReduce)
• Allows data to be kept in memory from start to finish
• Work with a single computational framework
• Relatively easy to implement a parameter server
New Frameworks for Spark-based Distributed DL
• CaffeOnSpark (Yahoo America)
• SparkNet (UC Berkeley’s AMPLab)
• DeepLearning4J (Skymind)
• Elephas (Keras extension for Spark)
• Distributed TensorFlow (Arimo)
SparkNet implementation
From: https://arxiv.org/pdf/1511.06051v4.pdf
SparkNet implementation 2
From: https://arxiv.org/pdf/1511.06051v4.pdf
SparkNet implementation 3
From: https://arxiv.org/pdf/1511.06051v4.pdf
We need a Solver: Caffe
• (+) Good for feedforward networks and image processing
• (+) Good for fine-tuning existing networks
• (+) Train models without writing any code
• (+) Python interface is pretty useful
• (-) Need to write C++ / CUDA for new GPU layers
• (-) Not good for recurrent networks
• (-) Cumbersome for big networks (GoogLeNet, ResNet)
• (-) Not extensible, bit of a hairball
• (-) No commercial support
Taken from: http://deeplearning4j.org/compare-dl4j-torch7-pylearn.html#caffe
Distributed SGD and Parameter Server
From: https://arxiv.org/pdf/1511.06051v4.pdf
SparkNet’s implementation of DSGD
From: https://arxiv.org/pdf/1511.06051v4.pdf
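As described in the SparkNet paper, each round broadcasts the current weights to the workers, lets every worker run a fixed number of SGD iterations (τ) on its own data partition, and then averages the resulting weights on the driver. The sketch below mimics that loop in plain Python on a toy linear model. The function names, the model, the data and the hyperparameters are illustrative assumptions; this is not SparkNet's actual code, which runs Caffe inside Spark executors.

```python
def local_sgd(weights, shard, tau, learning_rate):
    """Run tau SGD steps on one worker's shard, starting from the broadcast weights."""
    w0, w1 = weights
    for step in range(tau):
        x, y = shard[step % len(shard)]  # one training example per step
        err = (w0 + w1 * x) - y
        w0 -= learning_rate * 2 * err
        w1 -= learning_rate * 2 * err * x
    return [w0, w1]


def average(models):
    """Element-wise mean of the workers' parameter vectors."""
    return [sum(values) / len(models) for values in zip(*models)]


def train(shards, rounds, tau, learning_rate):
    weights = [0.0, 0.0]  # held by the driver
    for _ in range(rounds):
        # 1. broadcast the weights, 2. train locally on each worker, 3. average
        models = [local_sgd(weights, shard, tau, learning_rate) for shard in shards]
        weights = average(models)
    return weights


# Toy data generated from y = 1 + 2x, partitioned across two hypothetical workers.
data = [(k / 10, 1 + 2 * (k / 10)) for k in range(20)]
shards = [data[0::2], data[1::2]]

weights = train(shards, rounds=1000, tau=10, learning_rate=0.05)
print(weights)  # should approach [1.0, 2.0]
```

A larger τ means fewer synchronization rounds and less network traffic, at the price of workers drifting further apart between averages.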
Benefits of the approach
From: https://arxiv.org/pdf/1511.06051v4.pdf
Scaling performance of SparkNet
From: https://arxiv.org/pdf/1511.06051v4.pdf
CaffeOnSpark
• Mixed Java and Scala implementation
• Developed and used in production at Yahoo America
• Much easier to install than SparkNet, less buggy
• Can take advantage of an InfiniBand network
• Enhanced Caffe to use multiple GPUs
• CaffeOnSpark executors communicate with each other via an MPI allreduce-style interface
• The Spark+MPI architecture achieves performance similar to dedicated deep learning clusters
– Peer-to-peer parameter server
• Faster than SparkNet
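An allreduce leaves every peer holding the combined result (here, the sum of all peers' gradients) without a central server. The sketch below simulates one common way to do this, a ring allreduce, in a single Python process. The ring algorithm is chosen purely for illustration; the slides do not say which allreduce algorithm CaffeOnSpark uses, and the function name and data are hypothetical.

```python
def ring_allreduce(vectors):
    """Return, for every peer, the element-wise sum of all peers' vectors."""
    n = len(vectors)
    size = len(vectors[0])
    assert size % n == 0, "sketch assumes the vector splits evenly into n chunks"
    chunk = size // n
    bufs = [list(v) for v in vectors]  # each peer's private buffer

    def span(c):
        # Index range covered by chunk c.
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1 (scatter-reduce): each peer passes one chunk to its right-hand
    # neighbour, which adds it in. After n - 1 steps, peer i holds the
    # complete sum for chunk (i + 1) % n.
    for step in range(n - 1):
        sends = [((i - step) % n, i) for i in range(n)]
        payloads = [(c, [bufs[i][k] for k in span(c)]) for c, i in sends]
        for i, (c, payload) in enumerate(payloads):
            dst = (i + 1) % n
            for k, value in zip(span(c), payload):
                bufs[dst][k] += value

    # Phase 2 (allgather): the completed chunks travel around the ring,
    # overwriting the partial sums held by the other peers.
    for step in range(n - 1):
        sends = [((i + 1 - step) % n, i) for i in range(n)]
        payloads = [(c, [bufs[i][k] for k in span(c)]) for c, i in sends]
        for i, (c, payload) in enumerate(payloads):
            dst = (i + 1) % n
            for k, value in zip(span(c), payload):
                bufs[dst][k] = value

    return bufs


# Three hypothetical executors, each with its own local gradient.
gradients = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
summed = ring_allreduce(gradients)
averaged = [[value / len(gradients) for value in peer] for peer in summed]
print(summed)
```

Each peer sends and receives only one chunk per step, so no single node becomes the bandwidth bottleneck that a central parameter server can be.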
CaffeOnSpark System Architecture
From: http://yahoohadoop.tumblr.com/post/129872361846/large-scale-distributed-deep-learning-on-hadoop
CaffeOnSpark vs. SparkNet
• Much faster communication between nodes (InfiniBand capability)
• The peer-to-peer parameter exchange model is a much faster implementation
• The enhanced multi-GPU Caffe is also faster
Comparison of Frameworks (Spark Summit 2016)
By Yu Cao (EMC) and Zhe Dong (EMC) (Slideshare)
Benchmark 2
By Yu Cao (EMC) and Zhe Dong (EMC) (Slideshare)
Installing CaffeOnSpark
• I recommend CentOS 7 or Ubuntu 14+
• The process is very “touchy” and easy to mess up
• Go step by step!
Process:
1. Update the OS and kernel, install dev tools (gcc, etc.), then reboot
a. Disable the “nouveau” driver!!!
2. Install the latest NVIDIA drivers, CUDA 7.5, cuDNN 4
3. Install Caffe
a. Install all Caffe dependencies, make sure it compiles and the examples run.
4. Install CaffeOnSpark
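Because the process is easy to get wrong, a small sanity check between steps can help. The sketch below is a hypothetical helper, not part of Caffe or CaffeOnSpark: it looks for a few command-line tools on the PATH and checks whether the “nouveau” kernel module is loaded by reading /proc/modules (Linux only). The tool list is an assumption and should be adapted to your setup.

```python
import shutil
from pathlib import Path

# Tools assumed to be needed after steps 1 and 2; adjust for your own setup.
REQUIRED_TOOLS = ["gcc", "make", "nvidia-smi", "nvcc"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def nouveau_loaded(modules_file="/proc/modules"):
    """True if the nouveau kernel module appears in the loaded-module list."""
    path = Path(modules_file)
    if not path.exists():  # not Linux, or /proc is unavailable
        return False
    lines = [line for line in path.read_text().splitlines() if line.strip()]
    return any(line.split()[0] == "nouveau" for line in lines)


if __name__ == "__main__":
    missing = missing_tools()
    print("Missing tools:", ", ".join(missing) if missing else "none")
    print("nouveau driver loaded:", nouveau_loaded())
```

Running it after each step shows which prerequisite is still missing before moving on to the Caffe build.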
Installing Caffe
Good tutorials are scarce!
• Ubuntu works more “out of the box”: the default paths are all correct
• CentOS 7: a few changes are needed but it’s still OK
The Caffe web site instructions for CentOS are a bit outdated.
Demos
• Running an example on Caffe
– Caffe deep network description files
– MNIST example
• Running an example with CaffeOnSpark
– MNIST example
– running on YARN/Spark Standalone
Q&A time

 
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
YARN webinar series: Using Scalding to write applications to Hadoop and YARNYARN webinar series: Using Scalding to write applications to Hadoop and YARN
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
 
Streaming in the Extreme
Streaming in the ExtremeStreaming in the Extreme
Streaming in the Extreme
 
OpenACC Monthly Highlights September 2020
OpenACC Monthly Highlights September 2020OpenACC Monthly Highlights September 2020
OpenACC Monthly Highlights September 2020
 
Introduction to Spark
Introduction to SparkIntroduction to Spark
Introduction to Spark
 


Distributed Deep Learning on Spark

  • 1. © 2014 MapR Technologies. Distributed Deep Learning on Spark. Mathieu Dumoulin - Data Engineer, MapR Professional Services APAC
  • 2. Tonight’s Presentation FAQ-Style • Short intro on machine learning • What’s deep learning? • Why distributed? Why do we need a computer cluster? • Why run it on Spark? • How does it work? – Case study of SparkNet: Training Deep Networks in Spark – Case study of CaffeOnSpark • Can I see a demo? – Installation process – Caffe demo – CaffeOnSpark demo
  • 3. Machine Learning is all around us! • Internet search with Google and Bing • Contextual ads (AdSense) • Apple iOS 9 & 10 (interesting link with details!) • Google Gmail/Inbox (Priority Inbox, spam filtering) • Fraud detection • Recommendations (Amazon) • Image recognition (I can see… cats!) • Language modeling & speech recognition (Siri, Google Now, Google Translate)
  • 4. Classification of images
  • 5. Why Deep Learning? • Because it works really, really well! • Deep learning is the state of the art in applied machine learning – Wins in every major machine learning competition • Kaggle • ImageNet • Especially well suited for: – Images (classification, object detection, etc.) – Sounds (speech, music) – Text (translation) • Deep learning is very compute intensive – More processing for better models – More processing for faster training
  • 6. MNIST digits task • Classify 60,000 handwritten digits to the correct number. Taken from Wikipedia (https://en.wikipedia.org/wiki/MNIST_database). More deep learning results: http://yann.lecun.com/exdb/mnist/. Error rate (%) by classifier type: K-nearest neighbors 0.52 [14]; support vector machine 0.56 [16]; deep neural network 0.35 [18]; convolutional neural network 0.23 [8]
  • 7. Results are now competitive with humans!
  • 8. Why Distributed? “training can be time consuming, often requiring multiple days on a single GPU using [SGD]” - Moritz et al., SparkNet • A single physical node typically holds at most 3-4 GPUs • A cluster can spread the CPU/GPU load at the cost of increased complexity • Google built such software from scratch in the early 2010s.
  • 9. How to Distribute: Parameter Server • Li et al. proposed the “Parameter Server” approach in 2014 – https://www.cs.cmu.edu/~dga/papers/osdi14-paper-li_mu.pdf Figure from Arimo’s Distributed TensorFlow blog post
  • 10. Why Spark? • Integrates well with existing “big data” batch processing frameworks (Hadoop/MapReduce) • Allows data to be kept in memory from start to finish • Lets you work with a single computational framework • Makes a parameter server relatively easy to implement
  • 11. New frameworks for Spark-based distributed DL • CaffeOnSpark (Yahoo America) • SparkNet (UC Berkeley’s AMPLab) • DeepLearning4J (Skymind) • Elephas (Keras team) • Distributed TensorFlow (Arimo)
  • 12. SparkNet implementation From: https://arxiv.org/pdf/1511.06051v4.pdf
  • 13. SparkNet implementation 2 From: https://arxiv.org/pdf/1511.06051v4.pdf
  • 14. SparkNet implementation 3 From: https://arxiv.org/pdf/1511.06051v4.pdf
  • 15. We need a Solver: Caffe ● (+) Good for feedforward networks and image processing ● (+) Good for fine-tuning existing networks ● (+) Train models without writing any code ● (+) Python interface is pretty useful ● (-) Need to write C++ / CUDA for new GPU layers ● (-) Not good for recurrent networks ● (-) Cumbersome for big networks (GoogLeNet, ResNet) ● (-) Not extensible, bit of a hairball ● (-) No commercial support. Taken from: http://deeplearning4j.org/compare-dl4j-torch7-pylearn.html#caffe
  • 16. Distributed SGD and Parameter Server From: https://arxiv.org/pdf/1511.06051v4.pdf
  • 17. SparkNet’s implementation of DSGD From: https://arxiv.org/pdf/1511.06051v4.pdf
  • 18. Benefits of the approach From: https://arxiv.org/pdf/1511.06051v4.pdf
  • 19. Scaling performance of SparkNet From: https://arxiv.org/pdf/1511.06051v4.pdf
  • 20. CaffeOnSpark • Mixed Java and Scala implementation • Developed and used in production at Yahoo America • Much easier to install than SparkNet, less buggy • Can take advantage of an InfiniBand network • Enhanced Caffe to use multiple GPUs • CaffeOnSpark executors communicate with each other via an MPI allreduce-style interface • Spark+MPI architecture achieves performance similar to dedicated deep learning clusters – Peer-to-peer parameter server • Faster than SparkNet
  • 21. CaffeOnSpark System Architecture From: http://yahoohadoop.tumblr.com/post/129872361846/large-scale-distributed-deep-learning-on-hadoop
  • 22. CaffeOnSpark vs. SparkNet • Much faster communication between nodes (InfiniBand capability) • The peer-to-peer parameter exchange model is a much faster implementation • The enhanced multi-GPU Caffe is also faster
  • 23. Comparison of Frameworks (Spark Summit 2016) By Yu Cao (EMC) and Zhe Dong (EMC) (Slideshare)
  • 24. Benchmark 2 By Yu Cao (EMC) and Zhe Dong (EMC) (Slideshare)
  • 25. Installing CaffeOnSpark • I recommend CentOS 7 or Ubuntu 14+ • The process is very “touchy” and easy to mess up • Go step by step! Process: 1. Update the OS and kernel, install dev tools (gcc, etc.), then reboot a. Disable the “nouveau” driver!!! 2. Install the latest NVIDIA drivers, CUDA 7.5, and cuDNN 4 3. Install Caffe a. Install all Caffe dependencies, and make sure it compiles and the examples run. 4. Install CaffeOnSpark
  • 26. Installing Caffe: good tutorials are scarce! • Ubuntu works more “out of the box”: the default paths are all correct • CentOS 7: a few changes are needed, but it’s still OK. The Caffe web site instructions for CentOS are a bit outdated.
  • 27. Demos • Running an example on Caffe – Caffe deep network description files – MNIST example • Running an example with CaffeOnSpark – MNIST example – Running on YARN/Spark Standalone
  • 28. Q&A time
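
Slides 16 and 17 point to figures in the SparkNet paper, and those figures are not in this transcript. As a rough illustration of the idea, the sketch below follows the scheme the paper describes, as best understood here: each worker runs a fixed number of SGD steps on its own data shard, then the driver averages the resulting parameters and broadcasts them back. Everything in it is an assumption made for illustration: the toy linear model, the function names (`local_sgd`, `average`, `train`, `make_shards`), and the hyperparameters. It uses no Spark or Caffe API. A plain Python loop stands in for the cluster.

```python
import random


def local_sgd(weights, shard, steps, lr, rng):
    """Run `steps` SGD updates on one worker's shard, starting from `weights`."""
    w, b = weights
    for _ in range(steps):
        x, y = rng.choice(shard)      # sample one training example
        err = (w * x + b) - y         # prediction error for squared loss
        w -= lr * err * x             # gradient of 0.5 * err**2 with respect to w
        b -= lr * err                 # gradient of 0.5 * err**2 with respect to b
    return (w, b)


def average(models):
    """Element-wise mean of the workers' parameter tuples."""
    n = len(models)
    return tuple(sum(m[i] for m in models) / n for i in range(len(models[0])))


def train(shards, rounds=200, tau=10, lr=0.05, seed=0):
    """Alternate `tau` local SGD steps per worker with a global averaging step."""
    rng = random.Random(seed)
    weights = (0.0, 0.0)
    for _ in range(rounds):
        # On a cluster this would be a parallel map over data partitions;
        # here the workers are simulated one after another (hypothetical stand-in).
        local_models = [local_sgd(weights, shard, tau, lr, rng) for shard in shards]
        # "Broadcast" is implicit: the averaged weights seed the next round.
        weights = average(local_models)
    return weights


def make_shards(num_workers=4, points_per_worker=50, seed=1):
    """Build toy shards of noiseless points on the line y = 3x + 1."""
    rng = random.Random(seed)
    shards = []
    for _ in range(num_workers):
        shard = []
        for _ in range(points_per_worker):
            x = rng.uniform(-1.0, 1.0)
            shard.append((x, 3.0 * x + 1.0))
        shards.append(shard)
    return shards


if __name__ == "__main__":
    w, b = train(make_shards())
    print(f"learned w={w:.3f}, b={b:.3f} (data generated with w=3, b=1)")
```

The parameter `tau` controls how often workers synchronize. A larger value means fewer communication rounds but more drift between workers before each average. That is the trade-off slide 22 touches on when it compares SparkNet with CaffeOnSpark's peer-to-peer parameter exchange.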