SlideShare une entreprise Scribd logo
1  sur  30
Télécharger pour lire hors ligne
VeryLarge-ScaleDistributed
DeepLearningon BigDL
JasonDai,DingDing
Overview of BigDL
3
BigDL
BringingDeepLearningToBigDataPlatform
• Distributed deep learning framework for Apache
Spark*
• Make deep learning more accessible to big data users
and data scientists
• Write deep learning applications as standard Spark programs
• Run on existing Spark/Hadoop clusters (no changes needed)
• Feature parity with popular deep learning frameworks
• E.g., Caffe, Torch, Tensorflow, etc.
• High performance
• Powered by Intel MKL and multi-threaded programming
• Efficient scale-out
• Leveraging Spark for distributed training & inference
https://github.com/intel-analytics/BigDL
https://bigdl-project.github.io/
Spark Core
SQL SparkR Streaming
MLlib GraphX
ML Pipeline
DataFrame
4
Chasmb/wDeepLearningandBigDataCommunities
Average users (data engineers, data scientists, analysts, etc.)Deep learning experts
The
Chasm
5
BigDLAnsweringTheNeeds
Make deep learning more accessible to big data and data science communities
• Continue the use of familiar SW tools and HW infrastructure to build deep learning
applications
• Analyze “big data” using deep learning on the same Hadoop/Spark cluster where the data
are stored
• Add deep learning functionalities to the Big Data (Spark) programs and/or workflow
• Leverage existing Hadoop/Spark clusters to run deep learning applications
• Shared with other workloads (e.g., ETL, data warehouse, feature engineering, statistic machine
learning, graph analytics, etc.) in a dynamic and elastic fashion
6
FraudDetectionforUnionPay
Train one model
all features selected features
model
candidate model
sampled
partition
Training Data
…
normal
fraud
Train one model
Train one model
sampled
partition
sampled
partition
Post
Processing
Pre-
Processing
model
model
Spark Pipeline
Test Data
Predictions
Test
Spark DataFrameHive Table
Pre-processing
Feature
Engineering
Feature
Engineering
Feature
Selection
Model
Ensemble
SparkPipeline
Neural Network Model Using BigDL
Feature
Selection*
Model
Training
Model
Evaluation
& Fine Tune
https://mp.weixin.qq.com/s?__biz=MzI3NDAwNDUwNg==&mid=2648307335&idx=1&sn=8eb9f63eaf2e40e24a90601b9cc03d1f
DistributedTraining onBigDL
8
DistributedTrainingonBigDL
Training
for (i <- 1 to N) {
batch = next_batch()
output = model.forward(batch.input)
loss = criterion.forward(output, batch.target)
error = criterion.backward(output, batch.target)
model.backward(input, error)
optimMethod.optimize(model.weight, model.gradient)
}
Iterative Data parallel
Synchronous SGD
Mini-batch
9
RunasstandardSparkProgram
Spark
Program
DL App on Driver
Spark
Executor
(JVM)
Spark
Task
BigDL lib
Worker
Intel MKL
Standard
Spark jobs
Worker
Worker Worker
Worker
Spark
Executor
(JVM)
Spark
Task
BigDL lib
Worker
Intel MKL
BigDL
library
Spark
jobs
BigDL Program
Standard Spark jobs
• No changes to the Spark or Hadoop clusters needed
Iterative
• Each iteration of the training runs as a Spark job
Data parallel
• Each Spark task runs the same model on a subset of the data (batch)
10
TechniquesforLarge-ScaleDistributedTraining
• Optimizing parameter synchronization and aggregation
• Optimizing task scheduling
• Scaling batch size
11
ParameterSynchronizationinSparkMLlib
…
Partition 1
Partition 2
Partition n
Training Set
Sample
Sample
Sample
Worker
Worker
Worker
Driver
2
2
2
1
1
1
3
3
3
4
1
2
3
4
Broadcast weight
to each workerEach task computes
gradients
Each task sends gradients
for (tree) aggregation, and
then driver updates the
weight
12
ParameterSynchronizationinBigDL
…
Training Set
PS (Parameter Server) Architecture in BigDL
on top of Spark Block Manager
Partition 1 Partition 2 Partition n
Worker
Gradient
1
2
Weight
3
4
5
Worker
Gradient
1
2
Weight
3
4
5
Worker
Gradient
1
2
Weight
3
4
5
…
… …
… … … …
Peer-2-Peer All-Reduce synchronization
13
ParameterSynchronizationinBigDL
Parameter synchronization time as a fraction of
average compute time for Inception v1 training
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
14
TaskSchedulingOverheads
Spark overheads (task scheduling, task serde, task fetch) as
a fraction of average compute time for Inception v1 training
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
15
Drizzle
*Collaboration with Shivaram Venkataraman from UC Berkeley RICELab
• Low latency execution engine for Apache
Spark
• Fine-grained execution with coarse-grained
scheduling
• Group scheduling
• Schedule a group of iterations at once
• Fault tolerance, scheduling at group boundaries
• Coordinating shuffles: PRE-SCHEDULING
• Pre-schedule tasks on executors
• Trigger tasks once dependencies are met
16
ReducingSchedulingOverheadswithDrizzle
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
17
IncreasedMini-BatchSize
• Distributed synchronous mini-batch SGD
• Increased mini-batch size
total_batch_size = batch_size_per_worker * num_of_workers
• Can lead to loss in test accuracy
• State-of-art method for scaling mini-batch size*
• Linear scaling rule
• Warm-up strategy
• Layer-wise adaptive rate scaling
• Adding batch normalization
* “Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour”
* “Scaling SGD Batch Size to 32K for ImageNet Training”
18
Inceptionv1
Strategies
• Gradual warm-up
• Linear scaling
• Gradient clipping
• TODO: adding batch normalization
19
SSD(SingleShotMulti-boxDetector)
Strategies
• Warm-up w/ Adam
• Linear scaling
• Gradient clipping
Distributed Inference on BigDL
21
DistributedInferenceonBigDL
Inference
for (b <- 1 to D) {
input = next_data(i)
output = model.forward(input)
}
Embarrassingly (data) parallel in nature
Efficiently scale out using BigDL on
Apache Spark
22
ImageDetection&ExtractionPipeline
(usingSSD+DeepBitModels)
23
ChallengesofLarge-ScaleProcessinginGPUSolutions
• Reading images out takes a very long time
• Image pre-processing on HBase is very complex
• No existing software frameworks can be leveraged
• E.g., resource management, distributed data processing, fault tolerance, etc.
• Very challenging to scale out to massive amount of pictures
• Due to SW and HW infrastructure constraints
24
UpgradingtoBigDLSolutions
• Reuse existing Hadoop/Spark clusters for deep learning with no changes
• Efficiently scale out on Spark with superior performance
• Reading HBase data no longer a bottleneck
• Very easy to build the end-to-end pipeline in BigDL
• Image transformation and augmentation based on OpenCV on Spark
val preProcessor = BytesToMat() -> Resize(300, 300) -> …
val transformed = preProcessor(dataRdd)
• Directly Load pre-trained models (BigDL/Caffe/Torch/TensorFLow) into BigDL
val model = Module.loadCaffeModel(caffeDefPath, caffeModelPath)
25
ModelQuantizationforEfficientInferenceinBigDL
• Local quantization scheme converting floats to intergers
• Faster compute and smaller models
• Take advantage of SSE and AVX instructions on Xeon servers
• Supports pre-trained models (BigDL/Caffe/Torch/TensorFLow)
val model = Module.loadCaffeModel(caffeDefPath, caffeModelPath)
val quantizedModel = model.quantize()
• Quantized SSD model
• ~4x model size reduction
• >2x inference speedup
• ~0.001 mAP (mean average precision) loss
For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
26
TryBigDLOut
Running BigDL, Deep Learning
for Apache Spark, on AWS*
(Amazon* Web Service)
https://aws.amazon.com/blogs/ai/running-
bigdl-deep-learning-for-apache-spark-on-aws/
Use BigDL on Microsoft*
Azure* HDInsight*
https://azure.microsoft.com/en-
us/blog/use-bigdl-on-hdinsight-spark-for-
distributed-deep-learning/
Intel’s BigDL on Databricks*
https://databricks.com/blog/2017/02/09/in
tels-bigdl-databricks.html
BigDL on Alibaba* Cloud
E-MapReduce*
https://yq.aliyun.com/articles/73347
BigDL on CDH* and Cloudera*
Data Science Workbench*
http://blog.cloudera.com/blog/2017/04/bigdl-
on-cdh-and-cloudera-data-science-workbench/
BigDL Distribution in Cray
Urika-XC Analytics Suite
http://www.cray.com/products/analytics/uri
ka-xc
27
PartnerWithUs
https://github.com/intel-analytics/BigDL http://software.intel.com/ai
Stay connected
28
@intelnervana
@intelAI
#intelAI
intelnervana.com
30
LegalNotices&disclaimers
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel
representative to obtain the latest forecast, schedule, specifications and roadmaps.
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the
OEM or retailer. No computer system can be absolutely secure.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult
other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit
http://www.intel.com/performance.
Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and
provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and
uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata
are available on request.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced
data are accurate.
Intel, the Intel logo, Pentium, Celeron, Atom, Core, Xeon and others are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
© 2016 Intel Corporation.
Intel Confidential

Contenu connexe

Tendances

Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...Databricks
 
The Future of Computing is Distributed
The Future of Computing is DistributedThe Future of Computing is Distributed
The Future of Computing is DistributedAlluxio, Inc.
 
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...Databricks
 
Transparent Hardware Acceleration for Deep Learning
Transparent Hardware Acceleration for Deep LearningTransparent Hardware Acceleration for Deep Learning
Transparent Hardware Acceleration for Deep LearningIndrajit Poddar
 
CI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel KobranCI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel KobranDatabricks
 
ASGARD Splunk Conf 2016
ASGARD Splunk Conf 2016ASGARD Splunk Conf 2016
ASGARD Splunk Conf 2016Keith Kraus
 
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...Databricks
 
Accelerate Your AI Today
Accelerate Your AI TodayAccelerate Your AI Today
Accelerate Your AI TodayDESMOND YUEN
 
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018VMware Tanzu
 
Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum - Gree...
Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum - Gree...Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum - Gree...
Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum - Gree...VMware Tanzu
 
Performance Analysis of Apache Spark and Presto in Cloud Environments
Performance Analysis of Apache Spark and Presto in Cloud EnvironmentsPerformance Analysis of Apache Spark and Presto in Cloud Environments
Performance Analysis of Apache Spark and Presto in Cloud EnvironmentsDatabricks
 
Harnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data PayloadsHarnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data PayloadsSimeon Fitch
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUsiguazio
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Databricks
 
Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platforminside-BigData.com
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Mathieu Dumoulin
 
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFGPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFKeith Kraus
 
Distributed Deep Learning on Spark
Distributed Deep Learning on SparkDistributed Deep Learning on Spark
Distributed Deep Learning on SparkMathieu Dumoulin
 
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...VMware Tanzu
 

Tendances (20)

Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
Deep Learning to Big Data Analytics on Apache Spark Using BigDL with Xianyan ...
 
The Future of Computing is Distributed
The Future of Computing is DistributedThe Future of Computing is Distributed
The Future of Computing is Distributed
 
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark wi...
 
Transparent Hardware Acceleration for Deep Learning
Transparent Hardware Acceleration for Deep LearningTransparent Hardware Acceleration for Deep Learning
Transparent Hardware Acceleration for Deep Learning
 
CI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel KobranCI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel Kobran
 
ASGARD Splunk Conf 2016
ASGARD Splunk Conf 2016ASGARD Splunk Conf 2016
ASGARD Splunk Conf 2016
 
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
Accelerating Inference in the Data Center with Malini Bhandaru and Karol Zale...
 
Accelerate Your AI Today
Accelerate Your AI TodayAccelerate Your AI Today
Accelerate Your AI Today
 
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
Pivotal Greenplum in Action on AWS, Azure, and GCP - Greenplum Summit 2018
 
Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum - Gree...
Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum - Gree...Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum - Gree...
Machine Learning, Graph, Text and Geospatial on Postgres and Greenplum - Gree...
 
Performance Analysis of Apache Spark and Presto in Cloud Environments
Performance Analysis of Apache Spark and Presto in Cloud EnvironmentsPerformance Analysis of Apache Spark and Presto in Cloud Environments
Performance Analysis of Apache Spark and Presto in Cloud Environments
 
Harnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data PayloadsHarnessing Spark Catalyst for Custom Data Payloads
Harnessing Spark Catalyst for Custom Data Payloads
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUs
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
 
Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platform
 
Rapids: Data Science on GPUs
Rapids: Data Science on GPUsRapids: Data Science on GPUs
Rapids: Data Science on GPUs
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
 
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDFGPU-Accelerating UDFs in PySpark with Numba and PyGDF
GPU-Accelerating UDFs in PySpark with Numba and PyGDF
 
Distributed Deep Learning on Spark
Distributed Deep Learning on SparkDistributed Deep Learning on Spark
Distributed Deep Learning on Spark
 
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
Pivotal Greenplum: Postgres-Based. Multi-Cloud. Built for Analytics & AI - Gr...
 

Similaire à Very large scale distributed deep learning on BigDL

Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAlluxio, Inc.
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentationtestSri1
 
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up ServerHow Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up ServerAhsan Javed Awan
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsHybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsAli Hodroj
 
BigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkBigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkDESMOND YUEN
 
Graph Data Science at Scale
Graph Data Science at ScaleGraph Data Science at Scale
Graph Data Science at ScaleNeo4j
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Anant Corporation
 
What's New in Upcoming Apache Spark 2.3
What's New in Upcoming Apache Spark 2.3What's New in Upcoming Apache Spark 2.3
What's New in Upcoming Apache Spark 2.3Databricks
 
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSciStreamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSciIntel® Software
 
sudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJAsudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJANicolas Poggi
 
GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017Joshua Patterson
 
Enabling a hardware accelerated deep learning data science experience for Apa...
Enabling a hardware accelerated deep learning data science experience for Apa...Enabling a hardware accelerated deep learning data science experience for Apa...
Enabling a hardware accelerated deep learning data science experience for Apa...Indrajit Poddar
 
Performance Characterization and Optimization of In-Memory Data Analytics on ...
Performance Characterization and Optimization of In-Memory Data Analytics on ...Performance Characterization and Optimization of In-Memory Data Analytics on ...
Performance Characterization and Optimization of In-Memory Data Analytics on ...Ahsan Javed Awan
 
Performance advantages of Hadoop ETL offload with the Intel processor-powered...
Performance advantages of Hadoop ETL offload with the Intel processor-powered...Performance advantages of Hadoop ETL offload with the Intel processor-powered...
Performance advantages of Hadoop ETL offload with the Intel processor-powered...Principled Technologies
 
Making Hadoop Realtime by Dr. William Bain of Scaleout Software
Making Hadoop Realtime by Dr. William Bain of Scaleout SoftwareMaking Hadoop Realtime by Dr. William Bain of Scaleout Software
Making Hadoop Realtime by Dr. William Bain of Scaleout SoftwareData Con LA
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsConnected Data World
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...Databricks
 
Spark Driven Big Data Analytics
Spark Driven Big Data AnalyticsSpark Driven Big Data Analytics
Spark Driven Big Data Analyticsinoshg
 
Trends towards the merge of HPC + Big Data systems
Trends towards the merge of HPC + Big Data systemsTrends towards the merge of HPC + Big Data systems
Trends towards the merge of HPC + Big Data systemsIgor José F. Freitas
 
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & AlluxioAlluxio, Inc.
 

Similaire à Very large scale distributed deep learning on BigDL (20)

Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and AlluxioAdvancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
Advancing GPU Analytics with RAPIDS Accelerator for Spark and Alluxio
 
NVIDIA Rapids presentation
NVIDIA Rapids presentationNVIDIA Rapids presentation
NVIDIA Rapids presentation
 
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up ServerHow Data Volume Affects Spark Based Data Analytics on a Scale-up Server
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsHybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGs
 
BigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkBigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for Spark
 
Graph Data Science at Scale
Graph Data Science at ScaleGraph Data Science at Scale
Graph Data Science at Scale
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
 
What's New in Upcoming Apache Spark 2.3
What's New in Upcoming Apache Spark 2.3What's New in Upcoming Apache Spark 2.3
What's New in Upcoming Apache Spark 2.3
 
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSciStreamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSci
 
sudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJAsudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJA
 
GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017GOAI: GPU-Accelerated Data Science DataSciCon 2017
GOAI: GPU-Accelerated Data Science DataSciCon 2017
 
Enabling a hardware accelerated deep learning data science experience for Apa...
Enabling a hardware accelerated deep learning data science experience for Apa...Enabling a hardware accelerated deep learning data science experience for Apa...
Enabling a hardware accelerated deep learning data science experience for Apa...
 
Performance Characterization and Optimization of In-Memory Data Analytics on ...
Performance Characterization and Optimization of In-Memory Data Analytics on ...Performance Characterization and Optimization of In-Memory Data Analytics on ...
Performance Characterization and Optimization of In-Memory Data Analytics on ...
 
Performance advantages of Hadoop ETL offload with the Intel processor-powered...
Performance advantages of Hadoop ETL offload with the Intel processor-powered...Performance advantages of Hadoop ETL offload with the Intel processor-powered...
Performance advantages of Hadoop ETL offload with the Intel processor-powered...
 
Making Hadoop Realtime by Dr. William Bain of Scaleout Software
Making Hadoop Realtime by Dr. William Bain of Scaleout SoftwareMaking Hadoop Realtime by Dr. William Bain of Scaleout Software
Making Hadoop Realtime by Dr. William Bain of Scaleout Software
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 Best Practice of Compression/Decompression Codes in Apache Spark with Sophia... Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
Best Practice of Compression/Decompression Codes in Apache Spark with Sophia...
 
Spark Driven Big Data Analytics
Spark Driven Big Data AnalyticsSpark Driven Big Data Analytics
Spark Driven Big Data Analytics
 
Trends towards the merge of HPC + Big Data systems
Trends towards the merge of HPC + Big Data systemsTrends towards the merge of HPC + Big Data systems
Trends towards the merge of HPC + Big Data systems
 
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
 

Plus de DESMOND YUEN

2022-AI-Index-Report_Master.pdf
2022-AI-Index-Report_Master.pdf2022-AI-Index-Report_Master.pdf
2022-AI-Index-Report_Master.pdfDESMOND YUEN
 
Small Is the New Big
Small Is the New BigSmall Is the New Big
Small Is the New BigDESMOND YUEN
 
Intel® Blockscale™ ASIC Product Brief
Intel® Blockscale™ ASIC Product BriefIntel® Blockscale™ ASIC Product Brief
Intel® Blockscale™ ASIC Product BriefDESMOND YUEN
 
Cryptography Processing with 3rd Gen Intel Xeon Scalable Processors
Cryptography Processing with 3rd Gen Intel Xeon Scalable ProcessorsCryptography Processing with 3rd Gen Intel Xeon Scalable Processors
Cryptography Processing with 3rd Gen Intel Xeon Scalable ProcessorsDESMOND YUEN
 
Intel 2021 Product Security Report
Intel 2021 Product Security ReportIntel 2021 Product Security Report
Intel 2021 Product Security ReportDESMOND YUEN
 
How can regulation keep up as transformation races ahead? 2022 Global regulat...
How can regulation keep up as transformation races ahead? 2022 Global regulat...How can regulation keep up as transformation races ahead? 2022 Global regulat...
How can regulation keep up as transformation races ahead? 2022 Global regulat...DESMOND YUEN
 
NASA Spinoffs Help Fight Coronavirus, Clean Pollution, Grow Food, More
NASA Spinoffs Help Fight Coronavirus, Clean Pollution, Grow Food, MoreNASA Spinoffs Help Fight Coronavirus, Clean Pollution, Grow Food, More
NASA Spinoffs Help Fight Coronavirus, Clean Pollution, Grow Food, MoreDESMOND YUEN
 
A Survey on Security and Privacy Issues in Edge Computing-Assisted Internet o...
A Survey on Security and Privacy Issues in Edge Computing-Assisted Internet o...A Survey on Security and Privacy Issues in Edge Computing-Assisted Internet o...
A Survey on Security and Privacy Issues in Edge Computing-Assisted Internet o...DESMOND YUEN
 
PUTTING PEOPLE FIRST: ITS IS SMART COMMUNITIES AND CITIES
PUTTING PEOPLE FIRST:  ITS IS SMART COMMUNITIES AND  CITIESPUTTING PEOPLE FIRST:  ITS IS SMART COMMUNITIES AND  CITIES
PUTTING PEOPLE FIRST: ITS IS SMART COMMUNITIES AND CITIESDESMOND YUEN
 
BUILDING AN OPEN RAN ECOSYSTEM FOR EUROPE
BUILDING AN OPEN RAN ECOSYSTEM FOR EUROPEBUILDING AN OPEN RAN ECOSYSTEM FOR EUROPE
BUILDING AN OPEN RAN ECOSYSTEM FOR EUROPEDESMOND YUEN
 
An Introduction to Semiconductors and Intel
An Introduction to Semiconductors and IntelAn Introduction to Semiconductors and Intel
An Introduction to Semiconductors and IntelDESMOND YUEN
 
Changing demographics and economic growth bloom
Changing demographics and economic growth bloomChanging demographics and economic growth bloom
Changing demographics and economic growth bloomDESMOND YUEN
 
Intel’s Impacts on the US Economy
Intel’s Impacts on the US EconomyIntel’s Impacts on the US Economy
Intel’s Impacts on the US EconomyDESMOND YUEN
 
2021 private networks infographics
2021 private networks infographics2021 private networks infographics
2021 private networks infographicsDESMOND YUEN
 
Transforming the Modern City with the Intel-based 5G Smart City Road Side Uni...
Transforming the Modern City with the Intel-based 5G Smart City Road Side Uni...Transforming the Modern City with the Intel-based 5G Smart City Road Side Uni...
Transforming the Modern City with the Intel-based 5G Smart City Road Side Uni...DESMOND YUEN
 
Increasing Throughput per Node for Content Delivery Networks
Increasing Throughput per Node for Content Delivery NetworksIncreasing Throughput per Node for Content Delivery Networks
Increasing Throughput per Node for Content Delivery NetworksDESMOND YUEN
 
3rd Generation Intel® Xeon® Scalable Processor - Achieving 1 Tbps IPsec with ...
3rd Generation Intel® Xeon® Scalable Processor - Achieving 1 Tbps IPsec with ...3rd Generation Intel® Xeon® Scalable Processor - Achieving 1 Tbps IPsec with ...
3rd Generation Intel® Xeon® Scalable Processor - Achieving 1 Tbps IPsec with ...DESMOND YUEN
 
"Life and Learning After One-Hundred Years: Trust Is The Coin Of The Realm."
"Life and Learning After One-Hundred Years: Trust Is The Coin Of The Realm.""Life and Learning After One-Hundred Years: Trust Is The Coin Of The Realm."
"Life and Learning After One-Hundred Years: Trust Is The Coin Of The Realm."DESMOND YUEN
 
Telefónica views on the design, architecture, and technology of 4G/5G Open RA...
Telefónica views on the design, architecture, and technology of 4G/5G Open RA...Telefónica views on the design, architecture, and technology of 4G/5G Open RA...
Telefónica views on the design, architecture, and technology of 4G/5G Open RA...DESMOND YUEN
 
Machine programming
Machine programmingMachine programming
Machine programmingDESMOND YUEN
 

Plus de DESMOND YUEN (20)

2022-AI-Index-Report_Master.pdf
2022-AI-Index-Report_Master.pdf2022-AI-Index-Report_Master.pdf
2022-AI-Index-Report_Master.pdf
 
Small Is the New Big
Small Is the New BigSmall Is the New Big
Small Is the New Big
 
Intel® Blockscale™ ASIC Product Brief
Intel® Blockscale™ ASIC Product BriefIntel® Blockscale™ ASIC Product Brief
Intel® Blockscale™ ASIC Product Brief
 
Cryptography Processing with 3rd Gen Intel Xeon Scalable Processors
Cryptography Processing with 3rd Gen Intel Xeon Scalable ProcessorsCryptography Processing with 3rd Gen Intel Xeon Scalable Processors
Cryptography Processing with 3rd Gen Intel Xeon Scalable Processors
 
Intel 2021 Product Security Report
Intel 2021 Product Security ReportIntel 2021 Product Security Report
Intel 2021 Product Security Report
 
How can regulation keep up as transformation races ahead? 2022 Global regulat...
How can regulation keep up as transformation races ahead? 2022 Global regulat...How can regulation keep up as transformation races ahead? 2022 Global regulat...
How can regulation keep up as transformation races ahead? 2022 Global regulat...
 
NASA Spinoffs Help Fight Coronavirus, Clean Pollution, Grow Food, More
NASA Spinoffs Help Fight Coronavirus, Clean Pollution, Grow Food, MoreNASA Spinoffs Help Fight Coronavirus, Clean Pollution, Grow Food, More
NASA Spinoffs Help Fight Coronavirus, Clean Pollution, Grow Food, More
 
A Survey on Security and Privacy Issues in Edge Computing-Assisted Internet o...
A Survey on Security and Privacy Issues in Edge Computing-Assisted Internet o...A Survey on Security and Privacy Issues in Edge Computing-Assisted Internet o...
A Survey on Security and Privacy Issues in Edge Computing-Assisted Internet o...
 
PUTTING PEOPLE FIRST: ITS IS SMART COMMUNITIES AND CITIES
PUTTING PEOPLE FIRST:  ITS IS SMART COMMUNITIES AND  CITIESPUTTING PEOPLE FIRST:  ITS IS SMART COMMUNITIES AND  CITIES
PUTTING PEOPLE FIRST: ITS IS SMART COMMUNITIES AND CITIES
 
BUILDING AN OPEN RAN ECOSYSTEM FOR EUROPE
BUILDING AN OPEN RAN ECOSYSTEM FOR EUROPEBUILDING AN OPEN RAN ECOSYSTEM FOR EUROPE
BUILDING AN OPEN RAN ECOSYSTEM FOR EUROPE
 
An Introduction to Semiconductors and Intel
An Introduction to Semiconductors and IntelAn Introduction to Semiconductors and Intel
An Introduction to Semiconductors and Intel
 
Changing demographics and economic growth bloom
Changing demographics and economic growth bloomChanging demographics and economic growth bloom
Changing demographics and economic growth bloom
 
Intel’s Impacts on the US Economy
Intel’s Impacts on the US EconomyIntel’s Impacts on the US Economy
Intel’s Impacts on the US Economy
 
2021 private networks infographics
2021 private networks infographics2021 private networks infographics
2021 private networks infographics
 
Transforming the Modern City with the Intel-based 5G Smart City Road Side Uni...
Transforming the Modern City with the Intel-based 5G Smart City Road Side Uni...Transforming the Modern City with the Intel-based 5G Smart City Road Side Uni...
Transforming the Modern City with the Intel-based 5G Smart City Road Side Uni...
 
Increasing Throughput per Node for Content Delivery Networks
Increasing Throughput per Node for Content Delivery NetworksIncreasing Throughput per Node for Content Delivery Networks
Increasing Throughput per Node for Content Delivery Networks
 
3rd Generation Intel® Xeon® Scalable Processor - Achieving 1 Tbps IPsec with ...
3rd Generation Intel® Xeon® Scalable Processor - Achieving 1 Tbps IPsec with ...3rd Generation Intel® Xeon® Scalable Processor - Achieving 1 Tbps IPsec with ...
3rd Generation Intel® Xeon® Scalable Processor - Achieving 1 Tbps IPsec with ...
 
"Life and Learning After One-Hundred Years: Trust Is The Coin Of The Realm."
"Life and Learning After One-Hundred Years: Trust Is The Coin Of The Realm.""Life and Learning After One-Hundred Years: Trust Is The Coin Of The Realm."
"Life and Learning After One-Hundred Years: Trust Is The Coin Of The Realm."
 
Telefónica views on the design, architecture, and technology of 4G/5G Open RA...
Telefónica views on the design, architecture, and technology of 4G/5G Open RA...Telefónica views on the design, architecture, and technology of 4G/5G Open RA...
Telefónica views on the design, architecture, and technology of 4G/5G Open RA...
 
Machine programming
Machine programmingMachine programming
Machine programming
 

Dernier

Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 

Dernier (20)

Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 

Very large scale distributed deep learning on BigDL

  • 3. 3 BigDL BringingDeepLearningToBigDataPlatform • Distributed deep learning framework for Apache Spark* • Make deep learning more accessible to big data users and data scientists • Write deep learning applications as standard Spark programs • Run on existing Spark/Hadoop clusters (no changes needed) • Feature parity with popular deep learning frameworks • E.g., Caffe, Torch, Tensorflow, etc. • High performance • Powered by Intel MKL and multi-threaded programming • Efficient scale-out • Leveraging Spark for distributed training & inference https://github.com/intel-analytics/BigDL https://bigdl-project.github.io/ Spark Core SQL SparkR Streaming MLlib GraphX ML Pipeline DataFrame
  • 4. 4 Chasmb/wDeepLearningandBigDataCommunities Average users (data engineers, data scientists, analysts, etc.)Deep learning experts The Chasm
  • 5. 5 BigDLAnsweringTheNeeds Make deep learning more accessible to big data and data science communities • Continue the use of familiar SW tools and HW infrastructure to build deep learning applications • Analyze “big data” using deep learning on the same Hadoop/Spark cluster where the data are stored • Add deep learning functionalities to the Big Data (Spark) programs and/or workflow • Leverage existing Hadoop/Spark clusters to run deep learning applications • Shared with other workloads (e.g., ETL, data warehouse, feature engineering, statistic machine learning, graph analytics, etc.) in a dynamic and elastic fashion
  • 6. 6 FraudDetectionforUnionPay Train one model all features selected features model candidate model sampled partition Training Data … normal fraud Train one model Train one model sampled partition sampled partition Post Processing Pre- Processing model model Spark Pipeline Test Data Predictions Test Spark DataFrameHive Table Pre-processing Feature Engineering Feature Engineering Feature Selection Model Ensemble SparkPipeline Neural Network Model Using BigDL Feature Selection* Model Training Model Evaluation & Fine Tune https://mp.weixin.qq.com/s?__biz=MzI3NDAwNDUwNg==&mid=2648307335&idx=1&sn=8eb9f63eaf2e40e24a90601b9cc03d1f
  • 8. 8 DistributedTrainingonBigDL Training for (i <- 1 to N) { batch = next_batch() output = model.forward(batch.input) loss = criterion.forward(output, batch.target) error = criterion.backward(output, batch.target) model.backward(input, error) optimMethod.optimize(model.weight, model.gradient) } Iterative Data parallel Synchronous SGD Mini-batch
  • 9. 9 RunasstandardSparkProgram Spark Program DL App on Driver Spark Executor (JVM) Spark Task BigDL lib Worker Intel MKL Standard Spark jobs Worker Worker Worker Worker Spark Executor (JVM) Spark Task BigDL lib Worker Intel MKL BigDL library Spark jobs BigDL Program Standard Spark jobs • No changes to the Spark or Hadoop clusters needed Iterative • Each iteration of the training runs as a Spark job Data parallel • Each Spark task runs the same model on a subset of the data (batch)
  • 10. 10 TechniquesforLarge-ScaleDistributedTraining • Optimizing parameter synchronization and aggregation • Optimizing task scheduling • Scaling batch size
  • 11. 11 ParameterSynchronizationinSparkMLlib … Partition 1 Partition 2 Partition n Training Set Sample Sample Sample Worker Worker Worker Driver 2 2 2 1 1 1 3 3 3 4 1 2 3 4 Broadcast weight to each workerEach task computes gradients Each task sends gradients for (tree) aggregation, and then driver updates the weight
  • 12. 12 ParameterSynchronizationinBigDL … Training Set PS (Parameter Server) Architecture in BigDL on top of Spark Block Manager Partition 1 Partition 2 Partition n Worker Gradient 1 2 Weight 3 4 5 Worker Gradient 1 2 Weight 3 4 5 Worker Gradient 1 2 Weight 3 4 5 … … … … … … … Peer-2-Peer All-Reduce synchronization
  • 13. 13 ParameterSynchronizationinBigDL Parameter synchronization time as a fraction of average compute time for Inception v1 training For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
  • 14. 14 TaskSchedulingOverheads Spark overheads (task scheduling, task serde, task fetch) as a fraction of average compute time for Inception v1 training For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
  • 15. 15 Drizzle *Collaboration with Shivaram Venkataraman from UC Berkeley RICELab • Low latency execution engine for Apache Spark • Fine-grained execution with coarse-grained scheduling • Group scheduling • Schedule a group of iterations at once • Fault tolerance, scheduling at group boundaries • Coordinating shuffles: PRE-SCHEDULING • Pre-schedule tasks on executors • Trigger tasks once dependencies are met
  • 16. 16 ReducingSchedulingOverheadswithDrizzle For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
  • 17. 17 IncreasedMini-BatchSize • Distributed synchronous mini-batch SGD • Increased mini-batch size total_batch_size = batch_size_per_worker * num_of_workers • Can lead to loss in test accuracy • State-of-art method for scaling mini-batch size* • Linear scaling rule • Warm-up strategy • Layer-wise adaptive rate scaling • Adding batch normalization * “Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour” * “Scaling SGD Batch Size to 32K for ImageNet Training”
  • 18. 18 Inceptionv1 Strategies • Gradual warm-up • Linear scaling • Gradient clipping • TODO: adding batch normalization
  • 19. 19 SSD(SingleShotMulti-boxDetector) Strategies • Warm-up w/ Adam • Linear scaling • Gradient clipping
  • 21. 21 DistributedInferenceonBigDL Inference for (b <- 1 to D) { input = next_data(i) output = model.forward(input) } Embarrassingly (data) parallel in nature Efficiently scale out using BigDL on Apache Spark
  • 23. 23 ChallengesofLarge-ScaleProcessinginGPUSolutions • Reading images out takes a very long time • Image pre-processing on HBase is very complex • No existing software frameworks can be leveraged • E.g., resource management, distributed data processing, fault tolerance, etc. • Very challenging to scale out to massive amount of pictures • Due to SW and HW infrastructure constraints
  • 24. 24 UpgradingtoBigDLSolutions • Reuse existing Hadoop/Spark clusters for deep learning with no changes • Efficiently scale out on Spark with superior performance • Reading HBase data no longer a bottleneck • Very easy to build the end-to-end pipeline in BigDL • Image transformation and augmentation based on OpenCV on Spark val preProcessor = BytesToMat() -> Resize(300, 300) -> … val transformed = preProcessor(dataRdd) • Directly Load pre-trained models (BigDL/Caffe/Torch/TensorFLow) into BigDL val model = Module.loadCaffeModel(caffeDefPath, caffeModelPath)
  • 25. 25 ModelQuantizationforEfficientInferenceinBigDL • Local quantization scheme converting floats to intergers • Faster compute and smaller models • Take advantage of SSE and AVX instructions on Xeon servers • Supports pre-trained models (BigDL/Caffe/Torch/TensorFLow) val model = Module.loadCaffeModel(caffeDefPath, caffeModelPath) val quantizedModel = model.quantize() • Quantized SSD model • ~4x model size reduction • >2x inference speedup • ~0.001 mAP (mean average precision) loss For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
  • 26. 26 TryBigDLOut Running BigDL, Deep Learning for Apache Spark, on AWS* (Amazon* Web Service) https://aws.amazon.com/blogs/ai/running- bigdl-deep-learning-for-apache-spark-on-aws/ Use BigDL on Microsoft* Azure* HDInsight* https://azure.microsoft.com/en- us/blog/use-bigdl-on-hdinsight-spark-for- distributed-deep-learning/ Intel’s BigDL on Databricks* https://databricks.com/blog/2017/02/09/in tels-bigdl-databricks.html BigDL on Alibaba* Cloud E-MapReduce* https://yq.aliyun.com/articles/73347 BigDL on CDH* and Cloudera* Data Science Workbench* http://blog.cloudera.com/blog/2017/04/bigdl- on-cdh-and-cloudera-data-science-workbench/ BigDL Distribution in Cray Urika-XC Analytics Suite http://www.cray.com/products/analytics/uri ka-xc
  • 29.
  • 30. 30 LegalNotices&disclaimers This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance. Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate. Intel, the Intel logo, Pentium, Celeron, Atom, Core, Xeon and others are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. © 2016 Intel Corporation. Intel Confidential