Superpowered to Low Power
Deploying supercomputer-trained deep learning models for inference on Intel® FPGAs
Lucas A. Wilson, PhD
HPC & AI Engineering, Dell EMC
@lucasawilson
Dell EMC / Intel / SURFsara Collaboration
Lucas A. Wilson, Vineet Gundecha, and Alex Filby
Valeriu Codreanu and Damian Podareanu
Vikram Saletore and Shawn Slockers
Two Phases of Machine Learning
Training: Learn a suitable function approximation to correctly respond to the overwhelming majority of test cases
Inference / Deployment: Use learned function approximation to respond to new cases
Training Deep Neural Networks to Identify Thoracic Pathologies
Superpowered Neural Network Training
Time to Solution (seconds):
DenseNet121, P=1, BZ=8: 386,845
DenseNet121, P=64, BZ=64, GBZ=4096: 8,319
VGG16, P=128, GBZ=8192: 16,532
ResNet50, P=512, GBZ=4096: 1,742
ResNet50, P=512, GBZ=8192: 1,362
ResNet50, P=800, GBZ=8000: 825
ResNet50, P=1024, GBZ=8192: 675
4.5 DAYS to reach a solution with DenseNet121 using 2 Intel® Xeon® Scalable Gold 6148 processors
11.25 MINUTES to reach a solution with ResNet50 using 512 Intel® Xeon® Scalable Gold 6148 processors!
573x FASTER total time to solution going from 1 to 256 Dell EMC PowerEdge C6420 nodes
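The headline figures on this slide can be cross-checked against the plotted time-to-solution values. The sketch below is only arithmetic on the slide's own numbers, assuming the largest plotted value (386,845 s) is the single-node DenseNet121 run and the smallest (675 s) is the fastest ResNet50 run; the variable names are illustrative.

```python
# Time-to-solution values read from the chart, in seconds.
# Assumption: these are the slowest and fastest configurations plotted.
BASELINE_SECONDS = 386_845   # DenseNet121, P=1, BZ=8
FASTEST_SECONDS = 675        # ResNet50, P=1024, GBZ=8192

SECONDS_PER_DAY = 86_400

baseline_days = BASELINE_SECONDS / SECONDS_PER_DAY
fastest_minutes = FASTEST_SECONDS / 60
speedup = BASELINE_SECONDS / FASTEST_SECONDS

print(f"baseline: {baseline_days:.2f} days")      # baseline: 4.48 days
print(f"fastest:  {fastest_minutes:.2f} minutes") # fastest:  11.25 minutes
print(f"speedup:  {speedup:.0f}x")                # speedup:  573x
```

The 4.48-day result is what the slide rounds to "4.5 DAYS".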
The Price of Performance
256x Dell EMC PowerEdge C6420 Nodes
~450W* Intel® Xeon® Gold 6148+6148F
*Measured median usage per node
675 sec. Runtime to Train Model
21.6 kWh Energy to Train Radiologist Model
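The 21.6 figure follows directly from the three numbers on this slide: node count, per-node power, and runtime. A minimal worked calculation, treating the approximate 450 W median as constant over the whole run (an assumption):

```python
NODES = 256
WATTS_PER_NODE = 450       # approximate measured median usage per node
RUNTIME_SECONDS = 675      # runtime to train the model

JOULES_PER_KWH = 3_600_000  # 1 kWh = 1000 W * 3600 s

energy_joules = NODES * WATTS_PER_NODE * RUNTIME_SECONDS
energy_kwh = energy_joules / JOULES_PER_KWH

print(f"{energy_kwh:.1f} kWh")  # 21.6 kWh
```

Note that the result is an amount of energy (kWh), not power (W).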
21.6 kWh of Electricity is Equivalent to…
Emitting 20 lbs of CO2 by burning anthracite coal (https://www.eia.gov/tools/faqs/faq.php?id=73&t=11)
Keeping 360 60W Light Bulbs on for 1 hour
Running a Hair Dryer for 18 hours
Running a Student Cluster Challenge system for 7 hrs!*
*At maximum allowed power
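The light-bulb comparison can be checked exactly, and the other two comparisons imply a power draw that the slide does not state. The sketch below derives those implied wattages; they are back-calculated here, not figures from the presentation.

```python
ENERGY_KWH = 21.6

def kwh(watts: float, hours: float) -> float:
    """Energy in kWh used by a load of `watts` running for `hours`."""
    return watts * hours / 1000

# 360 bulbs at 60 W each for one hour.
bulbs_kwh = kwh(360 * 60, 1)

# Implied (not stated on the slide) average power for the other comparisons.
hair_dryer_watts = ENERGY_KWH / 18 * 1000
cluster_watts = ENERGY_KWH / 7 * 1000

print(f"light bulbs: {bulbs_kwh:.1f} kWh")
print(f"hair dryer:  {hair_dryer_watts:.0f} W implied")
print(f"cluster:     {cluster_watts:.0f} W implied")
```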
Deploying Neural Network Models
Why Deploy Models Elsewhere?
Remote Deployment
Edge Deployment
Power/Thermal Constrained Deployment
Deploying the trained model
Dell EMC PowerEdge C6420
Intel® Xeon® Scalable Gold 6148+6148F
Dell EMC PowerEdge R740
2x Intel® Xeon® Scalable Gold 6136 (300W)
4x Intel® PACs
Dell EMC PowerEdge R640
2x Intel® Xeon® Scalable Gold 5118 (210W)
2x Intel® PACs
Preparing the Model for Inference*
*according to the instructions
1. Remove Horovod nodes from TensorFlow graph (tensorflow/optimize_for_inference.py)
2. Apply checkpoint weights and generate binary file (tensorflow/freeze_graph.py)
3. Provide binary file to Intel® distribution of OpenVINO™ toolkit
Preparing the Model for Inference*
*the way that actually worked
1. Save Checkpoint Files While Training
2. Load weights into equivalent Keras model and generate TF checkpoint and protobuf (make sure to set training mode to False (0))
3. Apply checkpoint weights and generate binary file (tensorflow/freeze_graph.py)
4. Provide binary file to Intel® distribution of OpenVINO™ toolkit
Quantization
FP16 ← FP32
FP11 ← FP32
Reduce precision to improve throughput and memory utilization
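To make the precision trade-off concrete, the sketch below rounds a value to IEEE 754 half precision (FP16) using only the Python standard library. The slides do not describe the bit layout of FP11, so `keep_mantissa_bits` is a hypothetical helper that only illustrates what dropping mantissa bits does to a value; it is not Intel's conversion.

```python
import struct

def to_fp16_and_back(x: float) -> float:
    """Round x to IEEE 754 half precision and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def keep_mantissa_bits(x: float, bits: int) -> float:
    """Illustration only: truncate the FP16 encoding of x to `bits` mantissa bits."""
    (raw,) = struct.unpack("<H", struct.pack("<e", x))
    drop = 10 - bits                      # FP16 stores 10 explicit mantissa bits
    raw &= ~((1 << drop) - 1) & 0xFFFF    # zero the low-order mantissa bits
    return struct.unpack("<e", struct.pack("<H", raw))[0]

x = 0.1
print(struct.calcsize("<f"), "bytes per FP32 value")
print(struct.calcsize("<e"), "bytes per FP16 value")
print("FP16 value:        ", to_fp16_and_back(x))
print("5 mantissa bits:   ", keep_mantissa_bits(x, 5))   # hypothetical width
```

Halving the bytes per value is where the memory saving comes from; the printed values show the rounding error that is accepted in exchange.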
Flashing the Bitstream to Intel® PAC with Arria® 10
aocl program <target> <bitstream_dir>/2-0-1_RC_FP11_ResNet50-101.aocx
Target device: <target> (4 targets available in test system: acl[0-3])
Precision: FP11
Topology: ResNet50-101
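With four cards in the test system, the same command is repeated once per target. The sketch below only builds and prints the command lines; it does not invoke `aocl`. The directory `/opt/bitstreams` is a hypothetical stand-in for `<bitstream_dir>`, and the filename parsing assumes the last two underscore-separated fields are precision and topology, as the slide's callouts suggest.

```python
from pathlib import PurePosixPath
from typing import Dict, List

BITSTREAM_DIR = "/opt/bitstreams"                  # hypothetical location
BITSTREAM = "2-0-1_RC_FP11_ResNet50-101.aocx"      # filename from the slide

def describe_bitstream(filename: str) -> Dict[str, str]:
    """Split the bitstream name into the fields the slide calls out (assumed layout)."""
    fields = PurePosixPath(filename).stem.split("_")
    return {"precision": fields[-2], "topology": fields[-1]}

def aocl_program_command(target: str, bitstream_path: str) -> List[str]:
    """Build the argument list for `aocl program <target> <bitstream>`."""
    return ["aocl", "program", target, bitstream_path]

targets = [f"acl{i}" for i in range(4)]            # acl0 .. acl3
commands = [
    aocl_program_command(t, f"{BITSTREAM_DIR}/{BITSTREAM}") for t in targets
]

print(describe_bitstream(BITSTREAM))
for command in commands:
    print(" ".join(command))
```

Each argument list could be passed to `subprocess.run` on a machine where the Intel FPGA tools are installed.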
Executing the Model on FPGA/CPU using Intel® distribution of OpenVINO™ toolkit
Image Preprocessing → Convolution/Activation/BatchNorm → Classification
Image data: 150,528B (224x224x3)
Pre-classification model output: 114,744B (28,686 x 32b)
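Both byte counts on this slide can be reproduced from the dimensions given. The sketch assumes one byte per channel value for the input image, which the slide implies but does not state; the 32-bit output width is from the slide.

```python
# Input image: 224 x 224 pixels, 3 channels.
# Assumption: each channel value occupies 1 byte.
image_bytes = 224 * 224 * 3 * 1

# Pre-classification output: 28,686 values of 32 bits each.
output_bytes = 28_686 * 32 // 8

print(f"image data:   {image_bytes:,} B")    # image data:   150,528 B
print(f"model output: {output_bytes:,} B")   # model output: 114,744 B
```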
Performance
Power Consumption – Intel® PAC with Arria® 10
Power Consumption (W), by series: Unprogrammed 20, Programmed 12, Streaming Data 11
Power and Performance – Dell EMC PowerEdge R740
Idle Power (W) / Used Power (W) / Images/second:
Intel® PAC w/ Arria® 10, batch 1, FP11: 183 / 41 / 87
Intel® PAC w/ Arria® 10, batch 96, FP11: 183 / 43 / 105
4x Intel® PAC w/ Arria® 10, batch 384, FP11: 183 / 170 / 420
Intel® Xeon® Gold 6136 (24 cores), batch 96, FP32: 183 / 411 / 325
Intel® Xeon® Gold 6136 (23 cores) + 4x Intel® PAC w/ Arria® 10, batch 480, FP32/11: 183 / 583 / 745
2.3x Throughput by using 4 Intel® FPGAs with 2 Intel® Xeon® Gold 6136
Inference performance tests using ResNet-50 topology. FP11 precision. Image size is 224x224x3. Power measured via racadm.
Intel® distribution of OpenVINO™ toolkit version R3
Power and Performance – Dell EMC PowerEdge R740
Images per second per watt:
Intel® PAC w/ Arria® 10, batch 1, FP11: 0.39
Intel® PAC w/ Arria® 10, batch 96, FP11: 0.46
Intel® Xeon® Gold 6136 (24 cores), batch 96, FP32: 0.55
Intel® Xeon® Gold 6136 (23 cores) + 4x Intel® PAC w/ Arria® 10, batch 480, FP32/11: 0.97
4x Intel® PAC w/ Arria® 10, batch 384, FP11: 1.19
116% Efficiency Boost using 4x Intel® FPGAs vs 2x Intel® Xeon® Gold 6136
76% Efficiency Boost using 4x Intel® FPGAs with 2x Intel® Xeon® Gold 6136
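The images-per-second-per-watt values for the R740 can be reproduced from the power and throughput figures on the preceding slide. The sketch assumes that "Used Power" is drawn on top of the 183 W "Idle Power" (so total draw is their sum) and that the plotted values map to the configurations in the order listed; the short configuration names are abbreviations introduced here.

```python
IDLE_WATTS = 183  # R740 idle power, identical for every configuration

# name: (used power in W above idle, images/second)
R740_RESULTS = {
    "1x PAC, batch 1, FP11": (41, 87),
    "1x PAC, batch 96, FP11": (43, 105),
    "4x PAC, batch 384, FP11": (170, 420),
    "Xeon Gold 6136 (24 cores), batch 96, FP32": (411, 325),
    "Xeon (23 cores) + 4x PAC, batch 480, FP32/11": (583, 745),
}

def images_per_second_per_watt(idle_w: float, used_w: float, ips: float) -> float:
    """Throughput divided by total system power (idle plus used)."""
    return ips / (idle_w + used_w)

efficiency = {
    name: round(images_per_second_per_watt(IDLE_WATTS, used_w, ips), 2)
    for name, (used_w, ips) in R740_RESULTS.items()
}

for name, value in efficiency.items():
    print(f"{value:.2f}  {name}")
```

The computed values (0.39, 0.46, 1.19, 0.55, 0.97) match the efficiency chart, which supports the stacked-power reading.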
Power and Performance – Dell EMC PowerEdge R640
Idle Power (W) / Used Power (W) / Images/second:
Intel® PAC w/ Arria® 10, batch 1, FP11: 128 / 41 / 75
Intel® PAC w/ Arria® 10, batch 96, FP11: 128 / 43 / 105
2x Intel® PAC w/ Arria® 10, batch 192, FP11: 128 / 86 / 210
Intel® Xeon® Gold 5118 (24 cores), batch 96, FP32: 128 / 116 / 135
Intel® Xeon® Gold 5118 (23 cores) + 2x Intel® PAC w/ Arria® 10, batch 288, FP32/11: 128 / 202 / 345
2.6x Throughput by using 2 Intel® FPGAs with 2 Intel® Xeon® Gold 5118
Inference performance tests using ResNet-50 topology. FP11 precision. Image size is 224x224x3. Power measured via racadm.
Intel® distribution of OpenVINO™ toolkit version R3
Power and Performance – Dell EMC PowerEdge R640
Images per second per watt:
Intel® PAC w/ Arria® 10, batch 1, FP11: 0.44
Intel® Xeon® Gold 5118 (24 cores), batch 96, FP32: 0.55
Intel® PAC w/ Arria® 10, batch 96, FP11: 0.61
2x Intel® PAC w/ Arria® 10, batch 192, FP11: 0.98
Intel® Xeon® Gold 5118 (23 cores) + 2x Intel® PAC w/ Arria® 10, batch 288, FP32/11: 1.05
91% Efficiency Boost adding 2x Intel® FPGAs to 2x Intel® Xeon® Gold 5118
78% Efficiency Boost using 2x Intel® FPGAs vs 2x Intel® Xeon® Gold 5118
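The R640 callouts are ratios of the charted values. The sketch below recomputes them, assuming the percentages were derived from the rounded images-per-second-per-watt values on the chart and the throughput ratio from the 345 and 135 images/second figures on the preceding slide; `percent_boost` is an illustrative helper name.

```python
def percent_boost(new: float, baseline: float) -> int:
    """Relative improvement of `new` over `baseline`, as a whole percentage."""
    return round((new / baseline - 1) * 100)

# Images per second per watt, as printed on the R640 efficiency chart.
XEON_ONLY = 0.55
TWO_PAC_ONLY = 0.98
XEON_PLUS_TWO_PAC = 1.05

# Images per second from the R640 power and performance chart.
throughput_ratio = 345 / 135   # Xeon + 2x PAC versus Xeon only

print(f"{percent_boost(XEON_PLUS_TWO_PAC, XEON_ONLY)}% adding 2x FPGAs to the Xeons")
print(f"{percent_boost(TWO_PAC_ONLY, XEON_ONLY)}% using 2x FPGAs vs the Xeons")
print(f"{throughput_ratio:.1f}x throughput")
```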
 
Intel® Xeon® Processor E5-2600 v4 Telco Cloud Digital Applications Showcase
Intel® Xeon® Processor E5-2600 v4 Telco Cloud Digital Applications ShowcaseIntel® Xeon® Processor E5-2600 v4 Telco Cloud Digital Applications Showcase
Intel® Xeon® Processor E5-2600 v4 Telco Cloud Digital Applications Showcase
 
Intel® Xeon® Processor E5-2600 v4 Tech Computing Applications Showcase
Intel® Xeon® Processor E5-2600 v4 Tech Computing Applications ShowcaseIntel® Xeon® Processor E5-2600 v4 Tech Computing Applications Showcase
Intel® Xeon® Processor E5-2600 v4 Tech Computing Applications Showcase
 
Intel® Xeon® Processor E5-2600 v4 Big Data Analytics Applications Showcase
Intel® Xeon® Processor E5-2600 v4 Big Data Analytics Applications ShowcaseIntel® Xeon® Processor E5-2600 v4 Big Data Analytics Applications Showcase
Intel® Xeon® Processor E5-2600 v4 Big Data Analytics Applications Showcase
 
Intel® Xeon® Processor E5-2600 v4 Product Family EAMG
Intel® Xeon® Processor E5-2600 v4 Product Family EAMGIntel® Xeon® Processor E5-2600 v4 Product Family EAMG
Intel® Xeon® Processor E5-2600 v4 Product Family EAMG
 
Gobblin for Data Analytics
Gobblin for Data AnalyticsGobblin for Data Analytics
Gobblin for Data Analytics
 

Dernier

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 

Dernier (20)

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 

FPGA Inference - DellEMC SURFsara

  • 1. Superpowered to Low Power: Deploying supercomputer-trained deep learning models for inference on Intel® FPGAs. Lucas A. Wilson, PhD, HPC & AI Engineering, Dell EMC (@lucasawilson)
  • 2. Dell EMC / Intel / SURFSara Collaboration: Lucas A. Wilson, Vineet Gundecha, and Alex Filby; Valeriu Codreanu and Damian Podareanu; Vikram Saletore and Shawn Slockers
  • 3. Two Phases of Machine Learning. Training: learn a suitable function approximation to correctly respond to the overwhelming majority of test cases. Inference / Deployment: use the learned function approximation to respond to new cases.
  • 4. Training Deep Neural Networks to Identify Thoracic Pathologies
  • 5. Superpowered Neural Network Training. Time to solution (seconds):
      DenseNet121, P=1, BZ=8 | 386845
      DenseNet121, P=64, BZ=64, GBZ=4096 | 8319
      VGG16, P=128, GBZ=8192 | 16532
      ResNet50, P=512, GBZ=4096 | 1742
      ResNet50, P=512, GBZ=8192 | 1362
      ResNet50, P=800, GBZ=8000 | 825
      ResNet50, P=1024, GBZ=8192 | 675
      4.5 DAYS to reach a solution with DenseNet121 using 2 Intel® Xeon® Scalable Gold 6148 processors. 11.25 MINUTES to reach a solution with ResNet50 using 512 Intel® Xeon® Scalable Gold 6148 processors! 573x FASTER total time to solution going from 1 to 256 Dell EMC PowerEdge C6420 nodes.
  • 6. The Price of Performance. 256x Dell EMC PowerEdge C6420 nodes (Intel® Xeon® Gold 6148+6148F); ~450 W per node (measured median usage per node); 675 sec. runtime to train the model; 21.6 kWh of energy to train the radiologist model.
  • 7. 21.6 kWh of Electricity is Equivalent to… Emitting 20 lbs of CO2 by burning anthracite coal (https://www.eia.gov/tools/faqs/faq.php?id=73&t=11); keeping 360 60 W light bulbs on for 1 hour; running a hair dryer for 18 hours; running a Student Cluster Challenge system for 7 hrs (at maximum allowed power).
  • 9. Why Deploy Models Elsewhere? Remote Deployment; Edge Deployment; Power/Thermal Constrained Deployment
  • 10. Deploying the trained model. Dell EMC PowerEdge C6420: Intel® Xeon® Scalable Gold 6148+6148F. Dell EMC PowerEdge R740: 2x Intel® Xeon® Scalable Gold 6136 (300W), 4x Intel® PACs. Dell EMC PowerEdge R640: 2x Intel® Xeon® Scalable Gold 5118 (210W), 2x Intel® PACs.
  • 11. Preparing the Model for Inference* (*according to the instructions): (1) Remove Horovod nodes from the TensorFlow graph (tensorflow/optimize_for_inference.py). (2) Apply checkpoint weights and generate a binary file (tensorflow/freeze_graph.py). (3) Provide the binary file to the Intel® distribution of OpenVINO™ toolkit.
  • 12. Preparing the Model for Inference* (*the way that actually worked): (1) Save checkpoint files while training. (2) Load the weights into an equivalent Keras model and generate a TF checkpoint and protobuf; make sure to set training mode to False (0). (3) Apply checkpoint weights and generate a binary file (tensorflow/freeze_graph.py). (4) Provide the binary file to the Intel® distribution of OpenVINO™ toolkit.
  • 13. Quantization: FP16 ← FP32; FP11 ← FP32. Reduce precision to improve throughput and memory utilization.
  • 14. Flashing the Bitstream to Intel® PAC with Arria® 10: aocl program <target> <bitstream_dir>/2-0-1_RC_FP11_ResNet50-101.aocx. 4 targets available in test system: acl[0-3]. Callout labels on the command: Precision, Topology, Target device.
  • 15. Executing the Model on FPGA/CPU using Intel® distribution of OpenVINO™ toolkit. Stages: Image Preprocessing; Convolution/Activation/BatchNorm; Classification. Image data: 150,528 B (224x224x3). Pre-classification model output: 114,744 B (28,686 x 32b).
  • 17. Power Consumption – Intel® PAC with Arria® 10. Power consumption (W), chart values in legend order: Unprogrammed 20; Programmed 12; Streaming Data 11.
  • 18. Power and Performance – Dell EMC PowerEdge R740
      Configuration | Idle power (W) | Used power (W) | Images/second
      Intel® PAC w/ Arria® 10, batch 1, FP11 | 183 | 41 | 87
      Intel® PAC w/ Arria® 10, batch 96, FP11 | 183 | 43 | 105
      4x Intel® PAC w/ Arria® 10, batch 384, FP11 | 183 | 170 | 420
      Intel® Xeon® Gold 6136 (24 cores), batch 96, FP32 | 183 | 411 | 325
      Intel® Xeon® Gold 6136 (23 cores) + 4x Intel® PAC w/ Arria® 10, batch 480, FP32/11 | 183 | 583 | 745
      2.3x throughput by using 4 Intel® FPGAs with 2 Intel® Xeon® Gold 6136. Inference performance tests using ResNet-50 topology. FP11 precision. Image size is 224x224x3. Power measured via racadm. Intel® distribution of OpenVINO™ toolkit version R3.
  • 19. Power and Performance – Dell EMC PowerEdge R740. Images per second per watt:
      Intel® PAC w/ Arria® 10, batch 1, FP11 | 0.39
      Intel® PAC w/ Arria® 10, batch 96, FP11 | 0.46
      Intel® Xeon® Gold 6136 (24 cores), batch 96, FP32 | 0.55
      Intel® Xeon® Gold 6136 (23 cores) + 4x Intel® PAC w/ Arria® 10, batch 480, FP32/11 | 0.97
      4x Intel® PAC w/ Arria® 10, batch 384, FP11 | 1.19
      116% efficiency boost using 4x Intel® FPGAs vs 2x Intel® Xeon® Gold 6136. 76% efficiency boost using 4x Intel® FPGAs with 2x Intel® Xeon® Gold 6136. Inference performance tests using ResNet-50 topology. FP11 precision. Image size is 224x224x3. Power measured via racadm. Intel® distribution of OpenVINO™ toolkit version R3.
  • 20. Power and Performance – Dell EMC PowerEdge R640
      Configuration | Idle power (W) | Used power (W) | Images/second
      Intel® PAC w/ Arria® 10, batch 1, FP11 | 128 | 41 | 75
      Intel® PAC w/ Arria® 10, batch 96, FP11 | 128 | 43 | 105
      2x Intel® PAC w/ Arria® 10, batch 192, FP11 | 128 | 86 | 210
      Intel® Xeon® Gold 5118 (24 cores), batch 96, FP32 | 128 | 116 | 135
      Intel® Xeon® Gold 5118 (23 cores) + 2x Intel® PAC w/ Arria® 10, batch 288, FP32/11 | 128 | 202 | 345
      2.6x throughput by using 2 Intel® FPGAs with 2 Intel® Xeon® Gold 5118. Inference performance tests using ResNet-50 topology. FP11 precision. Image size is 224x224x3. Power measured via racadm. Intel® distribution of OpenVINO™ toolkit version R3.
  • 21. Power and Performance – Dell EMC PowerEdge R640. Images per second per watt:
      Intel® PAC w/ Arria® 10, batch 1, FP11 | 0.44
      Intel® Xeon® Gold 5118 (24 cores), batch 96, FP32 | 0.55
      Intel® PAC w/ Arria® 10, batch 96, FP11 | 0.61
      2x Intel® PAC w/ Arria® 10, batch 192, FP11 | 0.98
      Intel® Xeon® Gold 5118 (23 cores) + 2x Intel® PAC w/ Arria® 10, batch 288, FP32/11 | 1.05
      91% efficiency boost adding 2x Intel® FPGAs to 2x Intel® Xeon® Gold 5118. 78% efficiency boost using 2x Intel® FPGAs vs 2x Intel® Xeon® Gold 5118. Inference performance tests using ResNet-50 topology. FP11 precision. Image size is 224x224x3. Power measured via racadm. Intel® distribution of OpenVINO™ toolkit version R3.
  • 22. 22
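
The training-energy figure on slide 6 and the images-per-second-per-watt figures on slide 19 can be reproduced from the transcribed numbers with simple arithmetic. The following is a minimal Python sketch, assuming the R740 chart values are read as idle power, used power, and images per second for each configuration, and that efficiency is throughput divided by total (idle plus used) power. The function names and dictionary keys are illustrative and do not come from the presentation.

```python
# Training energy from slide 6: nodes x median watts per node x runtime.
def training_energy_kwh(nodes, watts_per_node, seconds):
    joules = nodes * watts_per_node * seconds
    return joules / 3.6e6  # 1 kWh = 3.6 MJ


# R740 values as transcribed from slide 18: (idle W, used W, images/second).
# The keys are shortened labels chosen for this sketch.
R740 = {
    "1x PAC, batch 1, FP11": (183, 41, 87),
    "1x PAC, batch 96, FP11": (183, 43, 105),
    "4x PAC, batch 384, FP11": (183, 170, 420),
    "Xeon Gold 6136, batch 96, FP32": (183, 411, 325),
    "Xeon Gold 6136 + 4x PAC, batch 480": (183, 583, 745),
}


def images_per_second_per_watt(idle_w, used_w, images_per_s):
    # Assumption: efficiency = throughput / total system power (idle + used).
    return images_per_s / (idle_w + used_w)


if __name__ == "__main__":
    print(f"Training energy: {training_energy_kwh(256, 450, 675):.1f} kWh")
    for name, row in R740.items():
        print(f"{name}: {images_per_second_per_watt(*row):.2f} images/s/W")
```

Under these assumptions the sketch gives 21.6 kWh for training and 0.39, 0.46, 1.19, 0.55, and 0.97 images per second per watt, which agree with the values transcribed from slides 6 and 19.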

Editor's Notes

  1. In addition to raw throughput, time-to-solution was improved dramatically. Because only the sequential DenseNet model converged to an acceptable accuracy, comparison is made to that runtime. ResNet50 runs were able to exceed the generalized accuracy of the sequential DenseNet model while being highly scalable.
  2. There are many reasons why an organization would want to deploy AI models at a location other than the data center where the model was trained. It may make more sense to deploy the model to a data center closer to where the model will be used. The model may be intended for deployment
  3. During data streaming, the PAC uses an additional 9 to 11 W (41 to 43 W total).
  4. The 4-card projection is an aggregation.
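
Note 4 does not spell out how the multi-card figures were aggregated. One plausible reading, sketched below in Python, is linear scaling of the single-card, batch 96 measurements (43 W used, 105 images per second) by the number of cards. This reading is an assumption rather than something the presentation states, and `project_cards` is a hypothetical helper name.

```python
def project_cards(n_cards, used_w_per_card, images_per_s_per_card, idle_w):
    """Scale single-card measurements linearly to n_cards.

    Assumed meaning of "aggregation"; ignores any host-side bottleneck.
    """
    used_w = n_cards * used_w_per_card
    images_per_s = n_cards * images_per_s_per_card
    efficiency = images_per_s / (idle_w + used_w)  # images/second per watt
    return used_w, images_per_s, efficiency


if __name__ == "__main__":
    # Single-card, batch 96 figures from the slides: 43 W used, 105 images/second.
    for label, cards, idle_w in (("R640", 2, 128), ("R740", 4, 183)):
        used_w, ips, eff = project_cards(cards, 43, 105, idle_w)
        print(f"{label}: {cards} cards -> {used_w} W used, "
              f"{ips} images/s, {eff:.2f} images/s/W")
```

For the R640 this reproduces the 86 W, 210 images per second, and 0.98 images per second per watt listed for two cards. For the R740 it gives 172 W where the slide lists 170 W, a difference that falls within the 41 to 43 W per-card range in note 3.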