Boris Neiman – Sr. System Engineer
October 2017
The network part in accelerating Machine-Learning and Big-Data
2- Mellanox Confidential -© 2017 Mellanox Technologies 2
Mellanox Overview
Company
Headquarters
Yokneam, Israel
Sunnyvale, California
Worldwide Offices
~2,900 employees worldwide
Ticker: MLNX
Exponential Data Growth Everywhere
Adapters, Switches, Cables & Transceivers, System on a Chip, SmartNIC
Higher Data Speeds, Faster Data Processing, Better Data Security
Enabling the Future of Machine Learning Applications
HPC and Machine Learning Share the Same Interconnect Needs
Storage, High Performance Computing, Financial, Embedded Appliances, Database, Hyperscale, Machine Learning, IoT, Healthcare, Manufacturing, Retail, Self-Driving Vehicles
Highest Performance 100 and 200Gb/s Interconnect Solutions
• Transceivers, Active Optical and Copper Cables (10 / 25 / 40 / 50 / 56 / 100 / 200Gb/s)
• Switch: 40 HDR (200Gb/s) ports, 80 HDR100 (100Gb/s) ports, 16Tb/s throughput, 15.6 billion msg/sec
• Adapters: 200Gb/s, 0.6us latency, 200 million messages per second (10 / 25 / 40 / 50 / 56 / 100 / 200Gb/s)
• Switch: 32 100GbE ports, 64 25/50GbE ports (10 / 25 / 40 / 50 / 100GbE), throughput of 6.4Tb/s
Today’s Datacenters Need the Most Intelligent Interconnect
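The switch throughput figures quoted on this slide are consistent with summing every port in both directions. The sketch below reproduces that arithmetic. It assumes (the slide does not say so) that the quoted throughput is full-duplex aggregate bandwidth, and the function name is illustrative only.

```python
def aggregate_throughput_tbps(ports: int, gbps_per_port: int, directions: int = 2) -> float:
    """Aggregate switch throughput in Tb/s.

    Assumption: the quoted figure sums every port in both directions
    (full duplex), hence directions=2 by default.
    """
    return ports * gbps_per_port * directions / 1000


# 40 HDR ports at 200Gb/s each
hdr_switch = aggregate_throughput_tbps(40, 200)
# 80 HDR100 ports at 100Gb/s each give the same aggregate
hdr100_switch = aggregate_throughput_tbps(80, 100)
# 32 ports of 100GbE
ethernet_switch = aggregate_throughput_tbps(32, 100)

print(hdr_switch, hdr100_switch, ethernet_switch)  # 16.0 16.0 6.4
```

Under that assumption the 16Tb/s and 6.4Tb/s figures follow directly from the port counts and speeds.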
Leading Supplier of End-to-End Interconnect Solutions
Storage Front / Back-End, Server / Compute, Switch / Gateway
Virtual Protocol Interconnect: 56/100/200G InfiniBand, 10/25/40/50/100/200GbE
X86, OpenPOWER, GPU, ARM, FPGA
Smart Interconnect to Unleash The Power of All Compute Architectures
What is Deep Learning?
• Also known as Deep Neural Network (DNN)
  • Subset of Artificial Neural Network (ANN)
Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks
Source: http://machinelearningmastery.com/what-is-deep-learning/
Training & Inference: Two Phases of Deep Learning
Input data types: Images, Video, Text, Speech, Tabular, Time Series
[Diagram labels: PBs of training dataset, Big Data storage, untrained model, billions of TFLOPS; new data, trained model, billions of FLOPS]
TRAINING
• Large Scale-Out Cluster for Faster Training
• Faster access to Big Data storage
INFERENCE
• Highly transactional / supports many users
• Instant Response on ‘Trained’ Network
Deep Learning Use Cases Across Industries
Finance, Fraud & Insurance
• Fraud Detection
• Credit/Risk Analysis
• High Frequency Trading
Automotive & Transportation
• Self Driving
• Image / Facial Recognition
• Logistics & Mapping
Medicine & Genomics
• Drug Discovery
• Diagnostic Assistance
• Cancer Cell Detection
Oil & Gas
• Seismic Imaging
• Reservoir Characterization
• Subsurface Fault Detection
Cloud, Web, Mobile, Retail
• Image Tagging
• Speech Recognition
• Sentiment Analysis
Security & Safety
• Surveillance
• Image Analysis
• Facial Recognition and Detection
An Intelligent Network Unlocks the Power of Data with AI
60% Higher Return on Investment
Up to 50% Savings on Capital and Operation Expenses
World’s Highest Performance, Scalability and Productivity for AI
Mellanox Networking Unlocks the Power of AI with RDMA
Chainer
Cognitive Toolkit
What Is RDMA?
• Remote Direct Memory Access (RDMA)
• Advanced transport protocol (same layer as TCP and UDP)
• Main features
  • Remote memory read/write semantics in addition to send/receive
  • Kernel bypass / direct user space access
  • Full hardware offload
  • Secure, channel-based I/O
• Application advantages
  • Low latency
  • High bandwidth
  • Low CPU consumption
• RoCE: RDMA over Converged Ethernet
  • Available for all Ethernet speeds, 10 to 100G
• Verbs: RDMA SW interface (equivalent to sockets)
[Diagram: RDMA vs. TCP/UDP protocol stacks (application, transport, network, link, and physical layers), marking user code, kernel code, and hardware, and the buffers along each path]
Mellanox is Leading Artificial Intelligence (AI)
Health Care, Business Integrity, Business Intelligence
Knowledge Discovery, Security, Customer Support and more
Advancing Technology to Affect Science, Business, and Society
By Enabling Critical and Timely Decision Making
More Data Better Models Faster Interconnect
GPUs
CPUs
FPGAs
Storage
More Data → Faster Interconnect → Better Insight → Competitive Advantage
Enabling Most Efficient Machine Learning Platforms (Examples)
Highest Performance, Scalability and Productivity for Deep Learning
World’s First PCIe Gen 4 Public Cloud Server for Cognitive Computing
Enabling Analytics in Cloud
Sets TeraSort 2016 Benchmark Record: 5x Faster, 3x More Energy Efficient than 2015 Record
http://sortbenchmark.org/TencentSort2016.pdf
Smart Network for Azure
Cloud Server
Designed for Big Data Analytics & AI
Mellanox Accelerates Machine Learning and Big Data
Mellanox Accelerates Machine Learning and Big Data
Big Sur & Big Basin
Facebook Open Source AI Hardware Platform
Only ONE Network of Choice - Mellanox
Powering Self Driving Car
2X Faster Training with PaddlePaddle
“…We rely on fast interconnect technologies and RDMA.”
Andrew Ng, Chief Scientist, Baidu
https://code.facebook.com/posts/1687861518126048/facebook-to-open-source-ai-hardware-design
https://github.com/caffe2/caffe2/tree/master/caffe2/contrib/fbcollective/vendor/fbcollective
Real Time Fraud Detection
14 Million Transactions per Day
4 Billion Database Inserts
Image Recognition
~90% Prediction Accuracy
RDMA in TensorFlow and Caffe
Caffe
Caffe2
Exponential Data Growth – The Need for Intelligent and Faster Interconnect
CPU-Centric (Onload) Data-Centric (Offload)
Faster Data Speeds and In-Network Computing Enable Higher Performance and Scale
Must Wait for the Data
Creates Performance Bottlenecks
Analyze Data as it Moves!
Data Centric Architecture to Overcome Latency Bottlenecks
• CPU-Centric (Onload), Network: HPC / Machine Learning communications latencies of 30-40us
• Data-Centric (Offload), In-Network Computing: HPC / Machine Learning communications latencies of 3-4us
Intelligent Interconnect Paves the Road to Exascale Performance
Mellanox Technology Accelerations for Machine Learning
• RDMA
• GPUDirect
• NVMe over Fabrics
• SHARP
• Security
In-Network Computing Key for Highest Return on Investment
In-Network Computing Enables Deep Learning Frameworks
CUDA
Mellanox Interconnect Solutions
Mellanox Accelerations for Machine Learning and Big Data
SHARP
Middleware (MPI, gRPC) - Optional
GPUDirect RDMA, NVMe over Fabrics, rCUDA
Mellanox SHARP for Gradient Computation
• The CPU in a parameter server quickly becomes the bottleneck (at roughly 4 nodes)
• TCP adds a lot of overhead, and the traffic pattern is bursty
• SHARP performs the gradient averaging
• Removes the need for a physical parameter server
• Removes all parameter server overhead
SHARP Provides Better Scalability and Reduced Network Traffic
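The difference between averaging gradients on a central parameter server and averaging them along an aggregation tree can be sketched in a few lines. This is a toy model of the data flow only: real SHARP aggregation runs inside the switches, and the function names and the `radix` parameter here are hypothetical, not part of any Mellanox API.

```python
from typing import List

Vector = List[float]


def parameter_server_average(worker_grads: List[Vector]) -> Vector:
    """Baseline: one server receives every worker's gradient and averages them."""
    n = len(worker_grads)
    return [sum(column) / n for column in zip(*worker_grads)]


def tree_average(worker_grads: List[Vector], radix: int = 2) -> Vector:
    """Aggregation-tree model: each level sums groups of `radix` partial results.

    This only models the data flow of in-network reduction; it is not how
    SHARP is programmed in practice.
    """
    n = len(worker_grads)
    level = [list(g) for g in worker_grads]
    while len(level) > 1:
        level = [
            [sum(column) for column in zip(*level[i:i + radix])]
            for i in range(0, len(level), radix)
        ]
    return [x / n for x in level[0]]


grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(parameter_server_average(grads))  # [4.0, 5.0]
print(tree_average(grads))              # [4.0, 5.0]
```

Both paths give the same averaged gradient. In the tree model no single node receives more than `radix` inputs per level, whereas the parameter server receives one message from every worker.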
TensorFlow with Mellanox RDMA
Unmatched Linear Scalability, No Additional Cost
Up to 76% Efficiency and 50% Better Performance versus TCP
Reference Deployment Guide
Accelerating Big Data Storage & Data Ingestion
Data Ingestion
• Data ingestion is the process of acquiring and preparing the input
• Preprocessing stage before accessing machine learning frameworks
• Examples
  • Convert file/image formats
  • Combine multiple data sources
  • Clean noise / enhance input
• Relevant for training and inference
• Data ingestion typically includes
  • Access to storage (local, distributed, network storage)
  • Pre-processing in a big data framework such as Hadoop or Spark
Accelerating data ingest is critical for machine learning performance
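The three ingestion examples on this slide (convert formats, combine sources, clean noise) can be illustrated with a small standard-library sketch. It is plain Python rather than Hadoop or Spark, and the record layout (`id`, `value`) and all function names are made up for illustration.

```python
import csv
import io
import json
from typing import Dict, Iterable, List


def read_csv_records(text: str) -> List[Dict[str, object]]:
    """Convert CSV text into a list of dict records (format conversion)."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_records(text: str) -> List[Dict[str, object]]:
    """Convert JSON-lines text into a list of dict records."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def clean(records: Iterable[Dict[str, object]]) -> List[Dict[str, object]]:
    """Drop records with a missing or non-numeric value (noise removal)."""
    cleaned = []
    for record in records:
        try:
            cleaned.append({"id": str(record["id"]), "value": float(record["value"])})
        except (KeyError, TypeError, ValueError):
            continue
    return cleaned


csv_source = "id,value\na,1.5\nb,not-a-number\n"
json_source = '{"id": "c", "value": 2.5}\n{"id": "d"}\n'

# Combine multiple data sources, then clean the merged stream.
combined = read_csv_records(csv_source) + read_json_records(json_source)
training_input = clean(combined)
print(training_input)
# [{'id': 'a', 'value': 1.5}, {'id': 'c', 'value': 2.5}]
```

At cluster scale the same steps read from shared storage and exchange data between nodes, which is where storage and network bandwidth start to matter.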
Mellanox Accelerates Big Data - Enabling Real-time Decisions
Benchmarks: TeraSort, Cassandra Stress, Fraud Detection
[Chart: execution time (in seconds) by Ethernet network: Intel 10Gb/s, Mellanox 10Gb/s, Mellanox 40Gb/s. Annotation: 3X faster]
[Chart: total transaction time (in ms), existing solution vs. Aerospike with Mellanox + Samsung NVMe, split into CPU + storage + network time and fraud detection algorithm time. Annotation: ~2x more time for running the fraud detection algorithm]
3X Faster Runtime! ~2X Faster Runtime! 25G BW for Database!
Mellanox is Certified by Leading Big Data Partners
Spark Map-GroupBy: Extensive Data Exchange
[Diagram: mappers with data shards, an all-to-all traffic pattern into merge sort and reducers with data shards, and HDFS replication]
RDMA Significantly Improves Data Exchange Process
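The all-to-all traffic pattern of a map / group-by job can be made concrete with a toy shuffle. The sketch below is plain Python, not Spark. The partitioner, function names, and sample records are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[str, int]


def shuffle(mapper_outputs: List[List[Record]], num_reducers: int):
    """Route every mapper record to a reducer chosen by its key.

    Returns the per-reducer inputs and a mapper x reducer matrix of record
    counts, which makes the all-to-all traffic pattern visible.
    """
    reducer_inputs: List[List[Record]] = [[] for _ in range(num_reducers)]
    traffic = [[0] * num_reducers for _ in mapper_outputs]
    for m, records in enumerate(mapper_outputs):
        for key, value in records:
            r = sum(key.encode()) % num_reducers  # deterministic toy partitioner
            reducer_inputs[r].append((key, value))
            traffic[m][r] += 1
    return reducer_inputs, traffic


def group_by(records: List[Record]) -> Dict[str, List[int]]:
    """Reducer side: the merge-sort step is approximated by sorting, then grouping."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in sorted(records):
        groups[key].append(value)
    return dict(groups)


mappers = [[("a", 1), ("b", 2)], [("a", 3), ("b", 4)]]
reducer_inputs, traffic = shuffle(mappers, num_reducers=2)
print(traffic)                                  # [[1, 1], [1, 1]]
print([group_by(r) for r in reducer_inputs])    # [{'b': [2, 4]}, {'a': [1, 3]}]
```

Every cell of the traffic matrix is non-zero: each mapper sends data to each reducer, so the number of flows grows with mappers times reducers.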
Apache Spark with Mellanox RDMA
[Chart: runtime (in seconds), TCP vs. RDMA, for Customer App #1, Customer App #2, HiBench TeraSort, and GroupBy; lower is better]
Runtime sample   | Input size | Nodes | Cores per node | RAM per node | Improvement
Customer App #1  | 5GB        | 14    | 24             | 85GB         | 17%
Customer App #2  | 540GB      | 14    | 24             | 85GB         | 23%
HiBench TeraSort | 600GB      | 30    | 28             | 256GB        | 31%
GroupBy          | 48M keys   | 15    | 28             | 256GB        | 28%
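The improvement percentages can be converted into speedup factors. The slide does not define "Improvement", so the sketch below assumes it means the reduction in runtime relative to TCP. Under that assumption a reduction of r percent corresponds to a speedup of 1 / (1 - r/100).

```python
def speedup_from_runtime_reduction(reduction_percent: float) -> float:
    """Speedup factor implied by a runtime reduction given in percent.

    Assumption: the percentage is (t_tcp - t_rdma) / t_tcp * 100.
    """
    return 1.0 / (1.0 - reduction_percent / 100.0)


improvements = {
    "Customer App #1": 17,
    "Customer App #2": 23,
    "HiBench TeraSort": 31,
    "GroupBy": 28,
}
for name, pct in improvements.items():
    print(f"{name}: {speedup_from_runtime_reduction(pct):.2f}x")
# Customer App #1: 1.20x
# Customer App #2: 1.30x
# HiBench TeraSort: 1.45x
# GroupBy: 1.39x
```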
Big Data and Deep Learning Solutions with HPE
HPE Mellanox Adapter & Switch Options
SN2100
HPE Reference Architecture:
Cloudera
Hortonworks
SAP HANA Vora & Spark
Elastic Platform for BDA
Proven Advantages
• RDMA delivers a 2X performance advantage over traditional TCP
• Machine Learning and HPC platforms share the same interconnect needs
• Scalable, flexible, high performance, high bandwidth, end-to-end connectivity
• Standards-based and supported by the largest ecosystem
• Supports all compute architectures: x86, Power, ARM, GPU, FPGA, etc.
• Native offloading architecture
• RDMA, GPUDirect, SHARP and other core accelerations
• Backward and future compatible
Scalable Machine Learning Depends on Mellanox
Thank You
Purpose-built for Acceleration of Deep Learning
PeerDirect™, GPUDirect® RDMA and ASYNC
What is GPUDirect™
• Provides a significant decrease in communication latency for acceleration devices
• Natively supported by Mellanox OFED
• Supports peer-to-peer communications between Mellanox adapters and third-party devices
• No unnecessary system memory copies or CPU overhead
• Enables GPUDirect™ RDMA, GPUDirect™ ASYNC, ROCm and others
• InfiniBand and Ethernet
[Diagram: data path between CPU, chipset, and vendor device]
Designed for Deep Learning Acceleration
GPUDirect™ RDMA and GPUDirect ASYNC™
Direct Connectivity GPU - Interconnect
GPUDirect™ RDMA Performance
• GPU-GPU internode latency (lower is better): 9.3X better latency, 2.18 usec
• GPU-GPU internode bandwidth (higher is better): 10X better throughput
Source: Prof. DK Panda
NVIDIA® NCCL 2.0 Near-Linear Scalability
• Optimized collective communication library
  • Allreduce, Reduce, Broadcast, Reduce-scatter, Allgather
• Inter-node communication using InfiniBand verbs and GPUDirect™ RDMA
• Multi-rail support, topology detection
• 50% performance improvement with NVIDIA® DGX-1™ across 32 NVIDIA Tesla® V100 GPUs
NVIDIA Accelerates Scalable Deep Learning with Mellanox
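Two of the collectives listed above, reduce-scatter and allgather, compose into an allreduce. The sketch below simulates a ring allreduce in plain Python to show that composition. It is one common textbook formulation and makes no claim about how NCCL implements its collectives internally. The function name and the one-element-per-chunk restriction are simplifications for illustration.

```python
from typing import List


def ring_allreduce(data: List[List[float]]) -> List[List[float]]:
    """Simulate a ring allreduce (sum) over n ranks.

    data[r] is rank r's vector; its length must equal the number of ranks so
    that each chunk is a single element (kept small for clarity).
    """
    n = len(data)
    assert all(len(vec) == n for vec in data), "one chunk per rank in this sketch"
    buf = [list(vec) for vec in data]

    # Reduce-scatter: after n-1 steps rank r owns the full sum of chunk (r+1) % n.
    for step in range(n - 1):
        sends = [((r + 1) % n, (r - step) % n, buf[r][(r - step) % n]) for r in range(n)]
        for dst, chunk, value in sends:
            buf[dst][chunk] += value

    # Allgather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        sends = [((r + 1) % n, (r + 1 - step) % n, buf[r][(r + 1 - step) % n]) for r in range(n)]
        for dst, chunk, value in sends:
            buf[dst][chunk] = value

    return buf


ranks = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [100.0, 200.0, 300.0]]
print(ring_allreduce(ranks))
# [[111.0, 222.0, 333.0], [111.0, 222.0, 333.0], [111.0, 222.0, 333.0]]
```

Each rank only ever talks to its ring neighbour and sends one chunk per step, so the per-link traffic stays roughly constant as ranks are added.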
Performance and Scalability Examples
Accelerating TensorFlow™ with gRPC over RDMA
• Open source Machine Learning from Google
• Distributed training with gRPC framework
  • Google’s Optimized RPC for distributed network
• RDMA Acceleration over UCX
  • Unified Communication X (UCX)
  • Integration with upstream TensorFlow
• 2X higher Performance with RDMA
[Chart: >2x faster with RDMA; lower is better]
~2X Acceleration for TensorFlow with RDMA
TensorFlow™ over RDMA in Apache® Spark™ Environment
• Yahoo enhanced the TensorFlow C++ layer to enable RDMA over InfiniBand
• InfiniBand provides faster connectivity and supports accelerated offload capability
Source: http://yahoohadoop.tumblr.com/post/157196317141/open-sourcing-tensorflowonspark-distributed-deep
InfiniBand Provides Near Linear Scalability for Inception Model Training
2X Acceleration for Baidu
• Machine Learning Software from Baidu
  • Usage: word prediction, translation, image processing
• RDMA (GPUDirect) speeds training
  • Lowers latency, increases throughput
  • More cores for training
  • Even better results with optimized RDMA
~2X Acceleration for Paddle Training with RDMA
ChainerMN Depends on InfiniBand
• ChainerMN depends on MPI for inter-node communication
• NVIDIA® NCCL library is then used for intra-node communication between GPUs
• Leveraging InfiniBand results in near-linear performance
• Mellanox InfiniBand allows ChainerMN to achieve ~72% accuracy
Source: http://chainer.org/general/2017/02/08/Performance-of-Distributed-Deep-Learning-Using-ChainerMN.html
Machine Learning Performance Comparison
InfiniBand Delivers 60% Better Performance with 2X Less Infrastructure
DeepBench measures the performance of basic operations involved in training deep neural networks.
[Chart: 8, 16, and 32 accelerators, with a 60.3% annotation; lower is better]
Scalable Deep Learning Depends on Mellanox
A Few Solution Examples
HPE Apollo 6500 | Purpose Built for AI
Mellanox ConnectX-4 and ConnectX-5 Options Available
HPE Mellanox Adapter Options
HPE ProLiant XL270d accelerator tray
− 8 GP GPU @ 350W
− 4:1 or 8:1 GPU:CPU
− Latest network fabrics
− 2U
HPE Apollo d6500 chassis
Delivers efficiency and tailor-made configurability
NVIDIA® DGX-1™ Deep Learning Server
Deep Learning Supercomputer in a Box
8 x NVIDIA® Tesla® P100 GPUs
5.3TFlops
16nm FinFET
NVLINK
4 x ConnectX®-4 EDR 100G InfiniBand Adapters
NVIDIA® “SaturnV”
NVIDIA® Machine Learning Supercomputer
#28 on the Top500
3.3Pf with 124 DGX-1 nodes
#1 on the Green500
NVIDIA® DGX-1™
• World’s first purpose-built system for deep learning
  • SaturnV is #28 on the Top500, 3.3Pf with 124 nodes
  • SaturnV is also #1 on the Green500
• Fully integrated hardware
  • 8x Tesla™ P100 (Pascal) w/16GB per GPU
  • 28672 CUDA® Cores
  • 4x ConnectX-4 EDR 100Gb/s HCAs
• Fully integrated software stack
  • Major deep learning frameworks
  • Drivers, NVIDIA CUDA, NVIDIA Deep Learning SDK
  • GPUDirect™ RDMA
Mellanox – Powering IBM Solutions
IBM offers Mellanox InfiniBand and Ethernet Solutions
Founding Members
32 Ports of 100GE
10G / 25G / 40G / 50G / 100G
Big Sur – An Open Artificial Intelligence Platform (Facebook)
• An OCP based, GPU Artificial Intelligence Platform
• Flexible Architecture supporting up to 8 GPUs
  • NVIDIA®, Intel®, AMD
• Mellanox Ethernet adapters
  • RDMA supported (RoCE)
• Accelerating Artificial Intelligence
  • Text Processing
  • Language Modeling
  • Computer Vision
https://code.facebook.com/posts/1687861518126048/facebook-to-open-source-ai-hardware-design/
Mellanox Intelligent Network for Intelligent Platform
End-to-End Interconnect Solutions for All Platforms
Highest Performance and Scalability for X86, Power, GPU, ARM and FPGA-based Compute and Storage Platforms
Smart Interconnect to Unleash The Power of All Compute Architectures

Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 

Dernier (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

HPC DAY 2017 | The network part in accelerating Machine-Learning and Big-Data

  • 1. The network part in accelerating Machine-Learning and Big-Data. Boris Neiman, Sr. System Engineer, October 2017
  • 2. Mellanox Overview. Company headquarters: Yokneam, Israel and Sunnyvale, California. Worldwide offices. ~2,900 employees worldwide. Ticker: MLNX
  • 3. Exponential Data Growth Everywhere. Higher Data Speeds, Faster Data Processing, Better Data Security. Products: Adapters, Switches, Cables & Transceivers, System on a Chip, SmartNIC
  • 4. Enabling the Future of Machine Learning Applications. HPC and Machine Learning Share the Same Interconnect Needs. Markets: Storage, High Performance Computing, Financial, Embedded Appliances, Database, Hyperscale, Machine Learning, IoT, Healthcare, Manufacturing, Retail, Self-Driving Vehicles
  • 5. Highest Performance 100 and 200Gb/s Interconnect Solutions
      - Transceivers, Active Optical and Copper Cables (10 / 25 / 40 / 50 / 56 / 100 / 200Gb/s)
      - Switch: 40 HDR (200Gb/s) Ports, 80 HDR100 (100Gb/s) Ports, 16Tb/s Throughput, 15.6 Billion msg/sec
      - Adapters: 200Gb/s, 0.6us Latency, 200 Million Messages per Second (10 / 25 / 40 / 50 / 56 / 100 / 200Gb/s)
      - Switch: 32 100GbE Ports, 64 25/50GbE Ports (10 / 25 / 40 / 50 / 100GbE), Throughput of 6.4Tb/s
      - Today’s Datacenters Need the Most Intelligent Interconnect
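The throughput figures on slide 5 are consistent with counting every port at line rate in both directions. That reading is an assumption (the slide does not say how throughput is counted), and the helper name below is made up for illustration; the sketch only checks the arithmetic.

```python
# Aggregate switch throughput from port count and per-port speed.
# Assumption (not stated on the slide): the quoted figure counts traffic
# in both directions on every port.

def aggregate_tbps(ports: int, gbps_per_port: int, directions: int = 2) -> float:
    """Return total switch throughput in Tb/s."""
    return ports * gbps_per_port * directions / 1000


hdr_switch = aggregate_tbps(ports=40, gbps_per_port=200)       # 40 HDR ports
hdr100_switch = aggregate_tbps(ports=80, gbps_per_port=100)    # 80 HDR100 ports
ethernet_switch = aggregate_tbps(ports=32, gbps_per_port=100)  # 32 x 100GbE ports

print(hdr_switch, hdr100_switch, ethernet_switch)
```

Under that assumption the HDR and HDR100 port configurations give the same total, which matches the single 16Tb/s figure quoted for both.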
  • 6. Leading Supplier of End-to-End Interconnect Solutions. Storage Front / Back-End, Server / Compute and Switch / Gateway, connected by 56/100/200G InfiniBand, 10/25/40/50/100/200GbE and Virtual Protocol Interconnect. Smart Interconnect to Unleash The Power of All Compute Architectures (X86, OpenPOWER, GPU, ARM, FPGA)
  • 7. What is Deep Learning?
      - Also known as Deep Neural Network (DNN)
          - Subset of Artificial Neural Network (ANN)
      - Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks
      - Source: http://machinelearningmastery.com/what-is-deep-learning/
  • 8. Training & Inference: Two Phases of Deep Learning
      - Data types: Images, Video, Text, Speech, Tabular, Time Series
      - TRAINING (PBs of training dataset from Big Data storage, billions of TFLOPS, untrained model → trained model)
          - Large Scale-Out Cluster for Faster Training
          - Faster access to Big Data storage
      - INFERENCE (new data, billions of FLOPS, trained model)
          - Highly transactional / supports many users
          - Instant Response on ‘Trained’ Network
  • 9. Deep Learning Use Cases Across Industries
      - Finance, Fraud & Insurance: Fraud Detection, Credit/Risk Analysis, High Frequency Trading
      - Automotive & Transportation: Self Driving, Image / Facial Recognition, Logistics & Mapping
      - Medicine & Genomics: Drug Discovery, Diagnostic Assistance, Cancer Cell Detection
      - Oil & Gas: Seismic Imaging, Reservoir Characterization, Subsurface Fault Detection
      - Cloud, Web, Mobile, Retail: Image Tagging, Speech Recognition, Sentiment Analysis
      - Security & Safety: Surveillance, Image Analysis, Facial Recognition and Detection
  • 10. An Intelligent Network Unlocks the Power of Data with AI
      - 60% Higher Return on Investment
      - Up to 50% Savings on Capital and Operation Expenses
      - World’s Highest Performance, Scalability and Productivity for AI
      - Mellanox Networking Unlocks the Power of AI with RDMA
      - Framework logos shown: Chainer, Cognitive Toolkit
  • 11. What Is RDMA?
      - Remote Direct Memory Access (RDMA)
      - Advanced transport protocol (same layer as TCP and UDP)
      - Main features
          - Remote memory read/write semantics in addition to send/receive
          - Kernel bypass / direct user space access
          - Full hardware offload
          - Secure, channel based IO
      - Application advantage
          - Low latency
          - High bandwidth
          - Low CPU consumption
      - RoCE: RDMA over Converged Ethernet
          - Available for all Ethernet speeds 10 – 100G
      - Verbs: RDMA SW interface (equivalent to sockets)
      - [Figure: RDMA and TCP/UDP protocol stacks side by side (application, transport, network, link and physical layers, with buffers), divided into user code, kernel code and hardware]
  • 12. Mellanox is Leading Artificial Intelligence (AI)
      - Advancing Technology to Affect Science, Business, and Society by Enabling Critical and Timely Decision Making
      - Health Care, Business Integrity, Business Intelligence, Knowledge Discovery, Security, Customer Support and more
      - More Data, Better Models, Faster Interconnect (GPUs, CPUs, FPGAs, Storage)
      - More Data → Faster Interconnect → Better Insight → Competitive Advantage
  • 13. Enabling Most Efficient Machine Learning Platforms (Examples). Highest Performance, Scalability and Productivity for Deep Learning
  • 14. Mellanox Accelerates Machine Learning and Big Data
      - World’s First PCIe Gen 4 Public Cloud Server for Cognitive Computing
      - Enabling Analytics in Cloud: Sets TeraSort 2016 Benchmark Record, 5x Faster, 3x Energy Efficient than 2015 Record (http://sortbenchmark.org/TencentSort2016.pdf)
      - Smart Network for Azure Cloud Server
      - Designed for Big Data Analytics & AI
  • 15. Mellanox Accelerates Machine Learning and Big Data
      - Big Sur & Big Basin: Facebook Open Source AI Hardware Platform. Only ONE Network of Choice - Mellanox
      - Powering Self Driving Car
      - 2X Faster Training with PaddlePaddle: “…We rely on fast interconnect technologies and RDMA.” Andrew Ng, Chief Scientist, Baidu
      - Real Time Fraud Detection: 14 Million Transactions per Day, 4 Billion Database Inserts
      - Image Recognition: ~90% Prediction Accuracy
      - RDMA in TensorFlow and Caffe (Caffe, Caffe2)
      - Sources: https://code.facebook.com/posts/1687861518126048/facebook-to-open-source-ai-hardware-design and https://github.com/caffe2/caffe2/tree/master/caffe2/contrib/fbcollective/vendor/fbcollective
  • 16. Exponential Data Growth: The Need for Intelligent and Faster Interconnect
      - CPU-Centric (Onload): Must Wait for the Data, Creates Performance Bottlenecks
      - Data-Centric (Offload): Analyze Data as it Moves!
      - Faster Data Speeds and In-Network Computing Enable Higher Performance and Scale
  • 17. Data Centric Architecture to Overcome Latency Bottlenecks
      - CPU-Centric (Onload): HPC / Machine Learning Communications Latencies of 30-40us
      - Data-Centric (Offload), In-Network Computing: HPC / Machine Learning Communications Latencies of 3-4us
      - Intelligent Interconnect Paves the Road to Exascale Performance
  • 18. Mellanox Technology Accelerations for Machine Learning: RDMA, GPUDirect, NVMe over Fabrics, SHARP, Security. In-Network Computing Key for Highest Return on Investment
  • 19. Mellanox Accelerations for Machine Learning and Big Data. In-Network Computing Enables Deep Learning Frameworks. Stack shown: CUDA; Middleware (MPI, gRPC) - Optional; SHARP, GPUDirect RDMA, NVMe over Fabrics, rCUDA; Mellanox Interconnect Solutions
  • 20. Mellanox SHARP for Gradient Computation
      - CPU in a parameter server becomes the bottleneck quickly (roughly 4 nodes)
      - TCP adds a lot of overhead and the traffic pattern is bursty
      - SHARP performs the gradient averaging
          - Removes the need for physical parameter server
          - Removes all parameter server overhead
      - SHARP Provides Better Scalability and Reduced Network Traffic
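Slide 20 contrasts a parameter server, where one CPU receives every worker's gradients, with averaging performed inside the network. The toy model below is only a sketch of that contrast and not the SHARP protocol: all function names are hypothetical, and a "switch" is modeled as a function that sums the vectors from its children and forwards one partial sum upward.

```python
# Toy model of gradient averaging; NOT the SHARP protocol itself.
# All names are hypothetical and exist only for this illustration.
from typing import List

Vector = List[float]


def vector_sum(vectors: List[Vector]) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(column) for column in zip(*vectors)]


def parameter_server_average(gradients: List[Vector]) -> Vector:
    # One server receives len(gradients) full vectors and reduces them itself.
    total = vector_sum(gradients)
    return [x / len(gradients) for x in total]


def tree_average(gradients: List[Vector], radix: int = 2) -> Vector:
    # Each aggregation level sums groups of `radix` vectors, so no single
    # node ever receives more than `radix` vectors.
    level = gradients
    while len(level) > 1:
        level = [vector_sum(level[i:i + radix]) for i in range(0, len(level), radix)]
    return [x / len(gradients) for x in level[0]]


workers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(parameter_server_average(workers))
print(tree_average(workers))
```

Both functions return the same average; the difference is fan-in. The parameter server handles one incoming vector per worker, while each node in the tree handles at most `radix`.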
  • 21. TensorFlow with Mellanox RDMA. Unmatched Linear Scalability, No Additional Cost. Up to 76% Efficiency and 50% Better Performance versus TCP. Reference Deployment Guide
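Slide 21 quotes an efficiency percentage without defining it. A common definition of scaling efficiency is measured cluster throughput divided by ideal linear throughput; the sketch below assumes that definition, and the throughput numbers are hypothetical values chosen only to illustrate the formula.

```python
def scaling_efficiency(single_node_rate: float, cluster_rate: float, nodes: int) -> float:
    """Fraction of ideal linear speedup achieved by the cluster."""
    return cluster_rate / (single_node_rate * nodes)


# Hypothetical numbers, not measurements from the deck:
# 100 images/s on one node, 1216 images/s on 16 nodes.
efficiency = scaling_efficiency(single_node_rate=100.0, cluster_rate=1216.0, nodes=16)
print(f"{efficiency:.0%}")
```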
  • 22. Accelerating Big Data Storage & Data Ingestion
  • 23. Data Ingestion
      - Data ingestion is the process of acquiring and preparing the input
      - Preprocessing stage before accessing machine learning frameworks
      - Examples
          - Convert file/image formats
          - Combine multiple data sources
          - Clean noise / enhance input
      - Relevant for training and inference
      - Data ingestion typically includes
          - Access to storage (local, distributed, network storage)
          - Pre-processing in a big data framework such as Hadoop or Spark
      - Accelerating data ingest is critical for machine learning performance
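The three ingestion examples on slide 23 (convert formats, combine sources, clean noise) can be sketched with the Python standard library alone. This is a minimal illustration, not how Hadoop or Spark do it; the field names `sensor` and `value` and the cleaning rule are hypothetical.

```python
# Toy ingestion pipeline using only the standard library.
# Field names and the cleaning rule are hypothetical.
import csv
import io
import json
from typing import Dict, Iterable, Iterator, List


def read_csv(text: str) -> Iterator[Dict[str, str]]:
    """Convert a CSV source into dict records."""
    return csv.DictReader(io.StringIO(text))


def read_json_lines(text: str) -> Iterator[dict]:
    """Convert a JSON-lines source into dict records."""
    return (json.loads(line) for line in text.splitlines() if line.strip())


def clean(records: Iterable[dict]) -> List[Dict[str, object]]:
    """Drop records with missing or non-numeric readings and normalise types."""
    cleaned = []
    for record in records:
        try:
            cleaned.append({"sensor": str(record["sensor"]), "value": float(record["value"])})
        except (KeyError, TypeError, ValueError):
            continue  # treat malformed rows as noise
    return cleaned


csv_source = "sensor,value\na,1.5\nb,not-a-number\n"
json_source = '{"sensor": "c", "value": 2}\n{"sensor": "d"}\n'

# Combine multiple data sources, then clean the result.
combined = list(read_csv(csv_source)) + list(read_json_lines(json_source))
training_input = clean(combined)
print(training_input)
```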
  • 24. Mellanox Accelerates Big Data - Enabling Real-time Decisions
      - Benchmarks: TeraSort, Cassandra Stress, Fraud Detection
      - [Chart: execution time in seconds by Ethernet network (Intel 10Gb/s, Mellanox 10Gb/s, Mellanox 40Gb/s); annotated "3X Faster"]
      - [Chart: total transaction time in ms, split into CPU + Storage + Network and Fraud Detection Algorithm, for Existing Solution versus Aerospike with Mellanox + Samsung NVMe; annotated "~2x more time for running fraud detection algorithm"]
      - Callouts: 3X Faster Runtime! ~2X Faster Runtime! 25G BW for Database!
      - Mellanox is Certified by Leading Big Data Partners
  • 25. Spark Map-GroupBy: Extensive Data Exchange
      - [Figure: mappers, each reading a data shard, send their output to reducers; each reducer merge-sorts its input and writes a data shard with HDFS replication]
      - All to All Traffic Pattern
      - RDMA Significantly Improves Data Exchange Process
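The all-to-all pattern on slide 25 arises because every mapper can hold records for every reducer. The single-process sketch below is not Spark code: mappers and reducers are plain Python lists, and the byte-sum partitioner is a deterministic stand-in chosen for the example.

```python
# Toy, single-process model of a map / group-by shuffle; not Spark code.
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Record = Tuple[str, int]


def shuffle(mapper_outputs: List[List[Record]], num_reducers: int):
    """Route every record to the reducer that owns its key."""
    reducer_inputs: List[List[Record]] = [[] for _ in range(num_reducers)]
    transfers: Set[Tuple[int, int]] = set()  # (mapper, reducer) pairs that exchanged data
    for mapper_id, records in enumerate(mapper_outputs):
        for key, value in records:
            reducer_id = sum(key.encode()) % num_reducers  # deterministic toy partitioner
            reducer_inputs[reducer_id].append((key, value))
            transfers.add((mapper_id, reducer_id))
    return reducer_inputs, transfers


def group_by_key(records: List[Record]) -> Dict[str, List[int]]:
    """Group values per key; sorting stands in for the merge-sort step."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in sorted(records):
        groups[key].append(value)
    return dict(groups)


mapper_outputs = [
    [("a", 1), ("b", 2), ("c", 3)],
    [("a", 4), ("b", 5), ("c", 6)],
    [("a", 7), ("b", 8), ("c", 9)],
]
reducer_inputs, transfers = shuffle(mapper_outputs, num_reducers=3)
grouped = [group_by_key(records) for records in reducer_inputs]
print(grouped)
print(len(transfers))  # number of distinct mapper-to-reducer flows
```

With three mappers and three reducers, every mapper sends to every reducer, so the number of flows grows as mappers times reducers. That exchange is the step the slide says RDMA improves.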
  • 26. Apache Spark with Mellanox RDMA. Runtime samples, TCP versus RDMA (runtime in seconds, lower is better)
      - Customer App #1: input size 5GB, 14 nodes, 24 cores per node, 85GB RAM per node, 17% improvement
      - Customer App #2: input size 540GB, 14 nodes, 24 cores per node, 85GB RAM per node, 23% improvement
      - HiBench TeraSort: input size 600GB, 30 nodes, 28 cores per node, 256GB RAM per node, 31% improvement
      - GroupBy: input size 48M keys, 15 nodes, 28 cores per node, 256GB RAM per node, 28% improvement
  • 27. Big Data and Deep Learning Solutions with HPE. HPE Mellanox Adapter & Switch Options (SN2100). HPE Reference Architecture: Cloudera, Hortonworks, SAP HANA Vora & Spark, Elastic Platform for BDA
  • 28. Proven Advantages
      - RDMA delivers 2X performance advantage over traditional TCP
      - Machine Learning and HPC platforms share the same interconnect needs
      - Scalable, flexible, high performance, high bandwidth, end-to-end connectivity
      - Standards-based and supported by the largest eco-system
      - Supports all compute architectures: x86, Power, ARM, GPU, FPGA etc.
      - Native Offloading architecture
      - RDMA, GPUDirect, SHARP and other core accelerations
      - Backward and future compatible
      - Scalable Machine Learning Depends on Mellanox
  • 30. Purpose-built for Acceleration of Deep Learning: PeerDirect™, GPUDirect® RDMA and ASYNC
  • 31. What is GPUDirect™
      - Provides significant decrease in communication latency for acceleration devices
      - Natively supported by Mellanox OFED
      - Supports peer-to-peer communications between Mellanox adapters and third-party devices
      - No unnecessary system memory copies & CPU overhead
      - Enables GPUDirect™ RDMA, GPUDirect™ ASYNC, ROCm and others
      - InfiniBand and Ethernet
      - [Figure: two data paths through CPU, chipset and vendor device]
      - Designed for Deep Learning Acceleration
  • 32. GPUDirect™ RDMA and GPUDirect ASYNC™: Direct Connectivity GPU - Interconnect
  • 33. GPUDirect™ RDMA Performance
      - GPU-GPU Internode Latency (lower is better): 9.3X Better Latency, 2.18 usec
      - GPU-GPU Internode Bandwidth (higher is better): 10X Better Throughput
      - Source: Prof. DK Panda
  • 34. NVIDIA® NCCL 2.0: Near-Linear Scalability
      - Optimized collective communication library
          - Allreduce, Reduce, Broadcast, Reduce-scatter, Allgather
      - Inter-node communication using InfiniBand verbs and GPUDirect™ RDMA
      - Multi-rail support, Topology detection
      - 50% performance improvement with NVIDIA® DGX-1™ across 32 NVIDIA Tesla® V100 GPUs
      - NVIDIA Accelerates Scalable Deep Learning with Mellanox
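Slide 34 lists Allreduce, Reduce-scatter and Allgather among the NCCL collectives. One well-known way to build an allreduce out of the other two is a ring. The simulation below is a plain-Python sketch of that ring algorithm, not NCCL code, and the slide does not say which algorithm NCCL selects for a given topology. For simplicity each rank's buffer holds exactly one element per rank.

```python
# Pure-Python simulation of ring allreduce (reduce-scatter, then allgather).
# Illustrative only; the function name and buffer layout are assumptions.
from typing import List


def ring_allreduce(buffers: List[List[float]]) -> List[List[float]]:
    n = len(buffers)
    assert all(len(b) == n for b in buffers), "toy model: one element per chunk, n chunks"
    data = [list(b) for b in buffers]

    # Phase 1, reduce-scatter: in each step rank r sends one chunk to rank r+1,
    # which adds it to its own copy. Afterwards rank r owns the full sum of
    # chunk (r + 1) % n.
    for step in range(n - 1):
        sends = [(r, (r - step) % n, data[r][(r - step) % n]) for r in range(n)]
        for r, chunk, value in sends:
            data[(r + 1) % n][chunk] += value

    # Phase 2, allgather: the completed chunks travel around the ring and
    # overwrite the partial sums held by the other ranks.
    for step in range(n - 1):
        sends = [(r, (r + 1 - step) % n, data[r][(r + 1 - step) % n]) for r in range(n)]
        for r, chunk, value in sends:
            data[(r + 1) % n][chunk] = value
    return data


ranks = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [100.0, 200.0, 300.0]]
result = ring_allreduce(ranks)
print(result)
```

Each rank sends 2 × (n − 1) chunks in total, each 1/n of the buffer, so the data sent per rank stays below twice the buffer size regardless of how many ranks join the ring.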
  • 35. Performance and Scalability Examples
  • 36. Accelerating TensorFlow™ with gRPC over RDMA
      - Open source Machine Learning from Google
      - Distributed training with gRPC framework
          - Google’s Optimized RPC for distributed network
      - RDMA Acceleration over UCX
          - Unified Communication X (UCX)
          - Integration with upstream TensorFlow
      - 2X higher Performance with RDMA
      - [Chart: lower is better; annotated ">2x Faster"]
      - ~2X Acceleration for TensorFlow with RDMA
  • 37. TensorFlow™ over RDMA in Apache® Spark™ Environment
      - Yahoo enhanced the TensorFlow C++ layer to enable RDMA over InfiniBand
      - InfiniBand provides faster connectivity and supports accelerated offload capability
      - InfiniBand Provides Near Linear Scalability for Inception Model Training
      - Source: http://yahoohadoop.tumblr.com/post/157196317141/open-sourcing-tensorflowonspark-distributed-deep
  • 38. 2X Acceleration for Baidu
      - Machine Learning Software from Baidu
          - Usage: word prediction, translation, image processing
      - RDMA (GPUDirect) speeds training
          - Lowers latency, increases throughput
          - More cores for training
          - Even better results with optimized RDMA
      - ~2X Acceleration for Paddle Training with RDMA
  • 39. ChainerMN Depends on InfiniBand
      - ChainerMN depends on MPI for inter-node communication
      - NVIDIA® NCCL library is then used for intra-node communication between GPUs
      - Leveraging InfiniBand results in near linear performance
      - Mellanox InfiniBand allows ChainerMN to achieve ~72% accuracy.
      - Source: http://chainer.org/general/2017/02/08/Performance-of-Distributed-Deep-Learning-Using-ChainerMN.html
  • 40. Machine Learning Performance Comparison
      - InfiniBand Delivers 60% Better Performance with 2X Less Infrastructure
      - DeepBench measures the performance of basic operations involved in training deep neural networks.
      - [Chart: 8, 16 and 32 accelerators; lower is better; annotated 60.3%]
  • 41. Scalable Deep Learning Depends on Mellanox: A Few Solution Examples
  • 42. HPE Apollo 6500 | Purpose Built for AI
      - HPE ProLiant XL270d accelerator tray
          - 8 GP GPU @ 350W
          - 4:1 or 8:1 GPU:CPU
          - Latest network fabrics
          - 2U HPE Apollo d6500 chassis
      - Delivers efficiency and tailor-made configurability
      - HPE Mellanox Adapter Options: Mellanox ConnectX-4 and ConnectX-5 Options Available
  • 43. NVIDIA® DGX-1™ Deep Learning Server: Deep Learning Supercomputer in a Box
      - 8 x NVIDIA® Tesla® P100 GPUs, 5.3TFlops, 16nm FinFET, NVLINK
      - 4 x ConnectX®-4 EDR 100G InfiniBand Adapters
      - NVIDIA® “SaturnV”: NVIDIA® Machine Learning Supercomputer, #28 on the Top500 (3.3Pf with 124 DGX-1 nodes), #1 on the Green500
  • 44. NVIDIA® DGX-1™
      - World’s first purpose-built system for deep learning
          - SaturnV is #28 on the Top500, 3.3Pf with 124 nodes
          - SaturnV is also #1 on the Green500
      - Fully integrated hardware
          - 8x Tesla™ P100 (Pascal) w/16GB per GPU
          - 28672 CUDA® Cores
          - 4x ConnectX-4 EDR 100Gb/s HCAs
      - Fully integrated software stack
          - Major deep learning frameworks
          - Drivers, NVIDIA CUDA, NVIDIA Deep Learning SDK
          - GPUDirect™ RDMA
  • 45. Mellanox: Powering IBM Solutions. IBM offers Mellanox InfiniBand and Ethernet Solutions. Founding Members. 32 Ports of 100GE; 10G / 25G / 40G / 50G / 100G
  • 46. Big Sur: An Open Artificial Intelligence Platform (Facebook)
      - An OCP based, GPU Artificial Intelligence Platform
      - Flexible Architecture supporting up to 8 GPUs
          - NVIDIA®, Intel®, AMD
      - Mellanox Ethernet adapters
          - RDMA supported (RoCE)
      - Accelerating Artificial Intelligence
          - Text Processing
          - Language Modeling
          - Computer Vision
      - Source: https://code.facebook.com/posts/1687861518126048/facebook-to-open-source-ai-hardware-design/
      - Mellanox Intelligent Network for Intelligent Platform
  • 47. End-to-End Interconnect Solutions for All Platforms. Highest Performance and Scalability for X86, Power, GPU, ARM and FPGA-based Compute and Storage Platforms. Smart Interconnect to Unleash The Power of All Compute Architectures (X86, OpenPOWER, GPU, ARM, FPGA)