NVIDIA Keynote #GTC21
1.
2. NVIDIA
Accelerated Computing
Full Stack, 3 Chips, Data Center Scale
30 Million CUDA Downloads
150 SDKs
$100 Trillion Industry Served
Gaming
Data Science
Robotics
Broadcast
CAD
Physical Sciences
Life Sciences
Quantum Physics
Digital Twins
Genomics
5G
Quantum Computing
Cybersecurity
AI NLU
Machine Learning
AI Recsys
AI Speech
AI Computer Vision
Medical Imaging
Autonomous Vehicles
EDA
3. COMPUTING FOR THE DA VINCIS AND EINSTEINS OF OUR TIME
8 of 10 World’s Top 500 Supercomputers
9 of 10 World’s Top 500 Green Supercomputers
#1 MLPerf Training and Inference
2021 Nobel Prize in Physics: Studying Earth’s Climate
25,000 Companies Run on NVIDIA AI
Oak Ridge National Laboratory and the oak leaf symbol are registered trademarks of the U.S. Department of Energy. Use of this mark does not constitute or imply its
endorsement, recommendation, or favoring by the United States Government or any agency thereof or its contractors or subcontractors.
4. World Record Accuracy
2.96% Gap on Gehring and Homberger
Scalable to 1,000s of Locations
3 Seconds vs 5 Minutes to Route 1,000 Packages
ANNOUNCING
NVIDIA REOPT
Re-Optimize Logistics and Supply Chain in Real-Time
Accelerated Solver for Vehicle Route, Warehouse Picking,
Fleet-Mix Optimization
Massively Parallel Algorithm Generates Thousands of
Solution Candidates and Refinements
Dynamic Rerouting Reduces Travel Time – Save
Billions for a $10 Trillion Logistics Industry
Available Now
nvidia.com/reopt
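The "thousands of solution candidates and refinements" idea can be illustrated with a minimal sketch: generate several random candidate tours, refine each with 2-opt segment reversal, and keep the shortest. This is not NVIDIA ReOpt's algorithm or API. The 2-opt heuristic, the function names (`route_length`, `two_opt`, `solve`), and the toy data are all assumptions chosen for illustration, and the loop here runs serially where an accelerated solver would evaluate candidates in parallel.

```python
import math
import random

def route_length(route, points):
    """Total length of a closed tour that visits points in the given order."""
    return sum(
        math.dist(points[route[i]], points[route[(i + 1) % len(route)]])
        for i in range(len(route))
    )

def two_opt(route, points):
    """Refine one candidate: reverse segments while that shortens the tour."""
    best = list(route)
    best_len = route_length(best, points)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = route_length(candidate, points)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best, best_len

def solve(points, n_candidates=20, seed=0):
    """Generate random candidate tours, refine each one, keep the shortest."""
    rng = random.Random(seed)
    stops = list(range(1, len(points)))  # index 0 is the depot (hypothetical)
    best_route, best_len = None, math.inf
    for _ in range(n_candidates):
        rng.shuffle(stops)
        route, length = two_opt([0] + stops, points)
        if length < best_len:
            best_route, best_len = route, length
    return best_route, best_len

# Four stops on the corners of a unit square; the shortest tour is the perimeter.
square = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
route, length = solve(square)
```

Each candidate is independent of the others, which is what makes this style of search a natural fit for massively parallel hardware.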
5.
6. LEADING QUANTUM SIMULATORS
INDUSTRY PARTNERS
RESEARCH COMMUNITY PARTNERS
ANNOUNCING
NVIDIA CUQUANTUM DGX
APPLIANCE
Research the Computer of Tomorrow on the Most
Powerful Computer Today
Appliance Available Q1 2022
cuQuantum Available Now for Download
developer.nvidia.com/cuquantum
Out-of-the-Box Optimized Stack for Cirq
Other Simulators in Development
cuQuantum SDK in Open Beta;
Accelerate Popular Quantum
Simulators from Google, IBM
cuQuantum
GOOGLE
7. ANNOUNCING
NVIDIA CUQUANTUM DGX
APPLIANCE
Research the Computer of Tomorrow on the Most
Powerful Computer Today
Out-of-the-Box Optimized Stack for Cirq
Other Simulators in Development
cuQuantum SDK in Open Beta;
Accelerate Popular Quantum
Simulators from Google, IBM
cuQuantum
Appliance Available Q1 2022
cuQuantum Available Now for Download
developer.nvidia.com/cuquantum
State Vector Simulator on Dual AMD CPU vs DGX cuQuantum Appliance
Sycamore Supremacy Circuit: 29 Minutes vs 19 Seconds
Quantum Fourier Transform: 8 Minutes vs 7 Seconds
Shor's Algorithm: 22 Minutes vs 26 Seconds
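For readers unfamiliar with state vector simulation, a minimal NumPy sketch follows. It builds the dense Quantum Fourier Transform unitary and applies it to a 3-qubit state. This is not cuQuantum or Cirq code. The function name `qft_matrix` is hypothetical, and a production simulator would apply gates one at a time instead of forming the full matrix.

```python
import numpy as np

def qft_matrix(n_qubits):
    """Dense QFT unitary on n_qubits: F[j, k] = omega**(j*k) / sqrt(N)."""
    dim = 2 ** n_qubits
    omega = np.exp(2j * np.pi / dim)
    j, k = np.meshgrid(np.arange(dim), np.arange(dim), indexing="ij")
    return omega ** (j * k) / np.sqrt(dim)

n_qubits = 3
# A state vector holds one complex amplitude per basis state: 2**n entries.
state = np.zeros(2 ** n_qubits, dtype=complex)
state[0] = 1.0  # start in |000>

# Applying the QFT to |000> gives the uniform superposition.
out = qft_matrix(n_qubits) @ state
```

The state vector doubles in size with every added qubit (16 bytes per amplitude at double precision), which is why this style of simulation quickly becomes memory-bound and benefits from large GPU systems.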
8. ANNOUNCING
WORLD RECORD QUANTUM
SIMULATION OF MAXCUT
Record Qubit Scale
MaxCut Algorithm with cuQuantum Tensor
Network Simulation
1,688 Qubits on 896 GPUs
Advance Quantum Algorithm Research in
Drug Discovery, Climate Research, Cybersecurity,
and Finance
Tensor Network Simulator on Theta: 210 Vertices
cuQuantum on Selene: 3,375 Vertices
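As background on the problem itself, MaxCut asks for a split of a graph's vertices into two groups that maximizes the number of edges crossing between them. The sketch below is a classical brute-force solver for tiny graphs, included only to define the objective. It is not the quantum algorithm or the tensor network simulation named on the slide, and the function names are hypothetical.

```python
from itertools import product

def cut_size(edges, assignment):
    """Count edges whose endpoints fall in different groups."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])

def max_cut_brute_force(n_vertices, edges):
    """Try all 2**n group assignments and return the best cut found."""
    best_size, best_assignment = -1, None
    for assignment in product((0, 1), repeat=n_vertices):
        size = cut_size(edges, assignment)
        if size > best_size:
            best_size, best_assignment = size, assignment
    return best_size, best_assignment

# A 4-cycle is bipartite, so every one of its edges can be cut.
square_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
best_size, best_assignment = max_cut_brute_force(4, square_edges)

# A triangle is not bipartite: at most 2 of its 3 edges can be cut.
triangle_size, _ = max_cut_brute_force(3, [(0, 1), (1, 2), (2, 0)])
```

The exhaustive loop visits 2**n assignments, so it is hopeless at thousands of vertices, which is the scale where approximate and quantum-inspired approaches become interesting.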
9. ANNOUNCING
NVIDIA CUNUMERIC
Accelerated Computing At-Scale for PyData
and NumPy Ecosystem
Python Used by 20 Million Data Scientists,
Researchers, and Scientists
NumPy Downloaded 122,000,000 Times Since 2017
NumPy Used by 790,000 Projects in GitHub
NumPy is the Foundation of Pandas, SciPy,
and Scikit-Learn
[Chart, x-axis: years 2008 to 2021]
10. ANNOUNCING
NVIDIA CUNUMERIC
Accelerated Computing At-Scale for PyData
and NumPy Ecosystem
Transparently Accelerates and Scales
NumPy Workflows
Zero Code Changes
Automatic Parallelism and Acceleration for
Multi-GPU, Multi-Node Systems
Scales to 1,000s of GPUs
Available Now on GitHub and Conda
cuNumeric
NVIDIA Python Data Science and Machine Learning Ecosystem
cuDF
Pandas Scikit-Learn NetworkX NumPy
cuML cuGraph
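A minimal sketch of the kind of NumPy workflow this targets is shown below, written against plain NumPy so it runs anywhere. The assumption, labeled in the code, is that on a machine with cuNumeric installed the only edit would be swapping the import line for `import cunumeric as np`, and that cuNumeric covers every function used here. The Jacobi solver and its data are illustrative, not taken from the slides.

```python
# Assumption: with cuNumeric installed, this line would become
# `import cunumeric as np` and the rest of the file would stay unchanged.
import numpy as np

def jacobi(a, b, iterations=50):
    """Solve a @ x = b for a diagonally dominant matrix by Jacobi iteration."""
    d = np.diag(a)          # diagonal entries as a vector
    r = a - np.diag(d)      # off-diagonal remainder
    x = np.zeros_like(b)
    for _ in range(iterations):
        x = (b - r @ x) / d
    return x

a = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([9.0, 12.0])
x = jacobi(a, b)
```

Because the solver is expressed entirely as whole-array operations, a drop-in backend has room to parallelize each step without the user restructuring the loop.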
11. DATA-CENTER-SCALE COMPUTE ENGINE
Implicitly Extract
Instruction-Level Parallelism
Overlap Memory Latency and
Computation
Manage Coherence of Memory
Hierarchy
Dynamic Scheduling
[Diagram: CPU pipeline with In-Order, Out-Of-Order, and In-Order sections. Labels: Branch, Fetch, Instruction Decode/Rename, Dispatch, Instruction Buffer, execution units Int, Int, FP, FP, L/S, L/S, Reorder Buffer, Retire]
22. SOFTWARE
DEFINED
CLOUD-NATIVE
DISAGGREGATED
COMPUTING
SCALE UP &
SCALE OUT
ZERO-TRUST
DOCA 1.0
Accelerated Secure Bare-Metal Cloud
Crypto Storage Acceleration
Software-Defined Networking
De/Compression Congestion Control
RegEx
NVIDIA DOCA 1.0
Accelerated Data Center Infrastructure
Offload, Accelerate, and Isolate Data Center
Infrastructure with Accelerated Networking, Security,
Storage, and Management Applications
NVIDIA BlueField DPU
23. 1400
DOCA Developers
108
New DOCA APIs
Deep Packet Inspection Intrusion Detection
Load Balancers
Telemetry Security Groups
Firewall
DOCA 1.2
Zero-Trust Security Framework
Service
Containers
Crypto Storage Acceleration
Software-Defined Networking
De/Compression Congestion Control
RegEx
ANNOUNCING
NVIDIA DOCA 1.2
Accelerated Data Center Infrastructure
NEW Zero-Trust Security Framework
Extend Threat Protection to Every Touch Point
Hardware & Software Authentication, Line-Rate Data
Encryption, Distributed Firewall, and Smart Telemetry
Security Groups and Virtual Private Cloud Isolation
NVIDIA BlueField DPU
24. Deep Packet Inspection Intrusion Detection
Load Balancers
Telemetry Security Groups
Firewall
DOCA 1.2
Zero-Trust Security Framework
Service
Containers
Crypto Storage Acceleration
Software-Defined Networking
De/Compression Congestion Control
RegEx
NVIDIA BlueField DPU
ANNOUNCING
CYBERSECURITY LEADERS EXTEND
ZERO-TRUST WITH NVIDIA
BLUEFIELD
Deploy Security-as-a-Service with DOCA 1.2
Extend Security from Perimeter to Edge
Security Processing on BlueField Offload CPU Burden
26. Anomaly Detection
Machine
Human
Machine
Machine
Machine
Machine
Human
Humans and Machines
Across the Enterprise
Post-Processing
Pre-Processing
Inference Requests
Apache Kafka
TRITON
Log Data
NVIDIA MORPHEUS
ANNOUNCING
NVIDIA MORPHEUS
Accelerated AI Platform for Next Gen SIEM
Built on NVIDIA RAPIDS and NVIDIA AI
600X Faster Data Processing – Monitor Every User and
Machine-Generated Data for Anomalous Behavior
Detect Anomalies with 10s of Millions of AI Models
in Real-Time
Pre-Trained Models for User Activity Fingerprinting and
Phishing Detection
Early Access 2 Available Now
nvidia.com/morpheus
29. Accelerated Computing Data Center Scale
AI
MILLION-X LEAP
[Chart: single-threaded performance, 1980 to 2020, log scale from 10^1 to 10^9, with growth of 1.5X per year followed by 1.1X per year]
MACHINE
LEARNING
SCALE
UP & OUT
ACCELERATED
COMPUTING
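The gap between the two single-threaded growth rates is easier to see as compound arithmetic. The short sketch below uses only the 1.5X and 1.1X per-year figures from the chart. The function names and the choice of a 10-year window are illustrative assumptions.

```python
import math

def compound(rate_per_year, years):
    """Total speedup after compounding a yearly rate."""
    return rate_per_year ** years

def years_to_reach(factor, rate_per_year):
    """Years of compounding needed to reach a target speedup."""
    return math.log(factor) / math.log(rate_per_year)

decade_fast = compound(1.5, 10)          # about 57.7x per decade
decade_slow = compound(1.1, 10)          # about 2.6x per decade
years_fast = years_to_reach(1e6, 1.5)    # about 34 years to a million-x
years_slow = years_to_reach(1e6, 1.1)    # about 145 years to a million-x
```

At 1.1X per year a million-x gain from single-threaded performance alone would take well over a century, which is the motivation for stacking accelerated computing, scale-up and scale-out, and machine learning.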
31. MILLION-X DRUG DISCOVERY
Refinement in Simulation Docking & Virtual Screening Physics-Based Simulation
Imaging & Crystallography
Exhaustive
Search
BINDING
FREE ENERGY
CHEMICAL COMPOUNDS
PROTEIN STRUCTURE OF
DISEASE TARGET
43. OMNIVERSE FOR DESIGN COLLABORATION
Designer #1’s World
Designer #3’s World
Internet
Physical World
Designer #2’s World
Shared World
Path Tracing
MDL
Physics
AI
44. OMNIVERSE FOR DIGITAL TWIN
Factory Designer
Robot Gym
Internet
Physical World
Robot Gym
Factory Digital Twin
Path Tracing
MDL
Physics
AI
45.
46. ANNOUNCING NEW OMNIVERSE FEATURES
SHOWROOM
Available in Beta
FARM
Available in Beta
AR
Available in Beta
VR
Coming Soon
48. ANNOUNCING
EARLY ACCESS BENTLEY ITWIN FOR
NVIDIA OMNIVERSE
Physically-Accurate 4D Visualization of Infrastructure
Digital Twins
Supports 4D Design Review and Construction Simulation
Early Access Now Available
bentley.com/4DVisualization
49. SIEMENS ENERGY BUILDS HRSG DIGITAL TWIN IN OMNIVERSE
Boiler Design
Flow Simulation
Internet
Physical World
Training Data
Plant Digital Twin
Path Tracing
MDL
Physics
AI
50.
51. BMW GROUP BUILDS FACTORY DIGITAL TWINS IN OMNIVERSE
Factory Planning
Robot Gym
Internet
Physical World
Digital Human Training
Factory Digital Twin
Path Tracing
MDL
Physics
AI
52.
53. ERICSSON BUILDS CITY DIGITAL TWIN IN OMNIVERSE
City Planning
Simulation
Internet
Physical World
Materials
5G Network Digital Twin
Path Tracing
MDL
Physics
AI
56. GRAPH NEURAL NETWORKS CAPTURE INSIGHTS FROM INTERCONNECTED DATA
90B Relationships in a Social Network
1.1B Transactions a Day
40B Molecules in Chemical Databases
DRUG DISCOVERY SOCIAL CONNECTIONS | FRAUD DETECTION
57. ANNOUNCING
DGL ACCELERATION WITH CUDA-X
GPU-Accelerated GNN Workflow
GNN for Molecule Reaction Prediction,
Node Classification, Knowledge Graphs,
and Model Explainability
CUDA-Optimized Reference Examples for
SE3-Transformer, R-GCN, and GraphSage
Early Access in December
ngc.nvidia.com
cuDF cuGraph DGL with CUDA-X
Text
Images
Relationships
Sub-Graph
Construction Graph to GNN
Graph
Construction
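For context on what a GNN layer computes, here is a minimal sketch of one message-passing step with mean aggregation, the core idea behind GraphSage-style models. It is plain NumPy, not DGL or CUDA-X code. The function name `mean_aggregate` and the toy graph are assumptions, and a real layer would also apply a learned weight matrix and a nonlinearity.

```python
import numpy as np

def mean_aggregate(adjacency, features):
    """One message-passing step: each node averages itself and its neighbours."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1, keepdims=True)
    return (a_hat @ features) / degree

# Path graph 0 - 1 - 2 with one scalar feature per node.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
features = np.array([[1.0], [2.0], [6.0]])
updated = mean_aggregate(adjacency, features)
```

After one step the middle node already carries information from both ends of the path. Stacking steps lets information travel further through the graph.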
58. PROCESSING THE LARGEST GRAPH NEURAL NETWORKS
300B pins in graphs with billions of nodes
and billions of edges
5X faster training on graphs with over
10M nodes and over 100M edges from 24hrs to 5hrs
Improving fraud detection over
billions of transactions
60. ANNOUNCING NVIDIA NEMO MEGATRON
Megatron 530B
NEMO MEGATRON
Automated
Data Curation
Distributed
Training
Customer’s Data
Custom
Megatron 530B
61. ANNOUNCING NVIDIA TRITON MULTI-GPU MULTI-NODE INFERENCE
Triton Inference Server
Magnum IO
(Multi-GPU, Multi-Node)
Query Response
Application
Response Time: 1/2 Second on 2 DGX A100
Response Time: > 1 Minute on Dual Socket CPU Server
62. JAPANESE
Global Ecommerce
ENTERPRISE DIGITAL
WORKFLOWS
SWEDISH
Finance, Healthcare,
and Manufacturing
PORTUGUESE
AI Assistant
CHINESE
Customer Service
VIETNAMESE
Radiologists and
Telehealth AI Service
INTENT RECOGNITION,
TRANSLATION, AND CHAT
KOREAN
Chatbots and Call Centers
LARGE LANGUAGE MODELS TAKING HPC MAINSTREAM
63. THANK YOU TO THOSE PRESENTING AT GTC!
RETAIL CLOUD ENTERPRISE SaaS
TELECOMMUNICATIONS
INDUSTRIAL HEALTHCARE CONSUMER INTERNET
AUTO
64. LEADING COMPANIES RUNNING NVIDIA AI
RETAIL CLOUD ENTERPRISE SaaS
TELECOMMUNICATIONS
INDUSTRIAL HEALTHCARE CONSUMER INTERNET
AUTO
65. ANNOUNCING
MICROSOFT TEAMS ADOPTS
NVIDIA AI WITH AZURE
COGNITIVE SERVICES
Nearly 250 Million Monthly Active Microsoft
Teams Users
Live Transcription and Captions in 28 Languages
Triton Inference Server in Azure Cognitive Services
66. AI INFERENCE IS HARD
PROCESSORS
AI Inference
DEPLOYMENT
PLATFORMS
Cloud On-Prem Edge Embedded
T4 GPU Arm CPU
A100 GPU
V100 GPU x86 CPU
FRAMEWORKS APP CONSTRAINTS
Real Time Batch Streaming
MODELS
CNN
GNN Decision Trees
RNN Transformers
67. ANNOUNCING
TENSORRT INTEGRATED WITH
PYTORCH AND TENSORFLOW
Accelerate In-Framework Inference with TensorRT with
Just 1 Line of Code
3X Faster
Supports Every Workload
FP32, TF32, FP16, INT8
Available for Download Today
ngc.nvidia.com
68. ANNOUNCING NVIDIA TRITON
WITH FOREST INFERENCING
ML and DL in One Application
Tree Models (XGBoost, Random Forest, LightGBM) Are
Ubiquitous
Large Tree Ensembles Push CPUs Beyond
Response-Time Limits
Ensembling with ML, DL, and More Complex Models is the
Future with Fraud Detection
[Chart: Detection Rate (0% to 80%) against Max Response Time (0 ms to 3.5 ms, with 1.5 ms marked)]
[Diagram: XGBOOST MODEL with example splits "Transaction > $50?" and "Time after midnight?"]
69. ANNOUNCING NVIDIA TRITON
WITH FOREST INFERENCING
ML and DL in One Application
Tree Models (XGBoost, Random Forest, LightGBM) Are
Ubiquitous
Large Tree Ensembles Push CPUs Beyond
Response-Time Limits
Ensembling with ML, DL, and More Complex Models is the
Future with Fraud Detection
[Diagram: XGBOOST MODEL with example splits "Transaction > $50?" and "Time after midnight?"; four labels reading 7K]
GOAL: Increase Detection Rate Within 1.5 ms
[Chart: Detection Rate (0% to 80%) against Max Response Time (0 ms to 3.5 ms, with 1.5 ms marked)]
70. ANNOUNCING NVIDIA TRITON
WITH FOREST INFERENCING
ML and DL in One Application
Unified Deployment Engine for DL and ML
Inference Random Forests, GBDTs, and
Decision Trees
Deploy on CPU and GPU
Process Million+ Node Tree Models with Low-Latency
Fraud Detection, Recommender Systems, Risk Assessment,
and Predictive Maintenance
Available for Download Today
ngc.nvidia.com
[Diagram: GIANT XGBOOST MODEL; four labels reading 7K and two labels reading 1M Nodes]
[Chart: Detection Rate (0% to 80%) against Max Response Time (0 ms to 3.5 ms, with 1.5 ms marked); region labeled UNACCEPTABLE RESPONSE TIME]
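To make "forest inference" concrete, the sketch below evaluates a tiny two-tree ensemble using the two example splits from the diagram. It is a plain-Python illustration, not Triton's forest inference implementation or XGBoost's model format. The `Node` class, the feature names, and every leaf score are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: Optional[str] = None   # None marks a leaf
    threshold: float = 0.0
    yes: Optional["Node"] = None    # branch taken when value > threshold
    no: Optional["Node"] = None
    score: float = 0.0              # leaf value (hypothetical fraud score)

def evaluate(node, sample):
    """Walk one tree from the root to a leaf and return the leaf score."""
    while node.feature is not None:
        node = node.yes if sample[node.feature] > node.threshold else node.no
    return node.score

def forest_score(trees, sample):
    """Average the leaf scores of every tree in the ensemble."""
    return sum(evaluate(tree, sample) for tree in trees) / len(trees)

# "Transaction > $50?" and "Time after midnight?" as single-split trees.
amount_tree = Node("amount_usd", 50.0, yes=Node(score=0.75), no=Node(score=0.25))
midnight_tree = Node("after_midnight", 0.5, yes=Node(score=0.5), no=Node(score=0.0))
trees = [amount_tree, midnight_tree]

risky = forest_score(trees, {"amount_usd": 120.0, "after_midnight": 1})
routine = forest_score(trees, {"amount_usd": 20.0, "after_midnight": 0})
```

Every tree is traversed independently for every sample, so the cost grows with both tree count and depth. That is why million-node ensembles can exceed a tight response-time budget on CPUs.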
71. ANNOUNCING
TRITON INFERENCE SERVER 2.15
NEW Integration into AWS SageMaker and AliCloud –
Now All Major Frameworks, Major Clouds, and
AI Platforms
NEW Support for Arm – Now Inference on Every
Generation of GPUs, x86 CPUs, and Arm
NEW Model Analyzer Optimizes for App
QoS Requirements
NEW Forest Inference | NEW Distributed Multi-GPU,
Multi-Node Inference
Optimal
Model Config
TensorFlow PyTorch
TensorRT
RAPIDS
OpenVINO ONNX RT
Triton Inference Server
Application
Ampere x86 CPU
Volta Arm CPU
Turing
QoS Requirements
Triton
Model Analyzer
72. BOOST THROUGHPUT OF MODERN DATA CENTERS WITH NVIDIA TRITON
TensorRT 1 TensorRT 4 TensorRT 8.2
ACCELERATES EVERY WORKLOAD WORLD-CLASS RESPONSE TIME AND THROUGHPUT
Recommenders: 12X, < 1 sec
Reinforcement Learning: 10X
Speech Recognition: 583X, < 100ms
Computer Vision: 36X, < 7ms
Text-to-Speech: 178X, < 100ms
NLP: 21X, < 50ms
CLASSIFICATION CLASSIFICATION
DETECTION
SEGMENTATION
RECOMMENDERS
CLASSIFICATION
DETECTION
SEGMENTATION
RECOMMENDERS
RL
FRAUD DETECTION
TEXT-TO-SPEECH
SPEECH RECOGNITION
SENTIMENT ANALYSIS
TRANSLATION
SENTENCE COMPLETION
Q & A
73. Software-Defined Real-Time Secure Hybrid Cloud
Deploy &
Orchestrate Fleet
High-Speed IO
Networking
Storage
Data
Processing
Signal &
Image
Processing
DNN
PINN
Graphics Streaming
EDGE AI AUTOMATING EVERY INDUSTRY
RETAIL: 15M Stores (CV, SpeechAI, NLU, RecSys)
HEALTHCARE: 160K Hospitals (CV, SpeechAI, NLU, Robotics)
MANUFACTURING: 10M Factories (CV, SpeechAI, NLU, Robotics)
FAST FOOD: 7M Restaurants (CV, SpeechAI, NLP)
75. Fleet Command
Aerial 5G
NVIDIA AI
NVIDIA Metropolis
With Unified Computing Framework
NVIDIA METROPOLIS AI EDGE
3rd Party
L2+
3rd Party
5G CORE
NVIDIA
L1
DATA CENTER COMMAND CENTER
Video In
DeepStream
on EGX
Triton
on EGX
Triton
on EGX
DeepStream
on EGX
Metadata
Display
DeepStream
on EGX
DeepStream
on EGX
MEDIA
HANDLING
DETECTION CLASSIFICATION SMOOTHING
VISUALIZATION
TRACKING
76. Fleet Command
Aerial 5G
NVIDIA AI
NVIDIA Metropolis
With Unified Computing Framework
MAVENIR
L2+
MAVENIR
5G Core
NVIDIA
L1
Powered by NVIDIA Metropolis AI-on-5G Edge Platform
ANNOUNCING MAVENIR MAVEDGE-AI
Mavenir MAVedge-AI Application
DATA CENTER COMMAND CENTER
Video In
DeepStream
on EGX
Triton
on EGX
Triton
on EGX
DeepStream
on EGX
Metadata
Display
DeepStream
on EGX
DeepStream
on EGX
MEDIA
HANDLING
DETECTION CLASSIFICATION SMOOTHING
VISUALIZATION
TRACKING
85. PROJECT MAXINE WITH OMNIVERSE AVATAR
RIVA
MEGATRON 530B
MERLIN
DIALOG MANAGER
NV AVATAR
NV CV
Camera In
Mic In/Out
Graphics /
Video Out
86. MEGATRON 530B
MERLIN
DIALOG MANAGER
RIVA
Mic In/Out
ANNOUNCING
NVIDIA RIVA SPEECH AI
World Class Quality and Response Time
SDK to Customize for Use Case and Unique Voice for
Brand Virtual Assistant
Train New Voice with Only 30 Mins of Speech Data
Human-Like Expressivity and Fine-Grained Control
Deploy in Cloud, On-Prem, Edge, and Embedded
Enterprise Support Available Q1 2022 Globally
developer.nvidia.com/riva
89. PROJECT TOKKIO WITH OMNIVERSE AVATAR
Video Out
Video In
Audio In
Audio Out
Megatron 530B
Triton MGMN
on DGX
ASR
Triton on EGX
DL FACE
TRACKER
DeepStream
on EGX
AUD2FACE
OMNIVERSE
Zero-Shot DM
Triton on EGX
Merlin
Triton on EGX
TTS
Triton on EGX
RIVA
AVATAR
SIMULATION
OMNIVERSE
90.
91. PROJECT MAXINE WITH OMNIVERSE AVATAR
Video Out
Video In
Audio In
Audio Out
with Translation
AUDIO
DENOISE
Triton on EGX
GAZE REPOSE
DeepStream
on EGX
VID2FACE
Triton on EGX
RIVA
SPEECH AI
Triton on EGX
AUD2FACE
OMNIVERSE
FACE TRACKER
DeepStream
on EGX
POSE & MESH
Triton on EGX
94. STRYKER AIRO TruCT
Intraoperative CT
JOHNSON & JOHNSON AURIS
Robotic Endoscopy
MEDTRONIC HUGO
Robotic-Assisted Surgery
INTUITIVE SURGICAL ION
Robotic-Assisted Lung Biopsy
2M Devices | 16K Companies | 10K Modalities
MEDICAL INSTRUMENTS INTEGRATE AI AND ROBOTICS
95. RENDERING
OV on RTX
DATA
PROCESSING
cuCIM on EGX
ZERO-SHOT NLU
DIALOG
MANAGER
Triton on EGX
RIVA ASR
Triton on EGX
Audio
Sensor
SENSOR
PROCESSING
CUDA on EGX
IMAGE
PROCESSING
Triton on EGX
STREAM
DISPLAY
CloudXR on RTX
PHYSICS
PROCESSING
CUDA on EGX
ANNOUNCING
NVIDIA CLARA HOLOSCAN
AI COMPUTING INSTRUMENTS
PLATFORM
Stream-Computing Platform for High-Throughput
Signal, Data, AI, and Graphics Processing
Run in Data Center, Embedded Instruments,
or Hybrid
Remotely Update, Orchestrate, and Monitor Fleet
Platform for SaaS Model
Available November 15
developer.nvidia.com/clara-holoscan-sdk
96. ANNOUNCING NVIDIA AGX ORIN
Computational Sensing Instrument Platform
NVIDIA A6000
NVIDIA ConnectX-7
NVIDIA Orin
101. ANNOUNCING
NVIDIA ISAAC ROS
NEW Isaac ROS GEM Brings NVIDIA Robotics AI
to ROS Community
Accelerate ROS-Native Packages up to 10X
Isaac Sim Out-of-Box ROS Support
NEW Isaac Sim Replicator for Synthetic Data Generation
Isaac Sim Replicator
Omniverse Farm
TAO
ISAAC GEMS
AUTO-LABELED SYNTHETIC DATA
ISAAC SIM
ISAAC ROS APPLICATION
STEREO DEPTH
SGM
SEGMENTATION
DNN
HUMAN POSE
ESTIMATION
Download Isaac ROS GEMS
developer.nvidia.com/isaac-ros-gems
102. ANNOUNCING
OMNIVERSE REPLICATOR FOR ISAAC
SIM
NEW Isaac ROS GEM Brings NVIDIA Robotics AI
to ROS Community
Accelerate ROS-Native Packages up to 10X
Isaac Sim Out-of-Box ROS Support
NEW Isaac Sim Replicator for Synthetic Data Generation
106. Functionally Safe Production Ready AV Platform
ANNOUNCING DRIVE HYPERION 8 GA
Software Defined Vehicle
Dual Orin X Standard Form Factor
Production Sensor Set
DriveWorks Acceleration Libraries
DRIVE AV Software
Tools for OEM Adaptation
Available Now
nvidia.com/drive-hyperion
2 x NVIDIA DRIVE Orin
Systems-on-a-Chip (SoCs)
113. NVIDIA
Accelerated Computing
Full Stack, 3 Chips, Data Center Scale
30 Million CUDA Downloads
150 SDKs
$100 Trillion Industry Served
Nemo
Megatron
Triton
ReOpt
cuQuantum
cuNumeric
Morpheus
Metropolis
Clara
Holoscan
Isaac
Maxine DRIVE
RIVA
DGL Modulus Omniverse
Launchpad AGX Orin Hyperion 8
Quantum-2