SlideShare a Scribd company logo
1 of 43
Download to read offline
The Impact of AI on HPC … “Everywhere”
Rob Farber
CEO TechEnablement.com
Contact info AT techenablement.com for consulting, teaching, writing, and other inquiries
GPUs - From games to supercomputers to Deep
Learning. CPUs – General purpose to DL devices
GPUs evolved from pushing pixels CPUs evolved from
running applications
• Failure of Dennard’s scaling laws
caused switch to multicore
• Parallelism via GPUs and Intel Xeon
Phi is now the norm
• CPUs are now competitive!
ZZZ
CUDA
OpenCL
OpenCL
100 million+ GPUs
<mumble>
Larrabee zZz
1/2 Billion+ GPUs
Intel Xeon
Scalable
KNL, KNM,
…
CPUs are also good for AI
• Don’t get locked into thinking “have to have a GPU”
Jump on the AI/Machine learning bandwagon
1940 1950 1960 1970 1980 1990 2000 2010 2020
Electronic Brain
(1943)
Perceptron
(1957)
XOR
(1969)
XOR
Multilayer
Perceptron
Backprop
(1986)
“Computational Universality” An XOR ANN
• The example of XOR nicely emphasizes the
importance of hidden neurons:
• They re-represent the input such that the
problem becomes linearly separable.
• Networks with hidden units can implement any
Boolean function -> Computational Universal
devices!
• Networks without hidden units cannot learn XOR
• Cannot represent large classes of problems
G(x)
NetTalk
Sejnowski, T. J. and Rosenberg, C. R. (1986) NETtalk: a parallel
network that learns to read aloud, Cognitive Science, 14, 179-211
http://en.wikipedia.org/wiki/NETtalk_(artificial_neural_network)
500 learning loops Finished
"Applications of Neural
Net and Other Machine
Learning Algorithms to
DNA Sequence Analysis",
(1989).
Big hardware idea 1: SIMD
High-performance from the past
• Space and power efficient
• Long life via a simple model
6
16,384 GPU results 2014
courtesy ORNL
Farber: general SIMD mapping :
“Most efficient implementation to date”
(Singer 1990), (Thearling 1995)
The Connection Machine
Turing TU102 (Training)
$30M
$1-3K
2013 Results courtesy Intel and NVIDIA
1 TF/s 13 PF/s
A general SIMD mapping:
Optimize(LMS_Error = objFunc(p1, p2, … pn))
7
Examples
0, N-1
Examples
N, 2N-1
Examples
2N, 3N-1
Examples
3N, 4N-1
Step 2
Calculate partials
Step1
Broadcast
parameters
Optimization Method
(Powell, Conjugate Gradient, Other)
Step 3
Sum partials to get
energy
GPU 1 GPU 2 GPU 3
p1,p2, … pn p1,p2, … pn p1,p2, … pn p1,p2, … pn
GPU 4
Host
See a path to exascale
(MPI can map to thousands of GPU or Processor nodes)
8
Always report “Honest Flops”
Application to Bioinformatics
Internal
connections
The phoneme to
be pronounced
NetTalk
Sejnowski, T. J. and Rosenberg, C. R. (1986)
NETtalk: a parallel network that learns to read
aloud, Cognitive Science, 14, 179-211
http://en.wikipedia.org/wiki/NETtalk_(artificial_neural
_network)
Internal
connections
t te X A TC G T
"Applications of Neural Net and Other Machine
Learning Algorithms to DNA Sequence Analysis",
A.S. Lapedes, C. Barnes, C. Burks, R.M. Farber, K.
Sirotkin, Computers and DNA, SFI Studies in the
Sciences of Complexity, vol. VII, Eds. G. Bell and
T. Marr, Addison-Wesley, (1989).
T|F Exon region
Predicting binding affinity
(The closer you look the greater the complexity)
Electron Microscope
The question for computational biology
How do we know you
are not playing
expensive computer
games with our money?
Utilize a blind test
(Example reference when starting my drug discovery
company)
Internal
connections
A0
Binding
affinity for a
specific
antibody
A1 A2 A3 A4 A5
Possible hexamers
206 = 64M
1k – 2k pseudo-random
(hexamer, binding)
affinity pairs
Approx. 0.001%
sampling
“Learning Affinity Landscapes: Prediction of Novel Peptides”, Alan
Lapedes and Robert Farber, Los Alamos National Laboratory
Technical Report LA-UR-94-4391 (1994).
Hill climbing to find high affinity
Internal
connections
A0
𝐴𝑓𝑓𝑖𝑛𝑖𝑡𝑦 𝐴𝑛𝑡𝑖𝑏𝑜𝑑𝑦
A1 A2 A3 A4 A5
Learn:
𝐴𝑓𝑓𝑖𝑛𝑖𝑡𝑦 𝐴𝑛𝑡𝑖𝑏𝑜𝑑𝑦
= 𝑓 𝐴0, … , 𝐴5
𝑓(F,F,F,F,F,F)
𝑓(F,F,F,F,F,L)
𝑓(F,F,F,F,F,V) 𝑓(F,F,F,F,L,L)
𝑓(P,C,T,N,S,L)
Predict P,C,T,N,S,L has the
highest binding affinity
Confirm
experimentally
Time series
Iterate
𝑋𝑡+1 = 𝑓 𝑋𝑡, 𝑋𝑡−1, 𝑋𝑡−2, …
𝑋𝑡+2 = 𝑓 𝑋𝑡+1, 𝑋𝑡, 𝑋𝑡−1, …
𝑋𝑡+3 = 𝑓 𝑋𝑡+2, 𝑋𝑡+1, 𝑋𝑡, …
𝑋𝑡+4 = 𝑓 𝑋𝑡+3, 𝑋𝑡+2, 𝑋𝑡+1, …
Internal
connections
Xt Xt-1
Learn:
𝑋𝑡+1 = 𝑓 𝑋𝑡, 𝑋𝑡−1, 𝑋𝑡−2, …
Xt-2 Xt-3 Xt-4 Xt-5
Xt+1
Works great! (better than other
methods at that time)
"How Neural Nets Work", A.S. Lapedes, R.M. Farber,
reprinted in Evolution, Learning, Cognition, and Advanced
Architectures, World Scientific Publishing. Co., (1987).
Designing ANNs for Integration and
Bifurcation analysis – “training a netlet”
"Identification of Continuous-Time Dynamical Systems:
Neural Network Based Algorithms and Parallel
Implementation", R. M. Farber, A. S. Lapedes, R.
Rico-Martinez and I. G. Kevrekidis, Proceedings of the 6th
SIAM Conference on Parallel Processing for Scientific
Computing, Norfolk, Virginia, March 1993.
ANN schematic for continuous-time
identification. (a) A four-layered ANN based on
a fourth order Runge-Kutta integrator. (b) ANN
embedded in a simple implicit integrator.
(a) Periodic attractor of the Van der Pol oscillator for g= 1.0,
d= 4.0 and w = 1.0. The unstable steady state in the interior
of the curve is marked +. (b) ANN-based predictions for the
attractors of the Van der Pol oscillator shown in (a).
Dimension reduction
• The curse of dimensionality
• People cannot visualize data beyond 3D + color
• Search volume rapidly increases with dimension
• Queries return too much data or no data
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
I I I I I
B
Sensor 1 Sensor 2 Sensor 3 Sensor N
Sensor 1 Sensor 2 Sensor 3 Sensor N
Sensor 1 Sensor 2 Sensor 3 Sensor N
Sensor 1 Sensor 2 Sensor 3 Sensor N
Sensor 1 Sensor 2 Sensor 3 Sensor N
X Y Z
Jump on the AI/Machine learning bandwagon
1940 1950 1960 1970 1980 1990 2000 2010 2020
Electronic Brain
(1943)
Perceptron
(1957)
XOR
(1969)
XOR
Multilayer
Perceptron
Backprop
(1986)
(2007)
Deep Neural
Network
(2006)
AGAIN!
Wildly
Overpromised
All the previous work is now orders of magnitude faster!
• Time series
• Bifurcation analysis
• Locally Weighted Linear Regression (LWLR)
• Neural Networks
• Naive Bayes (NB)
• Gaussian Discriminative Analysis (GDA)
• k-means
• Logistic Regression (LR)
• Independent Component Analysis (ICA)
• Expectation Maximization (EM)
• Support Vector Machine (SVM)
• Others: (MDS, Ordinal MDS, etcetera)
People are “rediscovering” ANNs in an Exascale
framework
This is not an AI history talk – Deep learning is
commercially viable encompassing much of what we do
Speech recognition in noisy
environments (Siri, Cortana,
Google, Baidu, …)
Better than human
face recognition
Self-driving cars
• Internet Search • Robotics • Self guiding drones • Much, much, more
Plus everything discussed previously: time series, system modeling, drug
discover, and more
A change in the scientific mindset: ANNs in
physical models
• Previously: ANNs are black boxes – how can
you guarantee they won’t add non-physical
artifacts to my model?
• Today: CERN has shown that GANs
(Generative Adversarial Networks) can learn
physics based distributions that cannot be
distinguished from real distributions
Longitudinal
shower shape for
400 GeV electron
Longitudinal
shower shape for
100 GeV electron
It does not matter which
generator one uses so long as
the generated distributions
are indistinguishable from
the real data distributions –
Sophia Vallecorsa, CERN
physist Openlab
Time to generate an electron shower: Full Monte Carlo 17000.00 msec
GAN running on GTX 1080 00000.04 msec
ANNs evaluate simulation results
• Deep learning ANNs recognize storm types in faster-than-realtime
climate models.
• Too much data for a human
• Validation via historical data
• Evaluate storm distributions in the future predictions
Expect significant algorithm and HW retooling
Today: General-purpose devices like CPUs and GPUs
Tomorrow: FPGA and ASIC Data Flow devices
Near term: Neuromorphic will change everything
Far term: Quantum: Seems to be ideal for training (once we have the qubits
CPU GPU
FPGA Chips
Roughly four different camps
Ubiquitous Low precision AI hardware is
changing HPC
• NLAFET: Reduced precision BLAS
• Synchronization-reducing algorithms
• Communication-reducing algorithms
• Mixed precision methods (2x speep of ops and 2x speed for data movement)
• Autotuning
• Fault resilient algorithms
• Reproducibility of results
• LIBXSMM: batched reduced precision small matrix operations => very high percentage of peak!
Scientists reexamining precision
• The HiCMA library: million by million matrix operations in gigabytes of
workstation memory
DAGs are also changing HPC
• NLAFET uses DAG scheduling to support CPUs, GPUs, and Data Flow
architectures
From HPC centers to IoT edge devices
HPC centers will soon be part of a data network, not
the center of the HPC effort.
• Read the BDEC report (Big Data and Extreme-scale
Computing)
• http://www.exascale.org/bdec/sites/www.exascale.org.bdec/files/whitepapers/bdec_pathways.pdf.
• Look at Frontera at TACC – an “agile” HPC supercomputer!
• Largest academic supercomputer at a University
• Leverages the older Wrangler effort data analytics system
• The Argonne lab-wide data service based on the Petrel
service.
• More …
Not bad for surface fitting – but what’s
next?
• Lapedes and Farber, “How Neural Networks
Work,” showed that during training the neural
network is actually fitting a ”bumpy” multi-
dimensional surface.
• Inference:
– Interpolates between points on the surface, or
– Extrapolates out from points on the surface.
Birthday card:
“Herpes Bathtub!”
Inside the card:
“Happy Birthday!
(Damn voice recognition)”
Neuromorphic – the next step!
• “[We] demonstrate that neuromorphic computing … can
implement deep convolution networks that approach
state-of-the-art classification accuracy across eight
standard datasets encompassing vision and speech,
perform inference while preserving the hardware’s
underlying energy-efficiency and high throughput.”
Accuracy of different sized networks
running on one or more TrueNorth chips to
perform inference on eight datasets. For
comparison, accuracy of state-of-the-art
unconstrained approaches are shown as
bold horizontal lines (hardware resources
used for these networks are not indicated).
Build big and very, very power efficient networks
(e.g. “brains”)
A system roughly the neuronal size of a rat brain
PNAS (8/9/16) Convolutional networks for fast, energy-efficient neuromorphic computing, Steven K. Essera et. al.
• Use integrate-and-fire spiking neurons
• Data sets run between 1,200 and 2,600 frames/s and using between 25 and 275 mW
• (effectively >6,000 frames/s per watt)
DATA
It’s easy “From “Hello World” to exascale”
double objFunc( ... )
{
double err=0.;
#pragma acc parallel loop reduction(+:err)
#pragma omp parallel for reduction(+ : err)
{
err = 0.;
for(int i=0; i < nExamples; i++) {
// transform
float d=myFunc(i, param, example, nExamples, NULL);
//reduce
err += d*d;
}
}
return sqrt(err);
}
int main()
{
cout << "Hello World" << endl;
// load data and initialize parameters
init();
#pragma acc data 
copyin(param[0:N_PARAM-1]) 
pcopyin(example[0:nExamples*EXAMPLE_SIZE-1])
{
optimize( objFunc ); // the optimizer calls the objective function
}
return 0;
}
Usage Taxonomy
Organizing HPC + AI Convergence
Operation
HPC + AI couple simulation with live data in real
time detection/control system
Experimental/simulated data is used to
train a NN, where resulting inference
enging is used to for real-time
detection/control of an experiment or
clinical delivery system
The NN is improved as new simulated /
live data is acquired
Experimental/simulated data is used to
train a NN that is used to replace all or
significant runtime portions of a
conventional simulation
The NN is improved continuously as
new simulated / live data is acquired
Experimental/simulated data used to
train a NN which steers
simulation/experiment btwn runs
The steering NN can be trained
continuously as new simulated / live data
is acquired
Augmentation
HPC + AI combined to improve simulation time
to science > orders of magnitude
Modulation
HPC + AI combined to reduce the number of
runs needed for a parameter sweep
Potential for Breakthroughs in Scientific Insight
So much more, you have been great, Thank You!
Rob Farber
CEO TechEnablement.com
Contact info AT techenablement.com for consulting, teaching, writing, and other inquiries
Numerous use cases I don’t have time
to present
Background
The aLIGO (Advanced Laser Interferometer Gravitational Wave
Observatory) experiment successfully discovered signals
proving Einstein’s theory of General Relativity and the existence
of cosmic Gravitational Waves. While this discovery was by
itself extraordinary it is a step to the ultimate goal to combine
multiple observational data sources that not only hear but also
see to the complete spectrum of data in real time.
Challenge
The initial a LIGO discoveries were successfully completed using
classic data analytics. The processing pipeline used hundreds of
CPU’s where the bulk of the detection processing was done
offline. The latency is far outside the range needed to activate
resources, such as optical, infrared or radio telescopes which
observe phenomena in the electromagnetic spectrum in time
to “see” what aLIGO can “hear”.
Solution
A DNN was developed and trained with simulated data and
verified using from the CACTUS/Einstein Toolkit. The DNN was
shown to produce better accuracy with latencies 4500x faster
than the original CPU based pattern matching waveform
detection.
Impact
Faster and more accurate detection of gravitational waves with
the potential to steer other observational data sources.
“HEARING” GRAVITY
IN REAL TIME
Background
The “Grand Challenge” of fusion energy would offer the humankind changing
opportunity to provide clean, safe energy for millions of years.
ITER is a $25B international experiment to develop the prototype to demonstrate
commercially viable fusion reactor.
Challenge
The plasma in a fusion reactor is highly turbulent at the edges of the flow, and
disruptions can occur that break the magnetic confinement, which can cause damage
to the physical reactor
It is critical to predict when a disruption will occur to prevent damage and maintain
safe operation.
Traditional simulation and ML approaches were 65% to 85% accurate with 5% false
alarm rate
Solution
DL network called FRNN using Theano exceeds today's best accuracy results. It scales
to 200 Tesla K20s, and with more GPUs, can deliver higher accuracy. Current level of
accuracy is 95% prediction with 5% false alarm rate.
Impact
Vision is to operate ITER with FRNN, operating and steering experiments in real-time
to minimize damage and down-time.
CONVERGED HPC SPEEDING
THE PATH TO FUSION ENERGY
UNDERSTANDING THE
STANDARD MODEL
The simulation for the LHC particle collider is known as GEANT and it is
numerically intensive, requires significant compute cycles as part of the
scientific workflow and is extremely hard to optimize for modern CPU or
GPU type architectures. The High Luminosity LHC is expected to
generate 10X the volume of data, and the compute load on GEANT is in
turn expected to challenge future computing systems within a flat
budget profile.
A GAN (caloGAN) was constructed by a team at Lawrence Berkeley Lab,
CERN and Yale that was customized for Calorimiter Experiments similar
to ATLAS in the LHC. A training data set was developed using 100,000
GEANT events as input. 50 epochs were then used on 18xK80’s to train
the GAN with KERAS and TensorFlow.
Impact
The CaloGAN is more accurate and showed up to 5 Orders of Magnitude
performance speed up relative to the original simulation on a single K80
GPU
Background
It takes 14 years and $2.5 Billon to develop 1 drug
Higher than 99.5% failure rate after the drug discovery phase
Challenge
QC simulation is computationally expensive - it takes 5 years to
compute on CPUs
So researchers use approximations, compromising on accuracy. To
screen 10M drug candidates,.
Solution
Researchers at the University of Florida and the University of North
Carolina leveraged GPU deep learning to develop a custom framework
ANAKIN-ME, to reproduce molecular energy surfaces with super speed
(microseconds versus several minutes), extremely high (DFT) accuracy,
and at up to 6 orders of magnitude improvement in speed.
Impact
Speed and accuracy could start a revolution in computational
chemistry — and forever change the way we discover the medicines of
the future
CONVERGED HPC
REVOLUTIONIZING DRUG DISCOVERY
Converged HPC
Accelerates Drug
Discovery
The “drug discovery” phase of the development process
involves exploring all the different possible combinations of
protein molecules (targets) and drug chemical compounds to
ensure the drug will do what it’s designed to do. Classic
Molecular Dynamics simulations are very time-consuming and
expensive. Machine Learning models have been designed to
help predict probability of the target molecules interacting with
the drug chemical compounds, but still require significantly
more performance to deliver improved accuracy.
Researchers developed and trained a convolutional neural
network accelerated with NVIDIA GPU’s to improve the model
performance and prediction accuracy. Ultimately, they
improved prediction accuracy from approximately 52% to 70%
compared to other machine learning-based models (Vina
Docking). (35% relative improvement)
HUNTING “GHOST PARTICLES” WITH
DEEP LEARNING
Background
The NoVA experiment managed by Fermi lab comprises 200 scientists at
40 institutions in 7 countries. The goal is to track neutrino’s, which are
often referred to as the “Ghost Particle”, and detect oscillation which is
used to better understand how this super abundant, and elusive particle
interacts with matter.
Challenge
The experiment is built underground and is comprised of a main injector
beam and two large detector apparatus located 50 miles apart. The near
detector is 215 Tons and the Far detector is 15,000 Tons. The
experiment can be thought of as a 30 Mn pound detector that takes 2 Mn
pictures per second.
The detectability of the current experiment is proportional to the size of
the detectors, so increasing the “visibility” is complex and costly.
Solution
A DNN was developed and trained using a data set derived from multiple
HPC simulations including GENIE and GEANT using 2 K40 GPU’s. the CVN
was basd on convolutional neural networks used for image processing
Impact
The result was an overall improvement of 33% in detector mass, where
the optimized CVN signal-detection-optimized efficiency of 49% is a
significant gain over the efficiency of 35% quoted in prior art. This would
yield a net effect of a10Mn pound increase the physical detector mass
AN AI MONITOR OF
EARTH’S VITALS
The Earth’s climate has changed throughout
history, but in recent years there have been record
increases in temperature, glacial retreat and rising
sea levels. NASA Ames is using satellite imagery to
measure the effects of carbon and greenhouse gas
emissions on the planet. To do so, they developed
DeepSat — a deep learning framework for satellite
image classification trained on a GPU-powered
supercomputer. The enhanced satellite imagery
will help scientists plan to protect ecosystems and
farmers improve crop production.
AI Sheds Light on
Key Mysteries of
the Universe
Gravitational Lensing generates an image of a far
away galaxy that is distorted into rings and arcs by
the gravity of a massive object, such as galaxy
cluster. The distortions provide important clues
about how mass is distributed in space and how that
distribution changes over time, to measure
properties of dark matter, size of galaxies, and the
expansion of the universe. Today there are roughly
200 known gravitational lenses. However, scientist
expect to uncover 200,000 gravitational lenses in
the next decade. Using traditional approaches, it
can take anywhere from 2 days to 3 months to
analyze a single lens image.
SLAC scientists have for the first time used artificial
neural networks to analyze gravitational lenses in
just 10ms. This will allow researchers to unlock
many of the unsolved mysteries of the universe.
AI Accelerates the
Production of
Ultra Cold Gases
Bose-Einstein Condensate (BEC) is a state of matter
formed by cooling a gas to near-zero absolute
temperature. BEC is achieved by controlling the intensity
of the lasers to trap only the ultra-cold atoms and allowing
other atoms to escape. BECs are super sensitive to
external disturbances. This makes them suitable for very
precise measurements of things like tiny changes in
Earth’s magnetic field or gravity.
Researchers at the University of North South Whales
used AI to create a BEC gas 14 times faster than
conventional methods.
Background
Unexpected fog can cause an airport to cancel or delay flights, sometimes having
global effects in flight planning.
Challenge
While the weather forecasting model at MeteoSwiss work at a 2km x 2km
resolution, runways at Zurich airport is less than 2km. So human forecasters sift
through huge simulated data with 40 parameters, like wind, pressure,
temperature, to predict visibility at the airport.
Solution
MeteoSwiss is investigating the use of deep learning to forecast type of fog and
visibility at sub-km scale at Zurich airport.
Impact
Forecasting Fog at Zurich Airport
Multiple Examples of AI for earthquake prediction are
underway
Earthquake Prediction
Shaazam for Earthquakes

More Related Content

What's hot

01 From K to Fugaku
01 From K to Fugaku01 From K to Fugaku
01 From K to FugakuRCCSRENKEI
 
A CGRA-based Approach for Accelerating Convolutional Neural Networks
A CGRA-based Approachfor Accelerating Convolutional Neural NetworksA CGRA-based Approachfor Accelerating Convolutional Neural Networks
A CGRA-based Approach for Accelerating Convolutional Neural NetworksShinya Takamaeda-Y
 
Data-Centric Parallel Programming
Data-Centric Parallel ProgrammingData-Centric Parallel Programming
Data-Centric Parallel Programminginside-BigData.com
 
04 New opportunities in photon science with high-speed X-ray imaging detecto...
04 New opportunities in photon science with high-speed X-ray imaging  detecto...04 New opportunities in photon science with high-speed X-ray imaging  detecto...
04 New opportunities in photon science with high-speed X-ray imaging detecto...RCCSRENKEI
 
DATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe ConferenceDATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe ConferenceLEGATO project
 
A Library for Emerging High-Performance Computing Clusters
A Library for Emerging High-Performance Computing ClustersA Library for Emerging High-Performance Computing Clusters
A Library for Emerging High-Performance Computing ClustersIntel® Software
 
Barcelona Supercomputing Center, Generador de Riqueza
Barcelona Supercomputing Center, Generador de RiquezaBarcelona Supercomputing Center, Generador de Riqueza
Barcelona Supercomputing Center, Generador de RiquezaFacultad de Informática UCM
 
HTCC poster for CERN Openlab opendays 2015
HTCC poster for CERN Openlab opendays 2015HTCC poster for CERN Openlab opendays 2015
HTCC poster for CERN Openlab opendays 2015Karel Ha
 
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsUnderstand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsIntel® Software
 
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...Intel® Software
 
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterBKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterLinaro
 
How to Design Scalable HPC, Deep Learning, and Cloud Middleware for Exascale ...
How to Design Scalable HPC, Deep Learning, and Cloud Middleware for Exascale ...How to Design Scalable HPC, Deep Learning, and Cloud Middleware for Exascale ...
How to Design Scalable HPC, Deep Learning, and Cloud Middleware for Exascale ...inside-BigData.com
 
Ai Forum at Computex 2017 - Keynote Slides by Jensen Huang
Ai Forum at Computex 2017 - Keynote Slides by Jensen HuangAi Forum at Computex 2017 - Keynote Slides by Jensen Huang
Ai Forum at Computex 2017 - Keynote Slides by Jensen HuangNVIDIA Taiwan
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningCastLabKAIST
 
HPC Accelerating Combustion Engine Design
HPC Accelerating Combustion Engine DesignHPC Accelerating Combustion Engine Design
HPC Accelerating Combustion Engine Designinside-BigData.com
 

What's hot (20)

Manycores for the Masses
Manycores for the MassesManycores for the Masses
Manycores for the Masses
 
01 From K to Fugaku
01 From K to Fugaku01 From K to Fugaku
01 From K to Fugaku
 
OpenPOWER System Marconi100
OpenPOWER System Marconi100OpenPOWER System Marconi100
OpenPOWER System Marconi100
 
A CGRA-based Approach for Accelerating Convolutional Neural Networks
A CGRA-based Approachfor Accelerating Convolutional Neural NetworksA CGRA-based Approachfor Accelerating Convolutional Neural Networks
A CGRA-based Approach for Accelerating Convolutional Neural Networks
 
Data-Centric Parallel Programming
Data-Centric Parallel ProgrammingData-Centric Parallel Programming
Data-Centric Parallel Programming
 
04 New opportunities in photon science with high-speed X-ray imaging detecto...
04 New opportunities in photon science with high-speed X-ray imaging  detecto...04 New opportunities in photon science with high-speed X-ray imaging  detecto...
04 New opportunities in photon science with high-speed X-ray imaging detecto...
 
DATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe ConferenceDATE 2020: Design, Automation and Test in Europe Conference
DATE 2020: Design, Automation and Test in Europe Conference
 
A Library for Emerging High-Performance Computing Clusters
A Library for Emerging High-Performance Computing ClustersA Library for Emerging High-Performance Computing Clusters
A Library for Emerging High-Performance Computing Clusters
 
Barcelona Supercomputing Center, Generador de Riqueza
Barcelona Supercomputing Center, Generador de RiquezaBarcelona Supercomputing Center, Generador de Riqueza
Barcelona Supercomputing Center, Generador de Riqueza
 
HTCC poster for CERN Openlab opendays 2015
HTCC poster for CERN Openlab opendays 2015HTCC poster for CERN Openlab opendays 2015
HTCC poster for CERN Openlab opendays 2015
 
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsUnderstand and Harness the Capabilities of Intel® Xeon Phi™ Processors
Understand and Harness the Capabilities of Intel® Xeon Phi™ Processors
 
Current Trends in HPC
Current Trends in HPCCurrent Trends in HPC
Current Trends in HPC
 
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
 
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to ClusterBKK16-404B Data Analytics and Machine Learning- from Node to Cluster
BKK16-404B Data Analytics and Machine Learning- from Node to Cluster
 
How to Design Scalable HPC, Deep Learning, and Cloud Middleware for Exascale ...
How to Design Scalable HPC, Deep Learning, and Cloud Middleware for Exascale ...How to Design Scalable HPC, Deep Learning, and Cloud Middleware for Exascale ...
How to Design Scalable HPC, Deep Learning, and Cloud Middleware for Exascale ...
 
Ai Forum at Computex 2017 - Keynote Slides by Jensen Huang
Ai Forum at Computex 2017 - Keynote Slides by Jensen HuangAi Forum at Computex 2017 - Keynote Slides by Jensen Huang
Ai Forum at Computex 2017 - Keynote Slides by Jensen Huang
 
Jg3515961599
Jg3515961599Jg3515961599
Jg3515961599
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
 
2. Cnnecst-Why the use of FPGA?
2. Cnnecst-Why the use of FPGA? 2. Cnnecst-Why the use of FPGA?
2. Cnnecst-Why the use of FPGA?
 
HPC Accelerating Combustion Engine Design
HPC Accelerating Combustion Engine DesignHPC Accelerating Combustion Engine Design
HPC Accelerating Combustion Engine Design
 

Similar to AI is Impacting HPC Everywhere

State-Of-The Art Machine Learning Algorithms and How They Are Affected By Nea...
State-Of-The Art Machine Learning Algorithms and How They Are Affected By Nea...State-Of-The Art Machine Learning Algorithms and How They Are Affected By Nea...
State-Of-The Art Machine Learning Algorithms and How They Are Affected By Nea...inside-BigData.com
 
A historical introduction to deep learning: hardware, data, and tricks
A historical introduction to deep learning: hardware, data, and tricksA historical introduction to deep learning: hardware, data, and tricks
A historical introduction to deep learning: hardware, data, and tricksBalázs Kégl
 
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote)
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote) FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote)
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote) Wim Vanderbauwhede
 
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...BigDataEverywhere
 
A Platform for Accelerating Machine Learning Applications
 A Platform for Accelerating Machine Learning Applications A Platform for Accelerating Machine Learning Applications
A Platform for Accelerating Machine Learning ApplicationsNVIDIA Taiwan
 
TensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsTensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsSeldon
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksBICA Labs
 
Convolutional neural networks for speech controlled prosthetic hands
Convolutional neural networks for speech controlled prosthetic handsConvolutional neural networks for speech controlled prosthetic hands
Convolutional neural networks for speech controlled prosthetic handsMohsen Jafarzadeh
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraLarry Smarr
 
Real-Time Streaming Data Analysis with HTM
Real-Time Streaming Data Analysis with HTMReal-Time Streaming Data Analysis with HTM
Real-Time Streaming Data Analysis with HTMNumenta
 
Superhuman Cyberinfrastructure - Crossing the Rubicon
Superhuman Cyberinfrastructure - Crossing the RubiconSuperhuman Cyberinfrastructure - Crossing the Rubicon
Superhuman Cyberinfrastructure - Crossing the RubiconLarry Smarr
 
Classification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different FacetsClassification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different FacetsGeoffrey Fox
 
Intro to Neural Networks
Intro to Neural NetworksIntro to Neural Networks
Intro to Neural NetworksDean Wyatte
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceLukas Masuch
 
Big Sky Earth 2018 Introduction to machine learning
Big Sky Earth 2018 Introduction to machine learningBig Sky Earth 2018 Introduction to machine learning
Big Sky Earth 2018 Introduction to machine learningJulien TREGUER
 
Koss 1605 machine_learning_mariocho_t10
Koss 1605 machine_learning_mariocho_t10Koss 1605 machine_learning_mariocho_t10
Koss 1605 machine_learning_mariocho_t10Mario Cho
 
SF Big Analytics20170706: What the brain tells us about the future of streami...
SF Big Analytics20170706: What the brain tells us about the future of streami...SF Big Analytics20170706: What the brain tells us about the future of streami...
SF Big Analytics20170706: What the brain tells us about the future of streami...Chester Chen
 
Alternative Computing
Alternative ComputingAlternative Computing
Alternative ComputingShayshab Azad
 
[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...
[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...
[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...Rakuten Group, Inc.
 

Similar to AI is Impacting HPC Everywhere (20)

State-Of-The Art Machine Learning Algorithms and How They Are Affected By Nea...
State-Of-The Art Machine Learning Algorithms and How They Are Affected By Nea...State-Of-The Art Machine Learning Algorithms and How They Are Affected By Nea...
State-Of-The Art Machine Learning Algorithms and How They Are Affected By Nea...
 
A historical introduction to deep learning: hardware, data, and tricks
A historical introduction to deep learning: hardware, data, and tricksA historical introduction to deep learning: hardware, data, and tricks
A historical introduction to deep learning: hardware, data, and tricks
 
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote)
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote) FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote)
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote)
 
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
 
A Platform for Accelerating Machine Learning Applications
 A Platform for Accelerating Machine Learning Applications A Platform for Accelerating Machine Learning Applications
A Platform for Accelerating Machine Learning Applications
 
TensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative modelsTensorFlow London: Cutting edge generative models
TensorFlow London: Cutting edge generative models
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural Networks
 
Convolutional neural networks for speech controlled prosthetic hands
Convolutional neural networks for speech controlled prosthetic handsConvolutional neural networks for speech controlled prosthetic hands
Convolutional neural networks for speech controlled prosthetic hands
 
Science and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated EraScience and Cyberinfrastructure in the Data-Dominated Era
Science and Cyberinfrastructure in the Data-Dominated Era
 
Real-Time Streaming Data Analysis with HTM
Real-Time Streaming Data Analysis with HTMReal-Time Streaming Data Analysis with HTM
Real-Time Streaming Data Analysis with HTM
 
Superhuman Cyberinfrastructure - Crossing the Rubicon
Superhuman Cyberinfrastructure - Crossing the RubiconSuperhuman Cyberinfrastructure - Crossing the Rubicon
Superhuman Cyberinfrastructure - Crossing the Rubicon
 
Classification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different FacetsClassification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different Facets
 
Intro to Neural Networks
Intro to Neural NetworksIntro to Neural Networks
Intro to Neural Networks
 
Chug dl presentation
Chug dl presentationChug dl presentation
Chug dl presentation
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial Intelligence
 
Big Sky Earth 2018 Introduction to machine learning
Big Sky Earth 2018 Introduction to machine learningBig Sky Earth 2018 Introduction to machine learning
Big Sky Earth 2018 Introduction to machine learning
 
Koss 1605 machine_learning_mariocho_t10
Koss 1605 machine_learning_mariocho_t10Koss 1605 machine_learning_mariocho_t10
Koss 1605 machine_learning_mariocho_t10
 
SF Big Analytics20170706: What the brain tells us about the future of streami...
SF Big Analytics20170706: What the brain tells us about the future of streami...SF Big Analytics20170706: What the brain tells us about the future of streami...
SF Big Analytics20170706: What the brain tells us about the future of streami...
 
Alternative Computing
Alternative ComputingAlternative Computing
Alternative Computing
 
[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...
[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...
[RakutenTechConf2013] [A-3] TSUBAME2.5 to 3.0 and Convergence with Extreme Bi...
 

More from inside-BigData.com

Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networksinside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Updateinside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODinside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Erainside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Clusterinside-BigData.com
 

More from inside-BigData.com (20)

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networks
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Update
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 

Recently uploaded (20)

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

AI is Impacting HPC Everywhere

  • 1. The Impact of AI on HPC … “Everywhere” Rob Farber CEO TechEnablement.com Contact info AT techenablement.com for consulting, teaching, writing, and other inquiries
  • 2. GPUs - From games to supercomputers to Deep Learning. CPUs – General purpose to DL devices GPUs evolved from pushing pixels CPUs evolved from running applications • Failure of Dennard’s scaling laws caused switch to multicore • Parallelism via GPUs and Intel Xeon Phi is now the norm • CPUs are now competitive! ZZZ CUDA OpenCL OpenCL 100 million+ GPUs <mumble> Larrabee zZz 1/2 Billion+ GPUs Intel Xeon Scalable KNL, KNM, …
  • 3. CPUs are also good for AI • Don’t get locked into thinking “have to have a GPU”
  • 4. Jump on the AI/Machine learning bandwagon 1940 1950 1960 1970 1980 1990 2000 2010 2020 Electronic Brain (1943) Perceptron (1957) XOR (1969) XOR Multilayer Perceptron Backprop (1986)
  • 5. “Computational Universality” An XOR ANN • The example of XOR nicely emphasizes the importance of hidden neurons: • They re-represent the input such that the problem becomes linearly separable. • Networks with hidden units can implement any Boolean function -> Computational Universal devices! • Networks without hidden units cannot learn XOR • Cannot represent large classes of problems G(x) NetTalk Sejnowski, T. J. and Rosenberg, C. R. (1986) NETtalk: a parallel network that learns to read aloud, Cognitive Science, 14, 179-211 http://en.wikipedia.org/wiki/NETtalk_(artificial_neural_network) 500 learning loops Finished "Applications of Neural Net and Other Machine Learning Algorithms to DNA Sequence Analysis", (1989).
  • 6. Big hardware idea 1: SIMD High-performance from the past • Space and power efficient • Long life via a simple model 6 16,384 GPU results 2014 courtesy ORNL Farber: general SIMD mapping : “Most efficient implementation to date” (Singer 1990), (Thearling 1995) The Connection Machine Turing TU102 (Training) $30M $1-3K 2013 Results courtesy Intel and NVIDIA 1 TF/s 13 PF/s
  • 7. A general SIMD mapping: Optimize(LMS_Error = objFunc(p1, p2, … pn)) 7 Examples 0, N-1 Examples N, 2N-1 Examples 2N, 3N-1 Examples 3N, 4N-1 Step 2 Calculate partials Step1 Broadcast parameters Optimization Method (Powell, Conjugate Gradient, Other) Step 3 Sum partials to get energy GPU 1 GPU 2 GPU 3 p1,p2, … pn p1,p2, … pn p1,p2, … pn p1,p2, … pn GPU 4 Host
  • 8. See a path to exascale (MPI can map to thousands of GPU or Processor nodes) 8 Always report “Honest Flops”
  • 9. Application to Bioinformatics Internal connections The phoneme to be pronounced NetTalk Sejnowski, T. J. and Rosenberg, C. R. (1986) NETtalk: a parallel network that learns to read aloud, Cognitive Science, 14, 179-211 http://en.wikipedia.org/wiki/NETtalk_(artificial_neural _network) Internal connections t te X A TC G T "Applications of Neural Net and Other Machine Learning Algorithms to DNA Sequence Analysis", A.S. Lapedes, C. Barnes, C. Burks, R.M. Farber, K. Sirotkin, Computers and DNA, SFI Studies in the Sciences of Complexity, vol. VII, Eds. G. Bell and T. Marr, Addison-Wesley, (1989). T|F Exon region
  • 10. Predicting binding affinity (The closer you look the greater the complexity) Electron Microscope
  • 11. The question for computational biology How do we know you are not playing expensive computer games with our money?
  • 12. Utilize a blind test (Example reference when starting my drug discovery company) Internal connections A0 Binding affinity for a specific antibody A1 A2 A3 A4 A5 Possible hexamers 206 = 64M 1k – 2k pseudo-random (hexamer, binding) affinity pairs Approx. 0.001% sampling “Learning Affinity Landscapes: Prediction of Novel Peptides”, Alan Lapedes and Robert Farber, Los Alamos National Laboratory Technical Report LA-UR-94-4391 (1994).
  • 13. Hill climbing to find high affinity Internal connections A0 𝐴𝑓𝑓𝑖𝑛𝑖𝑡𝑦 𝐴𝑛𝑡𝑖𝑏𝑜𝑑𝑦 A1 A2 A3 A4 A5 Learn: 𝐴𝑓𝑓𝑖𝑛𝑖𝑡𝑦 𝐴𝑛𝑡𝑖𝑏𝑜𝑑𝑦 = 𝑓 𝐴0, … , 𝐴5 𝑓(F,F,F,F,F,F) 𝑓(F,F,F,F,F,L) 𝑓(F,F,F,F,F,V) 𝑓(F,F,F,F,L,L) 𝑓(P,C,T,N,S,L) Predict P,C,T,N,S,L has the highest binding affinity Confirm experimentally
  • 14. Time series Iterate 𝑋𝑡+1 = 𝑓 𝑋𝑡, 𝑋𝑡−1, 𝑋𝑡−2, … 𝑋𝑡+2 = 𝑓 𝑋𝑡+1, 𝑋𝑡, 𝑋𝑡−1, … 𝑋𝑡+3 = 𝑓 𝑋𝑡+2, 𝑋𝑡+1, 𝑋𝑡, … 𝑋𝑡+4 = 𝑓 𝑋𝑡+3, 𝑋𝑡+2, 𝑋𝑡+1, … Internal connections Xt Xt-1 Learn: 𝑋𝑡+1 = 𝑓 𝑋𝑡, 𝑋𝑡−1, 𝑋𝑡−2, … Xt-2 Xt-3 Xt-4 Xt-5 Xt+1 Works great! (better than other methods at that time) "How Neural Nets Work", A.S. Lapedes, R.M. Farber, reprinted in Evolution, Learning, Cognition, and Advanced Architectures, World Scientific Publishing. Co., (1987).
  • 15. Designing ANNs for Integration and Bifurcation analysis – “training a netlet” "Identification of Continuous-Time Dynamical Systems: Neural Network Based Algorithms and Parallel Implementation", R. M. Farber, A. S. Lapedes, R. Rico-Martinez and I. G. Kevrekidis, Proceedings of the 6th SIAM Conference on Parallel Processing for Scientific Computing, Norfolk, Virginia, March 1993. ANN schematic for continuous-time identification. (a) A four-layered ANN based on a fourth order Runge-Kutta integrator. (b) ANN embedded in a simple implicit integrator. (a) Periodic attractor of the Van der Pol oscillator for g= 1.0, d= 4.0 and w = 1.0. The unstable steady state in the interior of the curve is marked +. (b) ANN-based predictions for the attractors of the Van der Pol oscillator shown in (a).
  • 16. Dimension reduction • The curse of dimensionality • People cannot visualize data beyond 3D + color • Search volume rapidly increases with dimension • Queries return too much data or no data I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B I I I I I B Sensor 1 Sensor 2 Sensor 3 Sensor N Sensor 1 Sensor 2 Sensor 3 Sensor N Sensor 1 Sensor 2 Sensor 3 Sensor N Sensor 1 Sensor 2 Sensor 3 Sensor N Sensor 1 Sensor 2 Sensor 3 Sensor N X Y Z
  • 17. Jump on the AI/Machine learning bandwagon 1940 1950 1960 1970 1980 1990 2000 2010 2020 Electronic Brain (1943) Perceptron (1957) XOR (1969) XOR Multilayer Perceptron Backprop (1986) (2007) Deep Neural Network (2006) AGAIN! Wildly Overpromised
  • 18. All the previous work is now orders of magnitude faster! • Time series • Bifurcation analysis • Locally Weighted Linear Regression (LWLR) • Neural Networks • Naive Bayes (NB) • Gaussian Discriminative Analysis (GDA) • k-means • Logistic Regression (LR) • Independent Component Analysis (ICA) • Expectation Maximization (EM) • Support Vector Machine (SVM) • Others: (MDS, Ordinal MDS, etcetera) People are “rediscovering” ANNs in an Exascale framework
  • 19. This is not an AI history talk – Deep learning is commercially viable encompassing much of what we do Speech recognition in noisy environments (Siri, Cortana, Google, Baidu, …) Better than human face recognition Self-driving cars • Internet Search • Robotics • Self guiding drones • Much, much, more Plus everything discussed previously: time series, system modeling, drug discover, and more
  • 20. A change in the scientific mindset: ANNs in physical models • Previously: ANNs are black boxes – how can you guarantee they won’t add non-physical artifacts to my model? • Today: CERN has shown that GANs (Generative Adversarial Networks) can learn physics based distributions that cannot be distinguished from real distributions Longitudinal shower shape for 400 GeV electron Longitudinal shower shape for 100 GeV electron It does not matter which generator one uses so long as the generated distributions are indistinguishable from the real data distributions – Sophia Vallecorsa, CERN physist Openlab Time to generate an electron shower: Full Monte Carlo 17000.00 msec GAN running on GTX 1080 00000.04 msec
  • 21. ANNs evaluate simulation results • Deep learning ANNs recognize storm types in faster-than-realtime climate models. • Too much data for a human • Validation via historical data • Evaluate storm distributions in the future predictions
  • 22. Expect significant algorithm and HW retooling Today: General-purpose devices like CPUs and GPUs Tomorrow: FPGA and ASIC Data Flow devices Near term: Neuromorphic will change everything Far term: Quantum: Seems to be ideal for training (once we have the qubits CPU GPU FPGA Chips Roughly four different camps
  • 23. Ubiquitous Low precision AI hardware is changing HPC • NLAFET: Reduced precision BLAS • Synchronization-reducing algorithms • Communication-reducing algorithms • Mixed precision methods (2x speep of ops and 2x speed for data movement) • Autotuning • Fault resilient algorithms • Reproducibility of results • LIBXSMM: batched reduced precision small matrix operations => very high percentage of peak! Scientists reexamining precision • The HiCMA library: million by million matrix operations in gigabytes of workstation memory
  • 24. DAGs are also changing HPC • NLAFET uses DAG scheduling to support CPUs, GPUs, and Data Flow architectures
  • 25. From HPC centers to IoT edge devices HPC centers will soon be part of a data network, not the center of the HPC effort. • Read the BDEC report (Big Data and Extreme-scale Computing) • http://www.exascale.org/bdec/sites/www.exascale.org.bdec/files/whitepapers/bdec_pathways.pdf. • Look at Frontera at TACC – an “agile” HPC supercomputer! • Largest academic supercomputer at a University • Leverages the older Wrangler effort data analytics system • The Argonne lab-wide data service based on the Petrel service. • More …
  • 26. Not bad for surface fitting – but what’s next? • Lapedes and Farber, “How Neural Networks Work,” showed that during training the neural network is actually fitting a ”bumpy” multi- dimensional surface. • Inference: – Interpolates between points on the surface, or – Extrapolates out from points on the surface. Birthday card: “Herpes Bathtub!” Inside the card: “Happy Birthday! (Damn voice recognition)”
  • 27. Neuromorphic – the next step! • “[We] demonstrate that neuromorphic computing … can implement deep convolution networks that approach state-of-the-art classification accuracy across eight standard datasets encompassing vision and speech, perform inference while preserving the hardware’s underlying energy-efficiency and high throughput.” Accuracy of different sized networks running on one or more TrueNorth chips to perform inference on eight datasets. For comparison, accuracy of state-of-the-art unconstrained approaches are shown as bold horizontal lines (hardware resources used for these networks are not indicated).
  • 28. Build big and very, very power efficient networks (e.g. “brains”) A system roughly the neuronal size of a rat brain PNAS (8/9/16) Convolutional networks for fast, energy-efficient neuromorphic computing, Steven K. Essera et. al. • Use integrate-and-fire spiking neurons • Data sets run between 1,200 and 2,600 frames/s and using between 25 and 275 mW • (effectively >6,000 frames/s per watt)
  • 29. DATA It’s easy “From “Hello World” to exascale” double objFunc( ... ) { double err=0.; #pragma acc parallel loop reduction(+:err) #pragma omp parallel for reduction(+ : err) { err = 0.; for(int i=0; i < nExamples; i++) { // transform float d=myFunc(i, param, example, nExamples, NULL); //reduce err += d*d; } } return sqrt(err); } int main() { cout << "Hello World" << endl; // load data and initialize parameters init(); #pragma acc data copyin(param[0:N_PARAM-1]) pcopyin(example[0:nExamples*EXAMPLE_SIZE-1]) { optimize( objFunc ); // the optimizer calls the objective function } return 0; }
  • 30. Usage Taxonomy Organizing HPC + AI Convergence Operation HPC + AI couple simulation with live data in real time detection/control system Experimental/simulated data is used to train a NN, where resulting inference enging is used to for real-time detection/control of an experiment or clinical delivery system The NN is improved as new simulated / live data is acquired Experimental/simulated data is used to train a NN that is used to replace all or significant runtime portions of a conventional simulation The NN is improved continuously as new simulated / live data is acquired Experimental/simulated data used to train a NN which steers simulation/experiment btwn runs The steering NN can be trained continuously as new simulated / live data is acquired Augmentation HPC + AI combined to improve simulation time to science > orders of magnitude Modulation HPC + AI combined to reduce the number of runs needed for a parameter sweep Potential for Breakthroughs in Scientific Insight
  • 31. So much more, you have been great, Thank You! Rob Farber CEO TechEnablement.com Contact info AT techenablement.com for consulting, teaching, writing, and other inquiries
  • 32. Numerous use cases I don’t have time to present
  • 33. Background The aLIGO (Advanced Laser Interferometer Gravitational Wave Observatory) experiment successfully discovered signals proving Einstein’s theory of General Relativity and the existence of cosmic Gravitational Waves. While this discovery was by itself extraordinary it is a step to the ultimate goal to combine multiple observational data sources that not only hear but also see to the complete spectrum of data in real time. Challenge The initial a LIGO discoveries were successfully completed using classic data analytics. The processing pipeline used hundreds of CPU’s where the bulk of the detection processing was done offline. The latency is far outside the range needed to activate resources, such as optical, infrared or radio telescopes which observe phenomena in the electromagnetic spectrum in time to “see” what aLIGO can “hear”. Solution A DNN was developed and trained with simulated data and verified using from the CACTUS/Einstein Toolkit. The DNN was shown to produce better accuracy with latencies 4500x faster than the original CPU based pattern matching waveform detection. Impact Faster and more accurate detection of gravitational waves with the potential to steer other observational data sources. “HEARING” GRAVITY IN REAL TIME
  • 34. Background The “Grand Challenge” of fusion energy would offer the humankind changing opportunity to provide clean, safe energy for millions of years. ITER is a $25B international experiment to develop the prototype to demonstrate commercially viable fusion reactor. Challenge The plasma in a fusion reactor is highly turbulent at the edges of the flow, and disruptions can occur that break the magnetic confinement, which can cause damage to the physical reactor It is critical to predict when a disruption will occur to prevent damage and maintain safe operation. Traditional simulation and ML approaches were 65% to 85% accurate with 5% false alarm rate Solution DL network called FRNN using Theano exceeds today's best accuracy results. It scales to 200 Tesla K20s, and with more GPUs, can deliver higher accuracy. Current level of accuracy is 95% prediction with 5% false alarm rate. Impact Vision is to operate ITER with FRNN, operating and steering experiments in real-time to minimize damage and down-time. CONVERGED HPC SPEEDING THE PATH TO FUSION ENERGY
  • 35. UNDERSTANDING THE STANDARD MODEL The simulation for the LHC particle collider is known as GEANT and it is numerically intensive, requires significant compute cycles as part of the scientific workflow and is extremely hard to optimize for modern CPU or GPU type architectures. The High Luminosity LHC is expected to generate 10X the volume of data, and the compute load on GEANT is in turn expected to challenge future computing systems within a flat budget profile. A GAN (caloGAN) was constructed by a team at Lawrence Berkeley Lab, CERN and Yale that was customized for Calorimiter Experiments similar to ATLAS in the LHC. A training data set was developed using 100,000 GEANT events as input. 50 epochs were then used on 18xK80’s to train the GAN with KERAS and TensorFlow. Impact The CaloGAN is more accurate and showed up to 5 Orders of Magnitude performance speed up relative to the original simulation on a single K80 GPU
  • 36. Background It takes 14 years and $2.5 Billon to develop 1 drug Higher than 99.5% failure rate after the drug discovery phase Challenge QC simulation is computationally expensive - it takes 5 years to compute on CPUs So researchers use approximations, compromising on accuracy. To screen 10M drug candidates,. Solution Researchers at the University of Florida and the University of North Carolina leveraged GPU deep learning to develop a custom framework ANAKIN-ME, to reproduce molecular energy surfaces with super speed (microseconds versus several minutes), extremely high (DFT) accuracy, and at up to 6 orders of magnitude improvement in speed. Impact Speed and accuracy could start a revolution in computational chemistry — and forever change the way we discover the medicines of the future CONVERGED HPC REVOLUTIONIZING DRUG DISCOVERY
  • 37. Converged HPC Accelerates Drug Discovery The “drug discovery” phase of the development process involves exploring all the different possible combinations of protein molecules (targets) and drug chemical compounds to ensure the drug will do what it’s designed to do. Classic Molecular Dynamics simulations are very time-consuming and expensive. Machine Learning models have been designed to help predict probability of the target molecules interacting with the drug chemical compounds, but still require significantly more performance to deliver improved accuracy. Researchers developed and trained a convolutional neural network accelerated with NVIDIA GPU’s to improve the model performance and prediction accuracy. Ultimately, they improved prediction accuracy from approximately 52% to 70% compared to other machine learning-based models (Vina Docking). (35% relative improvement)
  • 38. HUNTING “GHOST PARTICLES” WITH DEEP LEARNING Background The NoVA experiment managed by Fermi lab comprises 200 scientists at 40 institutions in 7 countries. The goal is to track neutrino’s, which are often referred to as the “Ghost Particle”, and detect oscillation which is used to better understand how this super abundant, and elusive particle interacts with matter. Challenge The experiment is built underground and is comprised of a main injector beam and two large detector apparatus located 50 miles apart. The near detector is 215 Tons and the Far detector is 15,000 Tons. The experiment can be thought of as a 30 Mn pound detector that takes 2 Mn pictures per second. The detectability of the current experiment is proportional to the size of the detectors, so increasing the “visibility” is complex and costly. Solution A DNN was developed and trained using a data set derived from multiple HPC simulations including GENIE and GEANT using 2 K40 GPU’s. the CVN was basd on convolutional neural networks used for image processing Impact The result was an overall improvement of 33% in detector mass, where the optimized CVN signal-detection-optimized efficiency of 49% is a significant gain over the efficiency of 35% quoted in prior art. This would yield a net effect of a10Mn pound increase the physical detector mass
  • 39. AN AI MONITOR OF EARTH’S VITALS The Earth’s climate has changed throughout history, but in recent years there have been record increases in temperature, glacial retreat and rising sea levels. NASA Ames is using satellite imagery to measure the effects of carbon and greenhouse gas emissions on the planet. To do so, they developed DeepSat — a deep learning framework for satellite image classification trained on a GPU-powered supercomputer. The enhanced satellite imagery will help scientists plan to protect ecosystems and farmers improve crop production.
  • 40. AI Sheds Light on Key Mysteries of the Universe Gravitational Lensing generates an image of a far away galaxy that is distorted into rings and arcs by the gravity of a massive object, such as galaxy cluster. The distortions provide important clues about how mass is distributed in space and how that distribution changes over time, to measure properties of dark matter, size of galaxies, and the expansion of the universe. Today there are roughly 200 known gravitational lenses. However, scientist expect to uncover 200,000 gravitational lenses in the next decade. Using traditional approaches, it can take anywhere from 2 days to 3 months to analyze a single lens image. SLAC scientists have for the first time used artificial neural networks to analyze gravitational lenses in just 10ms. This will allow researchers to unlock many of the unsolved mysteries of the universe.
  • 41. AI Accelerates the Production of Ultra Cold Gases Bose-Einstein Condensate (BEC) is a state of matter formed by cooling a gas to near-zero absolute temperature. BEC is achieved by controlling the intensity of the lasers to trap only the ultra-cold atoms and allowing other atoms to escape. BECs are super sensitive to external disturbances. This makes them suitable for very precise measurements of things like tiny changes in Earth’s magnetic field or gravity. Researchers at the University of North South Whales used AI to create a BEC gas 14 times faster than conventional methods.
  • 42. Background Unexpected fog can cause an airport to cancel or delay flights, sometimes having global effects in flight planning. Challenge While the weather forecasting model at MeteoSwiss work at a 2km x 2km resolution, runways at Zurich airport is less than 2km. So human forecasters sift through huge simulated data with 40 parameters, like wind, pressure, temperature, to predict visibility at the airport. Solution MeteoSwiss is investigating the use of deep learning to forecast type of fog and visibility at sub-km scale at Zurich airport. Impact Forecasting Fog at Zurich Airport
  • 43. Multiple Examples of AI for earthquake prediction are underway Earthquake Prediction Shaazam for Earthquakes