Shared-Memory Heterogeneous Computing
for HPC and AI
H. Peter Hofstee, Ph.D.
IBM ( & TU Delft )
IIIT-B May 3 2019
© 2006 IBM Corporation
Cell Architecture is … 64b Power Architecture™
[Diagram: Power ISA cores, each with MMU/BIU, attached to a COHERENT BUS together with IO transl. and Memory]
Incl. coherence/memory
compatible with 32/64b Power Arch. Applications and OS’s
Cell Architecture is … 64b Power Architecture™ Plus Memory Flow Control (MFC)
[Diagram: Power ISA +RMT cores, each with MMU/BIU +RMT, attached to a COHERENT BUS together with IO transl. and Memory; MFC units, each with MMU/DMA +RMT and Local Store Memory, on the same bus; LS Alias regions]
Cell Architecture is … 64b Power Architecture™ + MFC Plus Synergistic Processors
[Diagram: Power ISA +RMT cores, each with MMU/BIU +RMT, attached to a COHERENT BUS together with IO transl. and Memory; Syn. Proc. ISA units, each with MMU/DMA +RMT and Local Store Memory, on the same bus; LS Alias regions]
Cell BE based Systems: SCEI, IBM, Mercury, …
Advanced Cell Blade Prototype
QS20
Advanced Cell-BE Based Blade Prototype
Next Step: OpenMP 4.0 and FPGAs
Fundamental forces are accelerating change in our industry
Full system stack innovation required
IT innovation can no longer come from just the processor
IT consumption models are expanding: Cognitive, Custom Hyperscale Data Centers, Hybrid Cloud, Open Solutions
[Chart: Price/Performance (lower is better), 2000 to 2020. Moore’s Law: Technology and Processors. Full Stack Acceleration: Firmware / OS, Accelerators, Software, Storage, Network.]
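POWER processor roadmap

POWER7 Architecture
• 2010, POWER7: 8 cores, 45nm. New Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 65 GB/s. Standard I/O Interconnect: PCIe Gen2. Advanced I/O Signaling: N/A. Advanced I/O Architecture: N/A.
• 2012, POWER7+: 8 cores, 32nm. Enhanced Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 65 GB/s. Standard I/O Interconnect: PCIe Gen2. Advanced I/O Signaling: N/A. Advanced I/O Architecture: N/A.

POWER8 Architecture
• 2014, POWER8: 12 cores, 22nm. New Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen3. Advanced I/O Signaling: N/A. Advanced I/O Architecture: CAPI 1.0.
• 2016, POWER8 w/ NVLink: 12 cores, 22nm. Enhanced Micro-Architecture With NVLink. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen3. Advanced I/O Signaling: 20 GT/s, 160GB/s. Advanced I/O Architecture: CAPI 1.0, NVLink 1.0.

POWER9 Architecture
• 2017, P9 SO: 24 cores, 14nm. New Micro-Architecture, Direct attach memory, New Process Technology. Sustained Memory Bandwidth: Up To 150 GB/s. Standard I/O Interconnect: PCIe Gen4 x48, 192GB/s. Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI3.0, NVLink2.0.
• 2018, P9 SU: 24 cores, 14nm. Enhanced Micro-Architecture, Buffered Memory. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen4 x48. Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI3.0, NVLink2.0.
• 2019, P9 w/ Adv. I/O: 24 cores, 14nm. Enhanced Micro-Architecture, New Memory Subsystem. Sustained Memory Bandwidth: Up To 350 GB/s. Standard I/O Interconnect: PCIe Gen4 x48. Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI4.0, NVLink3.0.

POWER10
• 2020+, P10: TBD cores. New Micro-Architecture, New Technology. Sustained Memory Bandwidth: Up To 435 GB/s. Standard I/O Interconnect: PCIe Gen5. Advanced I/O Signaling: 32 & 50 GT/s. Advanced I/O Architecture: TBD.

Statement of Direction, Subject to Change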
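The headline I/O figures in the roadmap can be sanity-checked with simple lane arithmetic. The sketch below assumes one bit per transfer per lane, ignores line-encoding and protocol overhead, and counts both directions. The lane counts used for the 25 GT/s and 20 GT/s links (48 and 32) are assumptions for illustration and are not stated on the slide.

```python
def bidirectional_gbytes_per_s(lanes: int, gigatransfers_per_s: float) -> float:
    """Raw bidirectional bandwidth in GB/s.

    Assumes 1 bit per transfer per lane and no encoding overhead.
    """
    per_direction_gbits = lanes * gigatransfers_per_s
    return 2 * per_direction_gbits / 8


# PCIe Gen4 signals at 16 GT/s per lane; x48 is the width given on the slide.
pcie_gen4_x48 = bidirectional_gbytes_per_s(48, 16)
# Assumed: 48 lanes of 25 GT/s advanced I/O signaling.
adv_io_25g = bidirectional_gbytes_per_s(48, 25)
# Assumed: 32 lanes of 20 GT/s signaling for the POWER8 w/ NVLink entry.
adv_io_20g = bidirectional_gbytes_per_s(32, 20)

print(pcie_gen4_x48, adv_io_25g, adv_io_20g)
```

Under these assumptions the three results line up with the 192GB/s, 300GB/s, and 160GB/s labels in the roadmap.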
J. Stuecheli, IBM: OpenPOWER Summit Europe 2018
Wistron Power9 MiHawk
Power9 Zaius/Barreleye G2
1Tb/s (10x 100Gb/s) demo!
WISTRON “MiHawk”
24 x NVMe = 96 lanes Gen3 PCIe = 48 lanes Gen4 PCIe = 32 lanes OpenCAPI 3.0
Image Source: Wistron
OpenCAPI !
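The lane equivalence on the MiHawk slide can be checked by comparing aggregate raw signaling rates. This is a rough sketch: it assumes each NVMe drive uses a x4 PCIe Gen3 link, uses the standard 8 GT/s and 16 GT/s per-lane rates for PCIe Gen3 and Gen4, takes 25 GT/s per lane for OpenCAPI 3.0 from the roadmap slide, and ignores encoding overhead.

```python
# Per-lane signaling rates in GT/s (OpenCAPI value taken from the roadmap's 25 GT/s entry).
RATE_GT_S = {"PCIe Gen3": 8, "PCIe Gen4": 16, "OpenCAPI 3.0": 25}


def aggregate_gt_s(lanes: int, link: str) -> int:
    """Total raw signaling rate across all lanes, in GT/s."""
    return lanes * RATE_GT_S[link]


nvme_lanes = 24 * 4  # assumption: every NVMe drive is attached with a x4 link

aggregates = {
    "96 lanes PCIe Gen3": aggregate_gt_s(nvme_lanes, "PCIe Gen3"),
    "48 lanes PCIe Gen4": aggregate_gt_s(48, "PCIe Gen4"),
    "32 lanes OpenCAPI 3.0": aggregate_gt_s(32, "OpenCAPI 3.0"),
}

for name, rate in aggregates.items():
    print(f"{name}: {rate} GT/s")
```

The two PCIe configurations come out identical and the OpenCAPI configuration slightly higher, which is consistent with the slide treating the three as roughly interchangeable.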
[Diagram: two POWER9 sockets, each with 1-2TB DDR4 Memory (>1000Gb/s). Link labels: 400Gb/s (x2), ~400Gb/s (x2), 500Gb/s (x2), 800Gb/s (x2). Option: VU37p (400+GB/s HBM) on each side.]
Source: AlphaData
AlphaData ‘9H7 and ‘9V3 with OpenCAPI !
Source: IBM
https://www.xilinx.com/support/documentation/white_papers/wp485-hbm.pdf
[Diagram 1: Apache Spark (JVM), Application (Python library), and Accelerator (native library), each with its own memory (off-heap), connected through Serialize / Deserialize steps; Network and Disk below.]
[Diagram 2: Apache Spark (JVM), Application (Python tool), and FPGA Accelerator (native library), on top of the Apache Arrow libraries and Fletcher, all using a shared data set / memory in Arrow format; Network and Storage below.]
J. Peltenburg, et al., TU Delft (OpenPOWER Summit USA 2018)
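The contrast between the two diagrams can be illustrated with a small standard-library analogy. This is not the Arrow or Fletcher API: `pickle` stands in for the serialize / deserialize path, and a `memoryview` over an `array` stands in for a consumer reading a shared columnar buffer in place. All function names are hypothetical.

```python
import pickle
from array import array


def sum_via_serialization(column):
    """Producer serializes and consumer deserializes, so the consumer works on a copy."""
    wire_bytes = pickle.dumps(column)
    received = pickle.loads(wire_bytes)
    return sum(received), received is column


def sum_via_shared_buffer(column):
    """Consumer reads the producer's buffer in place through a memoryview (no copy)."""
    view = memoryview(column)
    return sum(view), view.obj is column


column = array("q", range(1_000))  # one column of 64-bit integers
total_copy, copy_is_original = sum_via_serialization(column)
total_view, view_is_original = sum_via_shared_buffer(column)

print(total_copy == total_view, copy_is_original, view_is_original)
```

Both paths compute the same result, but only the second operates on the original buffer, which is the property the shared Arrow-format data set provides across the JVM, Python, and the accelerator.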
[Diagram: regex matcher architecture, drawn in two layers (Layer A, Layer B). On the FPGA: N regex units (Regex unit 0 … Regex unit N-1), each with a Column Reader and R regex matchers (0 … R-1) feeding Accumulators that report the number of matches per regex (R no. matches); a Fletcher Interconnect (N:1) on the bus master side (AXI4); MMIO over AXI4-Lite. AWS EC2 F1 platform: Host, PCIe, DMA, AWS F1 Shell, AXI Interconnect 2:1, DDR Controller, DDR SDRAM (on board). POWER8 CAPI platform: Host Memory, CAPP, PCIe, PSL, SNAP. Legend: general dataflow, off-board components, non-FPGA components.]
R=16 different regular expressions per unit
AWS EC2 F1:
• Virtex Ultrascale+
• N=16 regex units
• 256 regexes being matched in parallel
POWER8 CAPI (Supervessel, & soon at Nimbix):
• AlphaData KU3 (Kintex Ultrascale)
• N=8 regex units
• 128 regexes being matched in parallel
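As a software reference for what the regex units compute, the sketch below counts, for each regular expression, how many strings in a column match it, and reproduces the parallelism figures as units times regexes per unit. The function names are hypothetical, and the use of substring search (rather than whole-string matching) is an assumption; the slides do not say which semantics the hardware implements.

```python
import re

R = 16  # regular expressions per unit, from the slide


def count_matches(column, patterns):
    """Return one match count per pattern over a column of strings."""
    compiled = [re.compile(p) for p in patterns]
    return [sum(1 for s in column if rx.search(s)) for rx in compiled]


def regexes_in_parallel(units, per_unit=R):
    """Total regexes evaluated concurrently across all units."""
    return units * per_unit


column = ["kitten", "cat", "dog", "catalog"]
counts = count_matches(column, ["cat", "dog", "^k"])
f1_parallel = regexes_in_parallel(16)   # AWS EC2 F1, N=16
capi_parallel = regexes_in_parallel(8)  # POWER8 CAPI, N=8

print(counts, f1_parallel, capi_parallel)
```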
[Plot: throughput in GB/s (0 to 3) versus log2(Bytes) (11 to 30), for AWS EC2 F1 (16 units) and POWER8+CAPI (8 units)]
A closer look at
Summit & Sierra
#1 & #2 in HPC
… and > 3 ExaOp AI !
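A back-of-the-envelope check of the "> 3 ExaOp" figure for Summit follows. None of the inputs appear on the slide: the sketch assumes 4,608 nodes with 6 V100 GPUs each and a peak of roughly 125 tera-ops per GPU for mixed-precision tensor operations, so treat the result as an estimate of peak rather than a measured number.

```python
# Assumed inputs (not from the slide): publicly quoted Summit configuration and V100 tensor peak.
NODES = 4608
GPUS_PER_NODE = 6
TERA_OPS_PER_GPU = 125  # approximate mixed-precision tensor peak per V100

total_gpus = NODES * GPUS_PER_NODE
total_tera_ops = total_gpus * TERA_OPS_PER_GPU
total_exa_ops = total_tera_ops / 1_000_000  # 1 exa = 10**6 tera

print(f"{total_gpus} GPUs, about {total_exa_ops:.2f} ExaOps peak")
```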
Weather GRAF video or still image
IBM Global High-Resolution Atmospheric Forecasting System
© 2018 IBM Corporation
5x Faster Data Communication with Unique CPU-GPU NVLink High-Speed Connection
IBM AC922 Power System Deep Learning Server (4-GPU Config)
[Diagram: two Power 9 CPUs, each with 1TB Memory at 170GB/s and two V100 GPUs; NVLink connections labeled 150 GB/s; one additional link labeled 64 GB/s (x2)]
• Store Large Models in System Memory
• Operate on One Layer at a Time
• Fast Transfer via NVLink
Source: NVIDIA
https://developer.ibm.com/linuxonpower/perfcol/perfcol-technical/
Source: L. Grinberg, OpenPOWER Summit Europe
IBM Open Source Based AI Stack
Accelerated AC922 Power9 Servers
Storage (Spectrum Scale ESS)
Watson Studio
SnapML
WML CE
Runtime Environment
Train, Deploy, Manage Models
Watson OpenScale
Model Metrics, Bias, and Fairness Monitoring
Watson Machine Learning
Watson ML CE
Watson ML Accelerator
Data Preparation
Model Development Environment
Auto-AI software: PowerAI Vision, IBM Auto-AI
Previous Names:
WML Accelerator = PowerAI Enterprise
WML Community Ed. = PowerAI-base
Runs on x86 & other storage too
Caffe with LMS (Large Model Support): Runtime of 1000 Iterations
GoogleNet model on Enlarged ImageNet Dataset (2240x2240)
[Bar chart, Time (secs), axis 0 to 12000. Xeon x86 2640v4 w/ 4x V100 GPUs: 3.1 Hours. Power AC922 w/ 4x V100 GPUs: 49 Mins. 3.8x Faster.]
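The quoted speedup follows directly from the two runtimes on the chart. A minimal check, using only the numbers shown:

```python
xeon_seconds = 3.1 * 3600  # 3.1 hours on the Xeon x86 system
ac922_seconds = 49 * 60    # 49 minutes on the Power AC922

speedup = xeon_seconds / ac922_seconds

print(f"{xeon_seconds:.0f} s vs {ac922_seconds} s: {speedup:.1f}x faster")
```

Both runtimes also fit within the chart's 0 to 12000 second axis.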
Source IBM Research, Zurich
3D U-Net on Tensorflow with Large Memory Support
Goal: Predict whether a user will click on a given advert based on an anonymized set of features.
Train: Fit model parameters using 4.2 billion examples.
Inference: Evaluate model on 180 million unseen examples.
Labels: +1 – click, -1 – no click
[Figure: sparse data matrix (2.3TB) of 4.2 billion examples by 1 million features, with a label vector of +1 / -1 entries]
* Criteo Labs. 2015. Criteo Releases Industry's Largest-Ever Dataset for Machine Learning to Academic Community. https://www.criteo.com/news/press-releases/2015/07/criteo-releases-industrys-largest-ever-dataset/
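To make the train / inference split concrete, here is a toy sketch of click prediction on sparse data with +1 / -1 labels. It uses plain logistic regression trained by stochastic gradient descent, chosen here only for illustration; it is not the method or library used for the results in the deck, and the data, feature indices, and hyperparameters are made up.

```python
import math


def margin(weights, example):
    """Dot product of a sparse example {feature_index: value} with the weights."""
    return sum(weights.get(j, 0.0) * v for j, v in example.items())


def train(examples, labels, epochs=20, learning_rate=0.5):
    """Fit logistic-regression weights with SGD on the loss log(1 + exp(-y * w.x))."""
    weights = {}
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            # The gradient step for one example is lr * y * x / (1 + exp(y * w.x)).
            scale = learning_rate * y / (1.0 + math.exp(y * margin(weights, x)))
            for j, v in x.items():
                weights[j] = weights.get(j, 0.0) + scale * v
    return weights


def predict(weights, example):
    """Return +1 (click) or -1 (no click)."""
    return 1 if margin(weights, example) >= 0 else -1


# Toy data: only non-zero features are stored, as in a sparse matrix.
examples = [{0: 1.0, 2: 1.0}, {1: 1.0, 2: 1.0}, {0: 1.0}, {1: 1.0, 3: 1.0}]
labels = [1, -1, 1, -1]

weights = train(examples, labels)
print(predict(weights, {0: 1.0}), predict(weights, {1: 1.0}))
```

Storing each example as a dictionary of its non-zero features mirrors the sparse matrix on the slide: the cost of one update scales with the number of non-zeros, not with the 1 million feature columns.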
Source IBM Research, Zurich
Copyright © 2019 by International Business Machines Corporation. All rights reserved.
No part of this document may be reproduced or transmitted in any form without written permission from IBM Corporation.
Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. This document could include technical inaccuracies or
typographical errors. IBM may make improvements and/or changes in the product(s) and/or program(s) described herein at any time without notice. Any statements regarding IBM's future
direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. References in this document to IBM products, programs, or services do
not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this
document is not intended to state or imply that only that program product may be used. Any functionally equivalent program, that does not infringe IBM's intellectual property rights, may be
used instead.
THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted, if at
all, according to the terms and conditions of the agreements (e.g., IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which
they are provided. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has
not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. IBM makes no
representations or warranties, expressed or implied, regarding non-IBM products and services.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents or copyrights. Inquiries regarding patent or copyright
licenses should be made, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.
IBM, the IBM logo, ibm.com, IBM System Storage, IBM Spectrum Storage, IBM Spectrum Control, IBM Spectrum Protect, IBM Spectrum Archive, IBM Spectrum Virtualize, IBM Spectrum Scale, IBM Spectrum Accelerate, Softlayer, and XIV are
trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.shtml
The following are trademarks or registered trademarks of other companies.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
IT Infrastructure Library is a Registered Trade Mark of AXELOS Limited.
Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other
countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
ITIL is a Registered Trade Mark of AXELOS Limited.
UNIX is a registered trademark of The Open Group in the United States and other countries.
* All other products may be trademarks or registered trademarks of their respective companies.
Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations
such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements
equivalent to the performance ratios stated here.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance
characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business
contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
This presentation and the claims outlined in it were reviewed for compliance with US law. Adaptations of these claims for use in other geographies must be reviewed
by the local country counsel for compliance with local laws.
This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is
subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area.
Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products
should be addressed to the suppliers of those products.
IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send
license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied.
All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual
environmental costs and performance characteristics will vary depending on individual client configurations and conditions.
IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government
clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings
are subject to change, extension or withdrawal without notice.
IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.
All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system
hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee
these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this
document should verify the applicable data for their specific environment.

Contenu connexe

Tendances

Think Leadership March 2019 - How You Can Have A Piece Of The #1 Supercomputer
Think Leadership March 2019 - How You Can Have A Piece Of The #1 SupercomputerThink Leadership March 2019 - How You Can Have A Piece Of The #1 Supercomputer
Think Leadership March 2019 - How You Can Have A Piece Of The #1 SupercomputerAnand Haridass
 
POWER9 AC922 Newell System - HPC & AI
POWER9 AC922 Newell System - HPC & AI POWER9 AC922 Newell System - HPC & AI
POWER9 AC922 Newell System - HPC & AI Anand Haridass
 
HPC Infrastructure To Solve The CFD Grand Challenge
HPC Infrastructure To Solve The CFD Grand ChallengeHPC Infrastructure To Solve The CFD Grand Challenge
HPC Infrastructure To Solve The CFD Grand ChallengeAnand Haridass
 
Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsAnand Haridass
 
IBM Cloud Paris Meetup - 20190520 - IA & Power
IBM Cloud Paris Meetup - 20190520 - IA & PowerIBM Cloud Paris Meetup - 20190520 - IA & Power
IBM Cloud Paris Meetup - 20190520 - IA & PowerIBM France Lab
 
EXTENT-2017: Heterogeneous Computing Trends and Business Value Creation
EXTENT-2017: Heterogeneous Computing Trends and Business Value CreationEXTENT-2017: Heterogeneous Computing Trends and Business Value Creation
EXTENT-2017: Heterogeneous Computing Trends and Business Value CreationIosif Itkin
 
How to apply the latest advances in hitachi mainframe storage webinar
How to apply the latest advances in hitachi mainframe storage webinarHow to apply the latest advances in hitachi mainframe storage webinar
How to apply the latest advances in hitachi mainframe storage webinarHitachi Vantara
 
Blue line Supermicro Server Building Block Solutions
Blue line Supermicro Server Building Block SolutionsBlue line Supermicro Server Building Block Solutions
Blue line Supermicro Server Building Block SolutionsBlue Line
 
G108277 ds8000-resiliency-lagos-v1905c
G108277 ds8000-resiliency-lagos-v1905cG108277 ds8000-resiliency-lagos-v1905c
G108277 ds8000-resiliency-lagos-v1905cTony Pearson
 
Introduction of Fujitsu's HPC Processor for the Post-K Computer
Introduction of Fujitsu's HPC Processor for the Post-K ComputerIntroduction of Fujitsu's HPC Processor for the Post-K Computer
Introduction of Fujitsu's HPC Processor for the Post-K Computerinside-BigData.com
 
Ibm power systems facts and features power 8
Ibm power systems facts and features  power 8 Ibm power systems facts and features  power 8
Ibm power systems facts and features power 8 Diego Alberto Tamayo
 
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationBladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationCliff Kinard
 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialGanesan Narayanasamy
 
Large Model support and Distribute deep learning
Large Model support and Distribute deep learningLarge Model support and Distribute deep learning
Large Model support and Distribute deep learningGanesan Narayanasamy
 

Tendances (20)

Think Leadership March 2019 - How You Can Have A Piece Of The #1 Supercomputer
Think Leadership March 2019 - How You Can Have A Piece Of The #1 SupercomputerThink Leadership March 2019 - How You Can Have A Piece Of The #1 Supercomputer
Think Leadership March 2019 - How You Can Have A Piece Of The #1 Supercomputer
 
POWER9 AC922 Newell System - HPC & AI
POWER9 AC922 Newell System - HPC & AI POWER9 AC922 Newell System - HPC & AI
POWER9 AC922 Newell System - HPC & AI
 
Palestra IBM-Mack Zvm linux
Palestra  IBM-Mack Zvm linux  Palestra  IBM-Mack Zvm linux
Palestra IBM-Mack Zvm linux
 
HPC Infrastructure To Solve The CFD Grand Challenge
HPC Infrastructure To Solve The CFD Grand ChallengeHPC Infrastructure To Solve The CFD Grand Challenge
HPC Infrastructure To Solve The CFD Grand Challenge
 
Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of Systems
 
IBM Cloud Paris Meetup - 20190520 - IA & Power
IBM Cloud Paris Meetup - 20190520 - IA & PowerIBM Cloud Paris Meetup - 20190520 - IA & Power
IBM Cloud Paris Meetup - 20190520 - IA & Power
 
Deeplearningusingcloudpakfordata
DeeplearningusingcloudpakfordataDeeplearningusingcloudpakfordata
Deeplearningusingcloudpakfordata
 
EXTENT-2017: Heterogeneous Computing Trends and Business Value Creation
EXTENT-2017: Heterogeneous Computing Trends and Business Value CreationEXTENT-2017: Heterogeneous Computing Trends and Business Value Creation
EXTENT-2017: Heterogeneous Computing Trends and Business Value Creation
 
How to apply the latest advances in hitachi mainframe storage webinar
How to apply the latest advances in hitachi mainframe storage webinarHow to apply the latest advances in hitachi mainframe storage webinar
How to apply the latest advances in hitachi mainframe storage webinar
 
IBM HPC Transformation with AI
IBM HPC Transformation with AI IBM HPC Transformation with AI
IBM HPC Transformation with AI
 
PowerAI Deep dive
PowerAI Deep divePowerAI Deep dive
PowerAI Deep dive
 
OpenPOWER Webinar
OpenPOWER Webinar OpenPOWER Webinar
OpenPOWER Webinar
 
Blue line Supermicro Server Building Block Solutions
Blue line Supermicro Server Building Block SolutionsBlue line Supermicro Server Building Block Solutions
Blue line Supermicro Server Building Block Solutions
 
G108277 ds8000-resiliency-lagos-v1905c
G108277 ds8000-resiliency-lagos-v1905cG108277 ds8000-resiliency-lagos-v1905c
G108277 ds8000-resiliency-lagos-v1905c
 
Introduction of Fujitsu's HPC Processor for the Post-K Computer
Introduction of Fujitsu's HPC Processor for the Post-K ComputerIntroduction of Fujitsu's HPC Processor for the Post-K Computer
Introduction of Fujitsu's HPC Processor for the Post-K Computer
 
Ibm power systems facts and features power 8
Ibm power systems facts and features  power 8 Ibm power systems facts and features  power 8
Ibm power systems facts and features power 8
 
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationBladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
 
Large Model support and Distribute deep learning
Large Model support and Distribute deep learningLarge Model support and Distribute deep learning
Large Model support and Distribute deep learning
 
IBM AI at Scale
IBM AI at ScaleIBM AI at Scale
IBM AI at Scale
 

Similaire à Shared-Memory Heterogeneous Computing for HPC and AI

Multiple Shared Processor Pools In Power Systems
Multiple Shared Processor Pools In Power SystemsMultiple Shared Processor Pools In Power Systems
Multiple Shared Processor Pools In Power SystemsAndrey Klyachkin
 
Power 7 Overview
Power 7 OverviewPower 7 Overview
Power 7 Overviewlambertt
 
Capi snap overview
Capi snap overviewCapi snap overview
Capi snap overviewYutaka Kawai
 
Intro to Cell Broadband Engine for HPC
Intro to Cell Broadband Engine for HPCIntro to Cell Broadband Engine for HPC
Intro to Cell Broadband Engine for HPCSlide_N
 
Enterprise power systems transition to power7 technology
Enterprise power systems transition to power7 technologyEnterprise power systems transition to power7 technology
Enterprise power systems transition to power7 technologysolarisyougood
 
Hortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AIHortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AIDataWorks Summit
 
Deploying Massive Scale Graphs for Realtime Insights
Deploying Massive Scale Graphs for Realtime InsightsDeploying Massive Scale Graphs for Realtime Insights
Deploying Massive Scale Graphs for Realtime InsightsNeo4j
 
Transparent Hardware Acceleration for Deep Learning
Transparent Hardware Acceleration for Deep LearningTransparent Hardware Acceleration for Deep Learning
Transparent Hardware Acceleration for Deep LearningIndrajit Poddar
 
#IBMEdge: Flash Storage Session
#IBMEdge: Flash Storage Session#IBMEdge: Flash Storage Session
#IBMEdge: Flash Storage SessionBrocade
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIBM Switzerland
 
Pulse 2011 virtualization and storwize v7000
Pulse 2011 virtualization and storwize v7000Pulse 2011 virtualization and storwize v7000
Pulse 2011 virtualization and storwize v7000Anthony Vandewerdt
 
08 Supercomputer Fugaku
08 Supercomputer Fugaku08 Supercomputer Fugaku
08 Supercomputer FugakuRCCSRENKEI
 
IBM Power Systems - enabling cloud solutions
IBM Power Systems - enabling cloud solutionsIBM Power Systems - enabling cloud solutions
IBM Power Systems - enabling cloud solutionsDavid Spurway
 
SDC20 ScaleFlux.pptx
SDC20 ScaleFlux.pptxSDC20 ScaleFlux.pptx
SDC20 ScaleFlux.pptxssuserabc741
 
April 2014 IBM announcement webcast
April 2014 IBM announcement webcastApril 2014 IBM announcement webcast
April 2014 IBM announcement webcastHELP400
 
AI in Health Care using IBM Systems/OpenPOWER systems
AI in Health Care using IBM Systems/OpenPOWER systemsAI in Health Care using IBM Systems/OpenPOWER systems
AI in Health Care using IBM Systems/OpenPOWER systemsGanesan Narayanasamy
 
AI in Healh Care using IBM POWER systems
AI in Healh Care using IBM POWER systems AI in Healh Care using IBM POWER systems
AI in Healh Care using IBM POWER systems Ganesan Narayanasamy
 
Fujitsu m10 server features and capabilities
Fujitsu m10 server features and capabilitiesFujitsu m10 server features and capabilities
Fujitsu m10 server features and capabilitiessolarisyougood
 

Similaire à Shared-Memory Heterogeneous Computing for HPC and AI (20)

OpenPOWER Seminar at IIT Madras
OpenPOWER Seminar at IIT MadrasOpenPOWER Seminar at IIT Madras
OpenPOWER Seminar at IIT Madras
 
OpenPOWER Update
OpenPOWER UpdateOpenPOWER Update
OpenPOWER Update
 
Multiple Shared Processor Pools In Power Systems
Multiple Shared Processor Pools In Power SystemsMultiple Shared Processor Pools In Power Systems
Multiple Shared Processor Pools In Power Systems
 
Power 7 Overview
Power 7 OverviewPower 7 Overview
Power 7 Overview
 
Capi snap overview
Capi snap overviewCapi snap overview
Capi snap overview
 
Intro to Cell Broadband Engine for HPC
Intro to Cell Broadband Engine for HPCIntro to Cell Broadband Engine for HPC
Intro to Cell Broadband Engine for HPC
 
Enterprise power systems transition to power7 technology
Enterprise power systems transition to power7 technologyEnterprise power systems transition to power7 technology
Enterprise power systems transition to power7 technology
 
Hortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AIHortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AI
 
Deploying Massive Scale Graphs for Realtime Insights
Deploying Massive Scale Graphs for Realtime InsightsDeploying Massive Scale Graphs for Realtime Insights
Deploying Massive Scale Graphs for Realtime Insights
 
Transparent Hardware Acceleration for Deep Learning
Transparent Hardware Acceleration for Deep LearningTransparent Hardware Acceleration for Deep Learning
Transparent Hardware Acceleration for Deep Learning
 
#IBMEdge: Flash Storage Session
#IBMEdge: Flash Storage Session#IBMEdge: Flash Storage Session
#IBMEdge: Flash Storage Session
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
 
Pulse 2011 virtualization and storwize v7000
Pulse 2011 virtualization and storwize v7000Pulse 2011 virtualization and storwize v7000
Pulse 2011 virtualization and storwize v7000
 
08 Supercomputer Fugaku
08 Supercomputer Fugaku08 Supercomputer Fugaku
08 Supercomputer Fugaku
 
IBM Power Systems - enabling cloud solutions
IBM Power Systems - enabling cloud solutionsIBM Power Systems - enabling cloud solutions
IBM Power Systems - enabling cloud solutions
 
SDC20 ScaleFlux.pptx
SDC20 ScaleFlux.pptxSDC20 ScaleFlux.pptx
SDC20 ScaleFlux.pptx
 
April 2014 IBM announcement webcast
April 2014 IBM announcement webcastApril 2014 IBM announcement webcast
April 2014 IBM announcement webcast
 
AI in Health Care using IBM Systems/OpenPOWER systems
AI in Health Care using IBM Systems/OpenPOWER systemsAI in Health Care using IBM Systems/OpenPOWER systems
AI in Health Care using IBM Systems/OpenPOWER systems
 
AI in Healh Care using IBM POWER systems
AI in Healh Care using IBM POWER systems AI in Healh Care using IBM POWER systems
AI in Healh Care using IBM POWER systems
 
Fujitsu m10 server features and capabilities
Fujitsu m10 server features and capabilitiesFujitsu m10 server features and capabilities
Fujitsu m10 server features and capabilities
 

Shared-Memory Heterogeneous Computing for HPC and AI

  • 1. Shared-Memory Heterogeneous Computing for HPC and AI H. Peter Hofstee, Ph.D. IBM ( & TU Delft ) IIIT-B May 3 2019
  • 3. © 2006 IBM Corporation
  • 4. Cell Architecture is … 64b Power Architecture™, incl. coherence/memory, compatible with 32/64b Power Arch. applications and OS’s. [Diagram: COHERENT BUS; Power ISA, MMU/BIU; Power ISA, MMU/BIU; …; IO transl.; Memory.]
  • 5. Cell Architecture is … 64b Power Architecture™ plus Memory Flow Control (MFC). [Diagram: COHERENT BUS; Power ISA +RMT, MMU/BIU +RMT; Power ISA +RMT, MMU/BIU +RMT; IO transl.; Memory; MMU/DMA +RMT, Local Store Memory, LS Alias; MMU/DMA +RMT, Local Store Memory, LS Alias; …]
  • 6. Cell Architecture is … 64b Power Architecture™ + MFC plus Synergistic Processors. [Diagram: COHERENT BUS; Power ISA +RMT, MMU/BIU +RMT; Power ISA +RMT, MMU/BIU +RMT; IO transl.; Memory; MMU/DMA +RMT, Local Store Memory, LS Alias, Syn. Proc. ISA; MMU/DMA +RMT, Local Store Memory, LS Alias, Syn. Proc. ISA; …]
  • 8. Cell BE based Systems: SCEI, IBM, Mercury, …
  • 9. Advanced Cell Blade Prototype; QS20 Advanced Cell-BE Based Blade Prototype
  • 14. Next Step: OpenMP 4.0 and FPGAs
  • 18. Fundamental forces are accelerating change in our industry. Full system stack innovation required. Moore’s Law: IT innovation can no longer come from just the processor. Cognitive, Custom Hyperscale Data Centers, Hybrid Cloud, Open Solutions: IT consumption models are expanding. [Chart: Price/Performance (lower is better), 2000 to 2020; Full Stack Acceleration across Technology and Processors, Firmware / OS, Accelerators, Software, Storage, Network.]
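  • 19. Statement of Direction, Subject to Change
      POWER7 Architecture
        2010, POWER7: 8 cores, 45nm. New Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 65 GB/s. Standard I/O Interconnect: PCIe Gen2. Advanced I/O Signaling: N/A. Advanced I/O Architecture: N/A.
        2012, POWER7+: 8 cores, 32nm. Enhanced Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 65 GB/s. Standard I/O Interconnect: PCIe Gen2. Advanced I/O Signaling: N/A. Advanced I/O Architecture: N/A.
      POWER8 Architecture
        2014, POWER8: 12 cores, 22nm. New Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen3. Advanced I/O Signaling: N/A. Advanced I/O Architecture: CAPI 1.0.
        2016, POWER8 w/ NVLink: 12 cores, 22nm. Enhanced Micro-Architecture With NVLink. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen3. Advanced I/O Signaling: 20 GT/s, 160GB/s. Advanced I/O Architecture: CAPI 1.0, NVLink 1.0.
      POWER9 Architecture
        2017, P9 SO: 24 cores, 14nm. New Micro-Architecture, Direct attach memory, New Process Technology. Sustained Memory Bandwidth: Up To 150 GB/s. Standard I/O Interconnect: PCIe Gen4 x48 (192GB/s). Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI3.0, NVLink2.0.
        2018, P9 SU: 24 cores, 14nm. Enhanced Micro-Architecture, Buffered Memory. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen4 x48. Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI3.0, NVLink2.0.
        2019, P9 w/ Adv. I/O: 24 cores, 14nm. Enhanced Micro-Architecture, New Memory Subsystem. Sustained Memory Bandwidth: Up To 350 GB/s. Standard I/O Interconnect: PCIe Gen4 x48. Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI4.0, NVLink3.0.
      POWER10
        2020+, P10: TBD cores. New Micro-Architecture, New Technology. Sustained Memory Bandwidth: Up To 435 GB/s. Standard I/O Interconnect: PCIe Gen5. Advanced I/O Signaling: 32 & 50 GT/s. Advanced I/O Architecture: TBD.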
  • 20. J. Stuecheli, IBM: OpenPOWER Summit Europe 2018
  • 21. Wistron Power9 MiHawk; Power9 Zaius/Barreleye G2; 1Tb/s (10x 100Gb/s) demo!
  • 22. WISTRON “MiHawk”: 24 x NVMe = 96 lanes Gen3 PCIe = 48 lanes Gen4 PCIe = 32 lanes OpenCAPI 3.0. OpenCAPI! Image Source: Wistron
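The lane equivalence on the MiHawk slide can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: it assumes each NVMe drive uses a x4 link (the slide gives only the 96-lane total), uses the standard 8 GT/s and 16 GT/s per-lane rates for PCIe Gen3 and Gen4, and takes 25 GT/s per OpenCAPI 3.0 lane from the roadmap slide. It compares raw signaling rates and ignores encoding and protocol overhead.

```python
# Per-lane signaling rates in GT/s. The PCIe values are the standard Gen3/Gen4
# rates; 25 GT/s for OpenCAPI 3.0 is the figure quoted on the roadmap slide.
GT_PER_LANE = {"PCIe Gen3": 8, "PCIe Gen4": 16, "OpenCAPI 3.0": 25}

LANES_PER_NVME = 4  # assumption: one x4 link per NVMe drive


def aggregate_gt(link, lanes):
    """Raw aggregate signaling rate (GT/s) of `lanes` lanes of type `link`."""
    return GT_PER_LANE[link] * lanes


nvme_lanes = 24 * LANES_PER_NVME               # 96 lanes
gen3 = aggregate_gt("PCIe Gen3", nvme_lanes)   # 768 GT/s
gen4 = aggregate_gt("PCIe Gen4", 48)           # 768 GT/s
ocapi = aggregate_gt("OpenCAPI 3.0", 32)       # 800 GT/s

print(nvme_lanes, gen3, gen4, ocapi)
```

Under these assumptions, 48 Gen4 lanes match 96 Gen3 lanes exactly, and 32 OpenCAPI 3.0 lanes offer slightly more raw signaling capacity than either.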
  • 23. [System diagram: two POWER9 processors, each with 1-2TB DDR4 Memory (>1000Gb/s); links labeled 400Gb/s (x2), ~400Gb/s (x2), 500Gb/s (x2), and 800Gb/s (x2); Option: VU37p (400+GB/s HBM) on each side.]
  • 24. AlphaData ‘9H7 and ‘9V3 with OpenCAPI! Source: AlphaData. Source: IBM
  • 26. [Diagram, first panel: Apache Spark JVM; Memory (off-heap); Serialize / Deserialize; Network; Disk; Accelerator; Native library; Application; Python library.] [Diagram, second panel: Apache Spark JVM; Shared data set / memory in Arrow format; Network; Storage; FPGA Accelerator; Native library; Application; Python tool; Apache Arrow libraries; Fletcher.] J. Peltenburg, et al., TU Delft (OpenPOWER Summit USA 2018)
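The contrast on the Fletcher slide, a serialize/deserialize path versus a data set shared in memory in one common format, can be illustrated with the Python standard library alone. This is a loose sketch, not the Arrow or Fletcher API: an `array` of 64-bit integers stands in for a columnar buffer, `pickle` stands in for the serialization step, and a `memoryview` stands in for a consumer that reads the shared buffer in place.

```python
import array
import pickle

# A contiguous column of 64-bit integers, standing in for a columnar buffer.
column = array.array("q", range(1_000))

# Path 1: serialize, then deserialize. The consumer gets an independent copy.
blob = pickle.dumps(column)
copied = pickle.loads(blob)
copied[0] = -1
copy_is_independent = column[0] == 0    # the original is unaffected

# Path 2: share the buffer. The consumer sees the same memory, with no copy.
shared = memoryview(column)
shared[1] = -2
view_is_shared = column[1] == -2        # the write is visible to the producer

print(len(blob), shared.nbytes, copy_is_independent, view_is_shared)
```

The first path pays for a full copy in each direction; the second passes only a reference to memory that both sides already agree how to interpret.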
  • 27. [Block diagram, Layer A and Layer B: AWS EC2 F1 (Host, PCIe, AWS F1 Shell, AXI Interconnect 2:1, DDR Controller, DDR SDRAM (on board)); POWER8 CAPI (Host Memory, CAPP, PCIe, PSL, SNAP); AXI4 bus master side and AXI4 Lite MMIO; Fletcher Interconnect (N:1); Column Readers; Regex units 0 … N-1, each with regex matchers 0 … R-1 and accumulators (R no. matches). Legend: general dataflow, MMIO, DMA, off-board components, non-FPGA components.] R=16 different regular expressions per unit. AWS EC2 F1: Virtex Ultrascale+; N=16 regex units; 256 regexes being matched in parallel. POWER8 CAPI (Supervessel, & soon at Nimbix): AlphaData KU3 (Kintex Ultrascale); N=8 regex units; 128 regexes being matched in parallel. [Plot: throughput in GB/s (0 to 3) versus log2(Bytes) (11 to 30) for AWS EC2 F1 (16 units) and POWER8+CAPI (8 units).]
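The regex-matching design on this slide can be modeled in software to show where the 256 and 128 figures come from. The sketch below is a hypothetical software analogue, not the FPGA implementation: each `regex_unit` applies R patterns to its share of the strings and keeps one match counter per pattern, and `run_units` splits the input across N units and sums the counters. The assumption that each unit reads a disjoint slice of the column, and the names `regex_unit`, `run_units`, and `parallel_matchers`, are illustrative only.

```python
import re


def regex_unit(patterns, strings):
    """One unit: R matchers, each with its own match counter (accumulator)."""
    compiled = [re.compile(p) for p in patterns]
    counts = [0] * len(compiled)
    for s in strings:
        for i, rx in enumerate(compiled):
            if rx.search(s):
                counts[i] += 1
    return counts


def run_units(patterns, strings, n_units):
    """Split the strings across n_units units and add up the per-unit counters."""
    chunks = [strings[u::n_units] for u in range(n_units)]
    per_unit = [regex_unit(patterns, chunk) for chunk in chunks]
    return [sum(column) for column in zip(*per_unit)]


def parallel_matchers(n_units, r_per_unit):
    """Number of regex matchers active at once: N units times R matchers."""
    return n_units * r_per_unit


R = 16
patterns = [rf"\bkey{i:02d}\b" for i in range(R)]
strings = [f"row {n} has key{n % 20:02d}" for n in range(200)]

totals = run_units(patterns, strings, n_units=8)
print(totals)
print(parallel_matchers(16, R), parallel_matchers(8, R))
```

With R=16, the two configurations on the slide give 16 × 16 = 256 and 8 × 16 = 128 matchers working in parallel, and splitting the input across units does not change the per-pattern totals.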
  • 29. A closer look at Summit & Sierra: #1 & #2 in HPC … and > 3 ExaOp AI!
  • 30. Weather GRAF video or still image
  • 31. IBM Global High-Resolution Atmospheric Forecasting System
  • 32. 5x Faster Data Communication with Unique CPU-GPU NVLink High-Speed Connection. IBM AC922 Power System Deep Learning Server (4-GPU Config). Store Large Models in System Memory; Operate on One Layer at a Time; Fast Transfer via NVLink. [Diagram: two Power 9 CPUs, each with 1TB Memory (170GB/s) and two V100 GPUs; NVLink links at 150 GB/s; a 64 GB/s (x2) link.] © 2018 IBM Corporation
  • 37. Source: L. Grinberg, OpenPOWER Summit Europe
  • 38. Source: L. Grinberg, OpenPOWER Summit Europe
  • 39. IBM Open Source Based AI Stack: Accelerated AC922 Power9 Servers; Storage (Spectrum Scale ESS); Watson Studio; SnapML; WML CE; Runtime Environment; Train, Deploy, Manage Models; Watson OpenScale; Model Metrics, Bias, and Fairness Monitoring; Watson Machine Learning; Watson ML CE; Watson ML Accelerator; Data Preparation; Model Development Environment; Auto-AI software: PowerAI Vision, IBM Auto-AI. Previous names: WML Accelerator = PowerAI Enterprise; WML Community Ed. = PowerAI-base. Runs on x86 & other storage too.
  • 40. Caffe with LMS (Large Model Support): Runtime of 1000 Iterations. GoogleNet model on Enlarged ImageNet Dataset (2240x2240). Xeon x86 2640v4 w/ 4x V100 GPUs: 3.1 Hours. Power AC922 w/ 4x V100 GPUs: 49 Mins. 3.8x Faster. [Bar chart: Time (secs), 0 to 12000.]
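The 3.8x figure on this slide follows directly from the two runtimes it quotes. A quick check, using only the slide's own numbers (3.1 hours and 49 minutes):

```python
xeon_seconds = 3.1 * 3600    # 3.1 hours on the Xeon system
power_seconds = 49 * 60      # 49 minutes on the Power AC922

speedup = xeon_seconds / power_seconds
print(f"{xeon_seconds:.0f} s vs {power_seconds} s -> {speedup:.1f}x")
```

Both runtimes fall inside the 0 to 12000 second range of the chart axis, and the ratio rounds to the stated 3.8x.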
  • 41. 3D U-Net on Tensorflow with Large Memory Support. Source: IBM Research, Zurich
  • 42. Goal: Predict whether a user will click on a given advert based on an anonymized set of features. Train: Fit model parameters using 4.2 billion examples. Inference: Evaluate model on 180 million unseen examples. Labels: +1: click, -1: no click. [Figure: sparse data matrix (2.3TB), 4.2 billion examples by 1 million features, with a column of +1/-1 labels.] * Criteo Labs. 2015. Criteo Releases Industry's Largest-Ever Dataset for Machine Learning to Academic Community. https://www.criteo.com/news/press-releases/2015/07/criteo-releases-industrys-largest-ever-dataset/
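The click-prediction task on this slide (sparse features, labels of +1 or -1, fit parameters and then evaluate) is the setting of a sparse linear classifier. The slide does not say which model or solver was used, so the sketch below is only a generic illustration: logistic regression trained by plain stochastic gradient descent on a four-example toy data set. The feature indices, learning rate, and function names are all made up for the example.

```python
import math


def margin(weights, features):
    """Dot product of a sparse weight map and a sparse example {index: value}."""
    return sum(weights.get(j, 0.0) * v for j, v in features.items())


def sgd_epoch(weights, data, lr=0.1):
    """One pass of SGD on the logistic loss log(1 + exp(-y * w.x))."""
    for features, label in data:
        m = label * margin(weights, features)
        # Negative gradient of the loss w.r.t. w_j is label * x_j / (1 + exp(m)).
        scale = label / (1.0 + math.exp(m))
        for j, v in features.items():
            weights[j] = weights.get(j, 0.0) + lr * scale * v
    return weights


# Toy data: each example touches only 2 of 4 features; +1 = click, -1 = no click.
data = [
    ({0: 1.0, 3: 1.0}, +1),
    ({1: 1.0, 2: 1.0}, -1),
    ({0: 1.0, 2: 1.0}, +1),
    ({1: 1.0, 3: 1.0}, -1),
]

weights = {}
for _ in range(50):
    sgd_epoch(weights, data)

predictions = [1 if margin(weights, x) > 0 else -1 for x, _ in data]
print(predictions)
```

Storing each example as a map from feature index to value means an update touches only the features present in that example, which is what makes a matrix with a million mostly-empty columns tractable.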
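  • 45. Statement of Direction, Subject to Change
      POWER7 Architecture
        2010, POWER7: 8 cores, 45nm. New Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 65 GB/s. Standard I/O Interconnect: PCIe Gen2. Advanced I/O Signaling: N/A. Advanced I/O Architecture: N/A.
        2012, POWER7+: 8 cores, 32nm. Enhanced Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 65 GB/s. Standard I/O Interconnect: PCIe Gen2. Advanced I/O Signaling: N/A. Advanced I/O Architecture: N/A.
      POWER8 Architecture
        2014, POWER8: 12 cores, 22nm. New Micro-Architecture, New Process Technology. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen3. Advanced I/O Signaling: N/A. Advanced I/O Architecture: CAPI 1.0.
        2016, POWER8 w/ NVLink: 12 cores, 22nm. Enhanced Micro-Architecture With NVLink. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen3. Advanced I/O Signaling: 20 GT/s, 160GB/s. Advanced I/O Architecture: CAPI 1.0, NVLink 1.0.
      POWER9 Architecture
        2017, P9 SO: 24 cores, 14nm. New Micro-Architecture, Direct attach memory, New Process Technology. Sustained Memory Bandwidth: Up To 150 GB/s. Standard I/O Interconnect: PCIe Gen4 x48. Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI3.0, NVLink2.0.
        2018, P9 SU: 24 cores, 14nm. Enhanced Micro-Architecture, Buffered Memory. Sustained Memory Bandwidth: Up To 210 GB/s. Standard I/O Interconnect: PCIe Gen4 x48. Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI3.0, NVLink2.0.
        2019, P9 w/ Adv. I/O: 24 cores, 14nm. Enhanced Micro-Architecture, New Memory Subsystem. Sustained Memory Bandwidth: Up To 350 GB/s. Standard I/O Interconnect: PCIe Gen4 x48. Advanced I/O Signaling: 25 GT/s, 300GB/s. Advanced I/O Architecture: CAPI 2.0, OpenCAPI4.0, NVLink3.0.
      POWER10
        2020+, P10: TBD cores. New Micro-Architecture, New Technology. Sustained Memory Bandwidth: Up To 435 GB/s. Standard I/O Interconnect: PCIe Gen5. Advanced I/O Signaling: 32 & 50 GT/s. Advanced I/O Architecture: TBD.
  • 46. J. Stuecheli, IBM: OpenPOWER Summit Europe 2018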
  • 54. Copyright © 2019 by International Business Machines Corporation. All rights reserved. No part of this document may be reproduced or transmitted in any form without written permission from IBM Corporation. Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. This document could include technical inaccuracies or typographical errors. IBM may make improvements and/or changes in the product(s) and/or program(s) described herein at any time without notice. Any statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. References in this document to IBM products, programs, or services do not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this document is not intended to state or imply that only that program product may be used. Any functionally equivalent program, that does not infringe IBM's intellectual property rights, may be used instead. THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted, if at all, according to the terms and conditions of the agreements (e.g., IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources.
IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. IBM makes no representations or warranties, expressed or implied, regarding non-IBM products and services. The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
  • 55. IBM, the IBM logo, ibm.com, IBM System Storage, IBM Spectrum Storage, IBM Spectrum Control, IBM Spectrum Protect, IBM Spectrum Archive, IBM Spectrum Virtualize, IBM Spectrum Scale, IBM Spectrum Accelerate, Softlayer, and XIV are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.shtml The following are trademarks or registered trademarks of other companies. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. IT Infrastructure Library is a Registered Trade Mark of AXELOS Limited. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. ITIL is a Registered Trade Mark of AXELOS Limited. UNIX is a registered trademark of The Open Group in the United States and other countries. * All other products may be trademarks or registered trademarks of their respective companies.
Notes: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. This presentation and the claims outlined in it were reviewed for compliance with US law. 
Adaptations of these claims for use in other geographies must be reviewed by the local country counsel for compliance with local laws.
  • 56. This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area. Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA. All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied. All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions. IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.
IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies. All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment.