Japan's post K Computer

Yutaka Ishikawa
Project Leader
RIKEN AICS
HPC User Forum, 7th September, 2016
Outline of Talk
• Introduction of FLAGSHIP2020 project
• An Overview of post K system
• Concluding Remarks
2016/09/07 HPC User Forum @ Austin
An Overview of Flagship 2020 project
• Developing the next Japanese flagship computer, the so-called “post K”
• Developing a wide range of application codes to run on the “post K” and solve major social and scientific issues
Vendor partner
The Japanese government selected 9 social and scientific priority issues and their R&D organizations.
Co-design
Target Applications
Architectural Parameters
• #SIMD, SIMD length, #core, #NUMA node
• cache (size and bandwidth)
• memory technologies
• specialized hardware
• Interconnect
• I/O network
Target Applications’ Characteristics
Target application program | Brief description | Co-design
① GENESIS | MD for proteins | Collective comm. (all-to-all), floating-point perf. (FPP)
② Genomon | Genome processing (genome alignment) | File I/O, integer perf.
③ GAMERA | Earthquake simulator (FEM in unstructured & structured grid) | Comm., memory bandwidth
④ NICAM+LETKF | Weather prediction system using Big data (structured grid stencil & ensemble Kalman filter) | Comm., memory bandwidth, file I/O, SIMD
⑤ NTChem | Molecular electronic structure calculation | Collective comm. (all-to-all, allreduce), FPP, SIMD
⑥ FFB | Large Eddy Simulation (unstructured grid) | Comm., memory bandwidth
⑦ RSDFT | An ab-initio program (density functional theory) | Collective comm. (bcast), FPP
⑧ Adventure | Computational Mechanics System for Large Scale Analysis and Design (unstructured grid) | Comm., memory bandwidth, SIMD
⑨ CCS-QCD | Lattice QCD simulation (structured grid Monte Carlo) | Comm., memory bandwidth, collective comm. (allreduce)
Co-design
Architectural Parameters
• #SIMD, SIMD length, #core, #NUMA node
• cache (size and bandwidth)
• memory technologies
• specialized hardware
• Interconnect
• I/O network
Target Applications
• Mutual understanding between computer architecture/system software and applications
• Looking at performance predictions
• Finding the best solution under constraints, e.g., power consumption, budget, and space
Prediction Tool
• Prediction of node-level performance: profiling applications, e.g., cache misses and execution unit usage
• Prediction of scalability (communication cost)
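The two kinds of prediction named on this slide can be illustrated with textbook models. The sketch below is not the project's prediction tool: it assumes a roofline bound for node-level performance and a latency-bandwidth (alpha-beta) estimate for a recursive-doubling allreduce, ignoring the cost of the reduction arithmetic. All function names and numbers are hypothetical.

```python
def roofline_gflops(peak_gflops, mem_bw_gbs, flops_per_byte):
    """Attainable GFLOP/s under the roofline model (assumed model).

    A kernel is limited either by peak compute or by memory bandwidth
    multiplied by its arithmetic intensity (flops per byte moved).
    """
    return min(peak_gflops, mem_bw_gbs * flops_per_byte)


def allreduce_time_s(n_procs, msg_bytes, alpha_s, beta_s_per_byte):
    """Alpha-beta estimate for a recursive-doubling allreduce.

    Assumes ceil(log2(P)) exchange steps, each costing one latency term
    (alpha) plus the message size times the per-byte transfer time (beta).
    """
    steps = (n_procs - 1).bit_length()  # ceil(log2(n_procs)) for n_procs >= 1
    return steps * (alpha_s + msg_bytes * beta_s_per_byte)


if __name__ == "__main__":
    # Hypothetical node: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
    for intensity in (0.25, 1.0, 10.0, 40.0):
        print(intensity, roofline_gflops(1000.0, 100.0, intensity))

    # Hypothetical network: 1 microsecond latency, 1 ns per byte.
    for procs in (16, 1024, 65536):
        print(procs, allreduce_time_s(procs, 8, 1e-6, 1e-9))
```

Sweeping an architectural parameter such as memory bandwidth through a model like this shows which target applications move from memory-bound to compute-bound, which is the sort of trade-off co-design weighs against power, budget, and space.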
R&D Organization
International Collaboration
• DOE-MEXT
• JLESC
• …
Communities
• HPCI Consortium
• PC Cluster Consortium
• OpenHPC
• …
Domestic Collaboration
• Univ. of Tsukuba
• Univ. of Tokyo
• Univ. of Kyoto
An Overview of post K
• Hardware
  – Manycore architecture
  – 6D mesh/torus interconnect
  – 3-level hierarchical storage system
    - Silicon disk
    - Magnetic disk
    - Storage for archive
[Figure: system diagram with login servers, maintenance servers, portal servers, the I/O network, and the hierarchical storage system]
• System Software
  – Multi-kernel: Linux with a light-weight kernel
  – File I/O middleware for the 3-level hierarchical storage system and applications
  – Application-oriented file I/O middleware
  – MPI+OpenMP programming environment
  – Highly productive programming languages and libraries
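The 6D mesh/torus interconnect listed under Hardware can be pictured through its neighbor relation: each node has a coordinate per dimension, torus dimensions wrap around, and mesh dimensions stop at the edge. The sketch below is a generic illustration only. The slides do not give the dimension sizes or which dimensions wrap, so the shape and wrap flags used here are hypothetical.

```python
def neighbors(coord, shape, wraps):
    """Return the direct neighbors of `coord` in an N-dimensional mesh/torus.

    coord -- tuple of per-dimension coordinates
    shape -- tuple of per-dimension sizes
    wraps -- tuple of booleans: True for a torus (wrap-around) dimension,
             False for a mesh dimension
    """
    result = []
    for axis, (c, size, wrap) in enumerate(zip(coord, shape, wraps)):
        for step in (-1, 1):
            n = c + step
            if wrap:
                n %= size              # torus: wrap around the ring
            elif not 0 <= n < size:
                continue               # mesh: no link past the edge
            if n == c:
                continue               # dimension of size 1 has no neighbor
            candidate = coord[:axis] + (n,) + coord[axis + 1:]
            if candidate not in result:  # a size-2 ring reaches one node twice
                result.append(candidate)
    return result


if __name__ == "__main__":
    # Hypothetical 6D shape, with every dimension treated as a torus.
    shape = (4, 4, 4, 2, 3, 2)
    wraps = (True,) * 6
    print(neighbors((0, 0, 0, 0, 0, 0), shape, wraps))
```

With this hypothetical shape the origin node has ten distinct neighbors: two in each of the three size-4 rings, two in the size-3 ring, and one in each size-2 dimension.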
What we have done
• Software
  – OS functional design
  – Communication functional design
  – File I/O functional design
  – Programming languages
  – Mathematical libraries
• Hardware
  – Instruction set architecture
Continue to design
• Node architecture
• System configuration
• Storage system
Instruction Set Architecture
• ARMv8 HPC extension: SVE (Scalable Vector Extension)
  – Fujitsu is a lead partner in the development of the ARM HPC extension
  – Detailed features were announced at Hot Chips 28 (2016)
    http://www.hotchips.org/program/
    Mon 8/22, Day 1, 9:45 AM, “GPUs & HPCs” session: “ARMv8-A Next Generation Vector Architecture for HPC”
• Features inherited from Fujitsu
  – FMA
  – Math acceleration primitives
  – Inter-core barrier
  – Sector cache
  – Hardware prefetch assist
OS Kernel
• Requirements of an OS kernel targeting high-end HPC
  – Noiseless execution environment for bulk-synchronous applications
  – Ability to easily adapt to new/future system architectures
    - E.g., manycore CPUs, heterogeneous core architectures, deep memory hierarchy
    - New process/thread management, memory management, …
  – Ability to adapt to new/future application demands
    - Big-data, in-situ applications
    - Support data flow from Internet devices to compute nodes
    - Optimize data movement
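The first requirement, a noiseless environment for bulk-synchronous applications, follows from how such programs advance: every step ends at a barrier, so each step takes as long as its slowest rank. The sketch below is a toy model under stated assumptions, not a measurement. Each rank is assumed to be hit independently by one fixed-length OS interruption per step with a small probability, and all names and numbers are hypothetical.

```python
import random


def prob_step_delayed(n_ranks, p_noise):
    """Probability that at least one of n_ranks is interrupted in a step."""
    return 1.0 - (1.0 - p_noise) ** n_ranks


def simulate_bsp(n_ranks, n_steps, work_s, p_noise, noise_s, seed=0):
    """Total run time of a bulk-synchronous program under random OS noise.

    Every step costs the time of the slowest rank, because all ranks
    wait at the barrier that ends the step.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_steps):
        slowest = work_s
        for _ in range(n_ranks):
            delay = noise_s if rng.random() < p_noise else 0.0
            slowest = max(slowest, work_s + delay)
        total += slowest
    return total


if __name__ == "__main__":
    # Hypothetical: 1 ms of work per step, 1% chance of a 0.5 ms interruption.
    for ranks in (1, 10, 100, 1000):
        print(ranks,
              round(prob_step_delayed(ranks, 0.01), 4),
              round(simulate_bsp(ranks, 200, 1e-3, 0.01, 0.5e-3), 4))
```

In this model an interruption that touches a single rank only 1% of the time delays almost every step once a thousand ranks synchronize, which is why noise that looks negligible on one node matters at scale.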
Full-Weight Kernel (FWK), e.g., Linux
  Approach: Disabling, removing, tuning, reimplementing, and adding new features
  Pros.: Large community support results in rapid adaptation to new hardware
  Cons.:
  • Hard to implement a new feature if the original mechanism conflicts with it
  • Hard to follow the latest kernel distribution because of large local modifications
Light-Weight Kernel (LWK)
  Approach: Implementation from scratch and adding new features
  Pros.: Easy to extend because it is small in terms of logic and code size
  Cons.:
  • Applications that run on an FWK cannot always run on an LWK
  • Small community maintenance limits rapid growth
  • Lack of device drivers
Our Approach: Linux with Light-Weight Kernel
• Partition resources (CPU cores, memory)
  – Full Linux kernel on some cores
    - System daemons and in-situ non-HPC applications
    - Device drivers
  – Light-weight kernel (LWK), McKernel, on other cores
    - HPC applications
McKernel, developed at RIKEN
• McKernel is a loadable module of Linux
• McKernel supports the Linux API
• McKernel runs on
  – Intel Xeon and Xeon Phi
  – Fujitsu FX10
[Figure: a node's cores split into two partitions. The Linux partition runs system daemons and in-situ non-HPC applications, with the TCP stack, device drivers, VFS, file system drivers, complex memory management, and a general scheduler. The McKernel partition (thin LWK) runs HPC applications, with process/thread management and very simple memory management. Both expose the Linux API (glibc, /sys/, /proc/).]
McKernel is deployed on the Oakforest-PACS supercomputer (25 PF peak) at JCAHPC, which is organized by the U. of Tsukuba and the U. of Tokyo.
OS: McKernel
• Linux kernel + loadable LWK, McKernel
  – The Linux kernel is resident, and daemons such as the job scheduler run on Linux
  – McKernel is dynamically reloaded (rebooted) for each application
[Figure: timeline. App A, requiring an LWK without a scheduler, is invoked and finishes; App B, requiring an LWK with a scheduler, is invoked and finishes; App C, using full Linux capability, is invoked and finishes.]
OS: McKernel
Results of FWQ (Fixed Work Quanta)
https://asc.llnl.gov/sequoia/benchmarks
[Figure: FWQ results in two panels, “Linux with isolcpus” and “McKernel”]
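The idea behind a Fixed Work Quanta measurement is to repeat an identical piece of work many times and record how long each repetition takes; on a noiseless system every sample would be the same, so the spread exposes interference. The sketch below illustrates that idea only and is not the LLNL benchmark referenced on the slide. The function names, the work loop, and the summary statistics are assumptions made for illustration.

```python
import time


def fixed_work(n_iters):
    """One quantum: a fixed amount of integer work."""
    acc = 0
    for i in range(n_iters):
        acc += i
    return acc


def run_fwq(n_samples, n_iters):
    """Time n_samples repetitions of the same quantum, in nanoseconds."""
    samples = []
    for _ in range(n_samples):
        start = time.perf_counter_ns()
        fixed_work(n_iters)
        samples.append(time.perf_counter_ns() - start)
    return samples


def noise_summary(samples_ns):
    """Summarize the spread of quantum times.

    The fastest sample approximates the undisturbed time; slower samples
    indicate interference. Expects at least one sample, all positive.
    """
    fastest = min(samples_ns)
    slowest = max(samples_ns)
    return {
        "min_ns": fastest,
        "max_ns": slowest,
        "max_slowdown": slowest / fastest,
    }


if __name__ == "__main__":
    print(noise_summary(run_fwq(n_samples=200, n_iters=100_000)))
```

A Python loop adds jitter of its own from the interpreter, so a sketch like this shows only the shape of the measurement; comparing kernels as on the slide needs a compiled benchmark pinned to an isolated core.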
Concluding Remarks
• Post K’s CPU is based on ARMv8 with the HPC extension
  – Changing the architecture will improve usability over the K computer
  – Wider community support
• The system software stack for Post K is being designed and implemented by leveraging international collaborations
  – The software stack developed at RIKEN is open source
  – It also runs on Intel Xeon and Xeon Phi
• RIKEN would like to contribute to OpenHPC