SlideShare une entreprise Scribd logo
1  sur  16
Télécharger pour lire hors ligne
Japan’s post K Computer
Yutaka Ishikawa
Project Leader
RIKEN AICS
HPC User Forum, 7th September, 2016
Outline of Talk
 Introduction of FLAGSHIP2020 project
 An Overview of post K system
 Concluding Remarks
2016/09/07 2HP User Forum @ Austin
An Overview of Flagship 2020 project
 Developing the next Japanese flagship
computer, so-called “post K”
 Developing a wide range of application
codes, to run on the “post K”, to solve
major social and science issues
Vendor partner
The Japanese government selected 9
social & scientific priority issues and
their R&D organizations.
3
Co-design
4
Target ApplicationsArchitectural Parameters
• #SIMD, SIMD length, #core, #NUMA node
• cache (size and bandwidth)
• memory technologies
• specialized hardware
• Interconnect
• I/O network
Target Applications’ Characteristics
5
Target Application
Program Brief description Co-design
① GENESIS MD for proteins Collective comm. (all-to-all), Floating point perf (FPP)
② Genomon Genome processing (Genome alignment) File I/O, Integer Perf.
③ GAMERA
Earthquake simulator (FEM in unstructured
& structured grid)
Comm., Memory bandwidth
④ NICAM+LETK
Weather prediction system using Big data
(structured grid stencil & ensemble Kalman
filter)
Comm., Memory bandwidth, File I/O, SIMD
⑤ NTChem molecular electronic (structure calculation) Collective comm. (all-to-all, allreduce), FPP, SIMD,
⑥ FFB Large Eddy Simulation (unstructured grid) Comm., Memory bandwidth,
⑦ RSDFT
an ab-initio program (density functional
theory)
Collective comm. (bcast), FFP
⑧ Adventure
Computational Mechanics System for Large
Scale Analysis and Design (unstructured
grid)
Comm., Memory bandwidth, SIMD
⑨ CCS-QCD
Lattice QCD simulation (structured grid
Monte Carlo)
Comm., Memory bandwidth, Collective comm. (allreduce)
Co-design
6
Architectural Parameters
• #SIMD, SIMD length, #core, #NUMA node
• cache (size and bandwidth)
• memory technologies
• specialized hardware
• Interconnect
• I/O network
Target Applications
 Mutual understanding both
computer architecture/system software and applications
 Looking at performance predictions
 Finding out the best solution with constraints, e.g., power
consumption, budget, and space
Prediction of node-level
performance
Profiling applications,
e.g., cache misses and
execution unit usages
Prediction Tool
Prediction of scalability
(Communication cost)
R&D Organization
7
• DOE-MEXT
• JLESC
• …
International
Collaboration
• HPCI Consortium
• PC Cluster Consortium
• OpenHPC
• …
Communities
• Univ. of Tsukuba
• Univ. of Tokyo
• Univ. of Kyoto
Domestic
Collaboration
Outline of Talk
 Introduction of FLAGSHIP2020 project
 An Overview of post K system
 Concluding Remarks
2016/09/07 8HP User Forum @ Austin
An Overview of post K
 Hardware
 Manycore architecture
 6D mesh/torus Interconnect
 3-level hierarchical storage system
 Silicon Disk
 Magnetic Disk
 Storage for archive
Login
Servers
Maintenance
Servers
I/O Network
……
…
…
…
…
…
…
…
…
…
… Hierarchical
Storage System
Portal
Servers
 System Software
 Multi-Kernel: Linux with Light-weight Kernel
 File I/O middleware for 3-level hierarchical storage
system and application
 Application-oriented file I/O middleware
 MPI+OpenMP programming environment
 Highly productive programing language and libraries
2016/09/07 9HP User Forum @ Austin
What we have done
 Software
 OS functional design
 Communication functional
design
 File I/O functional design
 Programming languages
 Mathematical libraries
• Node architecture
• System configuration
• Storage system
Continue to design
 Hardware
 Instruction set architecture
2016/09/07 10HP User Forum @ Austin
Instruction Set Architecture
 ARM V8 HPC Extension
 Fujitsu is a lead partner of ARM HPC extension development
 Detailed features were announced at Hot Chips 28 - 2016
http://www.hotchips.org/program/
Mon 8/22 Day1 9:45AM GPUs & HPCs
ARMv8-A Next Generation Vector Architecture for
HPC
 Fujitsu’s inheritances
 FMA
 Math acceleration primitives
 Inter core barrier
 Sector cache
 Hardware prefetch assist
SVE (Scalable Vector Extension)
2016/09/07 11HP User Forum @ Austin
OS Kernel
 Requirements of OS Kernel targeting high-end HPC
 Noiseless execution environment for bulk-synchronous applications
 Ability to easily adapt to new/future system architectures
 E.g.: manycore CPUs, heterogenous core architectures, deep memory
hierarchy, etc.
- New process/thread management, memory management, …
 Ability to adapt to new/future application demand
 Big-Data, in-situ applications
- Support data flow from Internet devices to compute nodes
- Optimize data movement
Approach Pros. Cons.
Full-Weight
Kernel
(FWK)
e.g. Linux
Disabling, removing, tuning,
reimplementation, and adding
new features
Large community support results in
rapid new hardware adaptation
• Hard to implement a new feature if the
original mechanism is conflicted with
the new feature
• Hard to follow the latest kernel
distribution due to local large
modifications
Light-Weight
Kernel
(LWK)
Implementation from scratch
and adding new features
Easy to extend it because of small in
terms of logic and code size
• Applications, running on FWK, cannot
run always in LWK
• Small community maintenance limits
rapid growth
• Lack of device drivers
12
Our Approach:
Linux with Light-Weight Kernel
 Partition resources (CPU cores,
memory)
 Full Linux kernel on some cores
 System daemons and in-situ non
HPC applications
 Device drivers
 Light-weight kernel(LWK),
McKernel on other cores
 HPC applications
McKernel developed at RIKEN
 McKernel is loadable module of Linux
 McKernel supports Linux API
 McKernel runs on
 Intel Xeon and Xeon phi
 Fujitsu FX10
13
Very simple
memory
management
Thin LWK
Process/Thread
management
General
scheduler
Complex
Mem. Mngt.
Linux
TCP stack
Dev. Drivers
VFS
File Sys
Driers
Memory
… …
Interrupt
System
daemons
?
HPC Applications
PartitionPartition
In-situ non HPC application
Linux API (glibc, /sys/, /proc/)
Core Core Core Core Core Core
McKernel is deployed to the Oakforest-PACS
supercomputer, 25 PF in peak, at JCAHPC
organized by U. of Tsukuba and U. of Tokyo
2016/09/07 HP User Forum @ Austin
OS: McKernel
14
• Linux Kernel+Loadable LWK, McKernel
– Linux Kernel is resident, and daemons for job scheduler and etc. run on Linux
– McKernel is dynamically reloaded (rebooted) for each application
Finish
App A, requiring
LWK-without-
scheduler, Is invoked
App B, requiring
LWK-with-scheduler,
Is invoked
Finish
AppC,usingfullLinux
capability,Isinvoked
Finish
2016/09/07 HP User Forum @ Austin
OS: McKernel
15
https://asc.llnl.gov/sequoia/benchmarks
Linux with isolcpus McKernel
Results of FWQ (Fixed Work Quanta)
2016/09/07 HP User Forum @ Austin
Concluding Remarks
16
 Post K’s CPU is based on ARM V8 with HPC extension
 The usability will be improved than the K computer by changing
architecture
 More wide-range community support
 The system software stack for Post K is being designed and implemented
with the leverage of international collaborations
 The software stack developed at RIKEN is Open source
 It also runs on Intel Xeon and Xeon phi
 RIKEN would like to contribute to OpenHPC
2016/09/07 HP User Forum @ Austin

Contenu connexe

Tendances

BXI: Bull eXascale Interconnect
BXI: Bull eXascale InterconnectBXI: Bull eXascale Interconnect
BXI: Bull eXascale Interconnectinside-BigData.com
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Linaro
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLinside-BigData.com
 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialGanesan Narayanasamy
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
NNSA Explorations: ARM for Supercomputing
NNSA Explorations: ARM for SupercomputingNNSA Explorations: ARM for Supercomputing
NNSA Explorations: ARM for Supercomputinginside-BigData.com
 
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...KTN
 
OpenHPC: A Comprehensive System Software Stack
OpenHPC: A Comprehensive System Software StackOpenHPC: A Comprehensive System Software Stack
OpenHPC: A Comprehensive System Software Stackinside-BigData.com
 
Open Hardware and Future Computing
Open Hardware and Future ComputingOpen Hardware and Future Computing
Open Hardware and Future ComputingGanesan Narayanasamy
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaLinaro
 

Tendances (20)

ARM HPC Ecosystem
ARM HPC EcosystemARM HPC Ecosystem
ARM HPC Ecosystem
 
BXI: Bull eXascale Interconnect
BXI: Bull eXascale InterconnectBXI: Bull eXascale Interconnect
BXI: Bull eXascale Interconnect
 
@IBM Power roadmap 8
@IBM Power roadmap 8 @IBM Power roadmap 8
@IBM Power roadmap 8
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
 
An Update on Arm HPC
An Update on Arm HPCAn Update on Arm HPC
An Update on Arm HPC
 
Arm in HPC
Arm in HPCArm in HPC
Arm in HPC
 
POWER9 for AI & HPC
POWER9 for AI & HPCPOWER9 for AI & HPC
POWER9 for AI & HPC
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 
OpenPOWER Latest Updates
OpenPOWER Latest UpdatesOpenPOWER Latest Updates
OpenPOWER Latest Updates
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and ML
 
Sx 6-single-node
Sx 6-single-nodeSx 6-single-node
Sx 6-single-node
 
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER TutorialSCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
SCFE 2020 OpenCAPI presentation as part of OpenPWOER Tutorial
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
NNSA Explorations: ARM for Supercomputing
NNSA Explorations: ARM for SupercomputingNNSA Explorations: ARM for Supercomputing
NNSA Explorations: ARM for Supercomputing
 
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
 
Ac922 cdac webinar
Ac922 cdac webinarAc922 cdac webinar
Ac922 cdac webinar
 
OpenHPC: A Comprehensive System Software Stack
OpenHPC: A Comprehensive System Software StackOpenHPC: A Comprehensive System Software Stack
OpenHPC: A Comprehensive System Software Stack
 
Open Hardware and Future Computing
Open Hardware and Future ComputingOpen Hardware and Future Computing
Open Hardware and Future Computing
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
 
POWER10 innovations for HPC
POWER10 innovations for HPCPOWER10 innovations for HPC
POWER10 innovations for HPC
 

Similaire à Japan's post K supercomputer overview

The Why and How of HPC-Cloud Hybrids with OpenStack - Lev Lafayette, Universi...
The Why and How of HPC-Cloud Hybrids with OpenStack - Lev Lafayette, Universi...The Why and How of HPC-Cloud Hybrids with OpenStack - Lev Lafayette, Universi...
The Why and How of HPC-Cloud Hybrids with OpenStack - Lev Lafayette, Universi...OpenStack
 
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsHPCC Systems
 
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AIArm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AIinside-BigData.com
 
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...Kento Aoyama
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageMayaData Inc
 
Systems Support for Many Task Computing
Systems Support for Many Task ComputingSystems Support for Many Task Computing
Systems Support for Many Task ComputingEric Van Hensbergen
 
Exploring emerging technologies in the HPC co-design space
Exploring emerging technologies in the HPC co-design spaceExploring emerging technologies in the HPC co-design space
Exploring emerging technologies in the HPC co-design spacejsvetter
 
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)Heiko Joerg Schick
 
Designing HPC & Deep Learning Middleware for Exascale Systems
Designing HPC & Deep Learning Middleware for Exascale SystemsDesigning HPC & Deep Learning Middleware for Exascale Systems
Designing HPC & Deep Learning Middleware for Exascale Systemsinside-BigData.com
 
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...NETWAYS
 
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...PT Datacomm Diangraha
 
05 Preparing for Extreme Geterogeneity in HPC
05 Preparing for Extreme Geterogeneity in HPC05 Preparing for Extreme Geterogeneity in HPC
05 Preparing for Extreme Geterogeneity in HPCRCCSRENKEI
 
A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exas...
A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exas...A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exas...
A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exas...inside-BigData.com
 
Ap 06 4_10_simek
Ap 06 4_10_simekAp 06 4_10_simek
Ap 06 4_10_simekNguyen Vinh
 
Designing HPC, Deep Learning, and Cloud Middleware for Exascale Systems
Designing HPC, Deep Learning, and Cloud Middleware for Exascale SystemsDesigning HPC, Deep Learning, and Cloud Middleware for Exascale Systems
Designing HPC, Deep Learning, and Cloud Middleware for Exascale Systemsinside-BigData.com
 
Superfluid Deployment of Virtual Functions: Exploiting Mobile Edge Computing ...
Superfluid Deployment of Virtual Functions: Exploiting Mobile Edge Computing ...Superfluid Deployment of Virtual Functions: Exploiting Mobile Edge Computing ...
Superfluid Deployment of Virtual Functions: Exploiting Mobile Edge Computing ...Stefano Salsano
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersRyousei Takano
 

Similaire à Japan's post K supercomputer overview (20)

The Why and How of HPC-Cloud Hybrids with OpenStack - Lev Lafayette, Universi...
The Why and How of HPC-Cloud Hybrids with OpenStack - Lev Lafayette, Universi...The Why and How of HPC-Cloud Hybrids with OpenStack - Lev Lafayette, Universi...
The Why and How of HPC-Cloud Hybrids with OpenStack - Lev Lafayette, Universi...
 
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC Systems
 
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AIArm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI
 
NWU and HPC
NWU and HPCNWU and HPC
NWU and HPC
 
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
Systems Support for Many Task Computing
Systems Support for Many Task ComputingSystems Support for Many Task Computing
Systems Support for Many Task Computing
 
LPC4300_two_cores
LPC4300_two_coresLPC4300_two_cores
LPC4300_two_cores
 
Exploring emerging technologies in the HPC co-design space
Exploring emerging technologies in the HPC co-design spaceExploring emerging technologies in the HPC co-design space
Exploring emerging technologies in the HPC co-design space
 
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
 
ODP Presentation LinuxCon NA 2014
ODP Presentation LinuxCon NA 2014ODP Presentation LinuxCon NA 2014
ODP Presentation LinuxCon NA 2014
 
Designing HPC & Deep Learning Middleware for Exascale Systems
Designing HPC & Deep Learning Middleware for Exascale SystemsDesigning HPC & Deep Learning Middleware for Exascale Systems
Designing HPC & Deep Learning Middleware for Exascale Systems
 
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
OSMC 2019 | Monitoring Alerts and Metrics on Large Power Systems Clusters by ...
 
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
 
05 Preparing for Extreme Geterogeneity in HPC
05 Preparing for Extreme Geterogeneity in HPC05 Preparing for Extreme Geterogeneity in HPC
05 Preparing for Extreme Geterogeneity in HPC
 
A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exas...
A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exas...A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exas...
A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exas...
 
Ap 06 4_10_simek
Ap 06 4_10_simekAp 06 4_10_simek
Ap 06 4_10_simek
 
Designing HPC, Deep Learning, and Cloud Middleware for Exascale Systems
Designing HPC, Deep Learning, and Cloud Middleware for Exascale SystemsDesigning HPC, Deep Learning, and Cloud Middleware for Exascale Systems
Designing HPC, Deep Learning, and Cloud Middleware for Exascale Systems
 
Superfluid Deployment of Virtual Functions: Exploiting Mobile Edge Computing ...
Superfluid Deployment of Virtual Functions: Exploiting Mobile Edge Computing ...Superfluid Deployment of Virtual Functions: Exploiting Mobile Edge Computing ...
Superfluid Deployment of Virtual Functions: Exploiting Mobile Edge Computing ...
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computers
 

Plus de inside-BigData.com

Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networksinside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Updateinside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODinside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Erainside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Clusterinside-BigData.com
 
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...inside-BigData.com
 

Plus de inside-BigData.com (20)

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networks
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Update
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
 
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
Efficient Model Selection for Deep Neural Networks on Massively Parallel Proc...
 

Dernier

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 

Dernier (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 

Japan's post K supercomputer overview

  • 1. Japan’s post K Computer Yutaka Ishikawa Project Leader RIKEN AICS HPC User Forum, 7th September, 2016
  • 2. Outline of Talk  Introduction of FLAGSHIP2020 project  An Overview of post K system  Concluding Remarks 2016/09/07 2HP User Forum @ Austin
  • 3. An Overview of Flagship 2020 project  Developing the next Japanese flagship computer, so-called “post K”  Developing a wide range of application codes, to run on the “post K”, to solve major social and science issues Vendor partner The Japanese government selected 9 social & scientific priority issues and their R&D organizations. 3
  • 4. Co-design 4 Target ApplicationsArchitectural Parameters • #SIMD, SIMD length, #core, #NUMA node • cache (size and bandwidth) • memory technologies • specialized hardware • Interconnect • I/O network
  • 5. Target Applications’ Characteristics 5 Target Application Program Brief description Co-design ① GENESIS MD for proteins Collective comm. (all-to-all), Floating point perf (FPP) ② Genomon Genome processing (Genome alignment) File I/O, Integer Perf. ③ GAMERA Earthquake simulator (FEM in unstructured & structured grid) Comm., Memory bandwidth ④ NICAM+LETK Weather prediction system using Big data (structured grid stencil & ensemble Kalman filter) Comm., Memory bandwidth, File I/O, SIMD ⑤ NTChem molecular electronic (structure calculation) Collective comm. (all-to-all, allreduce), FPP, SIMD, ⑥ FFB Large Eddy Simulation (unstructured grid) Comm., Memory bandwidth, ⑦ RSDFT an ab-initio program (density functional theory) Collective comm. (bcast), FFP ⑧ Adventure Computational Mechanics System for Large Scale Analysis and Design (unstructured grid) Comm., Memory bandwidth, SIMD ⑨ CCS-QCD Lattice QCD simulation (structured grid Monte Carlo) Comm., Memory bandwidth, Collective comm. (allreduce)
  • 6. Co-design 6 Architectural Parameters • #SIMD, SIMD length, #core, #NUMA node • cache (size and bandwidth) • memory technologies • specialized hardware • Interconnect • I/O network Target Applications  Mutual understanding both computer architecture/system software and applications  Looking at performance predictions  Finding out the best solution with constraints, e.g., power consumption, budget, and space Prediction of node-level performance Profiling applications, e.g., cache misses and execution unit usages Prediction Tool Prediction of scalability (Communication cost)
  • 7. R&D Organization 7 • DOE-MEXT • JLESC • … International Collaboration • HPCI Consortium • PC Cluster Consortium • OpenHPC • … Communities • Univ. of Tsukuba • Univ. of Tokyo • Univ. of Kyoto Domestic Collaboration
  • 8. Outline of Talk  Introduction of FLAGSHIP2020 project  An Overview of post K system  Concluding Remarks 2016/09/07 8HP User Forum @ Austin
  • 9. An Overview of post K  Hardware  Manycore architecture  6D mesh/torus Interconnect  3-level hierarchical storage system  Silicon Disk  Magnetic Disk  Storage for archive Login Servers Maintenance Servers I/O Network …… … … … … … … … … … … Hierarchical Storage System Portal Servers  System Software  Multi-Kernel: Linux with Light-weight Kernel  File I/O middleware for 3-level hierarchical storage system and application  Application-oriented file I/O middleware  MPI+OpenMP programming environment  Highly productive programing language and libraries 2016/09/07 9HP User Forum @ Austin
  • 10. What we have done  Software  OS functional design  Communication functional design  File I/O functional design  Programming languages  Mathematical libraries • Node architecture • System configuration • Storage system Continue to design  Hardware  Instruction set architecture 2016/09/07 10HP User Forum @ Austin
  • 11. Instruction Set Architecture  ARM V8 HPC Extension  Fujitsu is a lead partner of ARM HPC extension development  Detailed features were announced at Hot Chips 28 - 2016 http://www.hotchips.org/program/ Mon 8/22 Day1 9:45AM GPUs & HPCs ARMv8-A Next Generation Vector Architecture for HPC  Fujitsu’s inheritances  FMA  Math acceleration primitives  Inter core barrier  Sector cache  Hardware prefetch assist SVE (Scalable Vector Extension) 2016/09/07 11HP User Forum @ Austin
  • 12. OS Kernel  Requirements of OS Kernel targeting high-end HPC  Noiseless execution environment for bulk-synchronous applications  Ability to easily adapt to new/future system architectures  E.g.: manycore CPUs, heterogenous core architectures, deep memory hierarchy, etc. - New process/thread management, memory management, …  Ability to adapt to new/future application demand  Big-Data, in-situ applications - Support data flow from Internet devices to compute nodes - Optimize data movement Approach Pros. Cons. Full-Weight Kernel (FWK) e.g. Linux Disabling, removing, tuning, reimplementation, and adding new features Large community support results in rapid new hardware adaptation • Hard to implement a new feature if the original mechanism is conflicted with the new feature • Hard to follow the latest kernel distribution due to local large modifications Light-Weight Kernel (LWK) Implementation from scratch and adding new features Easy to extend it because of small in terms of logic and code size • Applications, running on FWK, cannot run always in LWK • Small community maintenance limits rapid growth • Lack of device drivers 12 Our Approach: Linux with Light-Weight Kernel
  • 13.  Partition resources (CPU cores, memory)  Full Linux kernel on some cores  System daemons and in-situ non HPC applications  Device drivers  Light-weight kernel(LWK), McKernel on other cores  HPC applications McKernel developed at RIKEN  McKernel is loadable module of Linux  McKernel supports Linux API  McKernel runs on  Intel Xeon and Xeon phi  Fujitsu FX10 13 Very simple memory management Thin LWK Process/Thread management General scheduler Complex Mem. Mngt. Linux TCP stack Dev. Drivers VFS File Sys Driers Memory … … Interrupt System daemons ? HPC Applications PartitionPartition In-situ non HPC application Linux API (glibc, /sys/, /proc/) Core Core Core Core Core Core McKernel is deployed to the Oakforest-PACS supercomputer, 25 PF in peak, at JCAHPC organized by U. of Tsukuba and U. of Tokyo 2016/09/07 HP User Forum @ Austin
  • 14. OS: McKernel 14 • Linux Kernel+Loadable LWK, McKernel – Linux Kernel is resident, and daemons for job scheduler and etc. run on Linux – McKernel is dynamically reloaded (rebooted) for each application Finish App A, requiring LWK-without- scheduler, Is invoked App B, requiring LWK-with-scheduler, Is invoked Finish AppC,usingfullLinux capability,Isinvoked Finish 2016/09/07 HP User Forum @ Austin
  • 15. OS: McKernel 15 https://asc.llnl.gov/sequoia/benchmarks Linux with isolcpus McKernel Results of FWQ (Fixed Work Quanta) 2016/09/07 HP User Forum @ Austin
  • 16. Concluding Remarks 16  Post K’s CPU is based on ARM V8 with HPC extension  The usability will be improved than the K computer by changing architecture  More wide-range community support  The system software stack for Post K is being designed and implemented with the leverage of international collaborations  The software stack developed at RIKEN is Open source  It also runs on Intel Xeon and Xeon phi  RIKEN would like to contribute to OpenHPC 2016/09/07 HP User Forum @ Austin