State of ARM-based HPC

LTD20-106
24 March 2020
Welcome!
1. This is not our first rodeo…
a. Mont Blanc - https://www.montblanc-project.eu/wp-content/uploads/2017/12/UCHPC_Presentation_PDF_lw.pdf
b. Linaro Connect - http://connect.linaro.org.s3.amazonaws.com/sfo17/Presentations/SFO17-200K1.pdf
c. Linaro Connect - https://connect.linaro.org/resources/san19/san19-400k1/
d. Arm - https://developer.arm.com/solutions/hpc
2. The answer to whether AArch64/Arm64 can do HPC is a resounding yes!
Typical components of an HPC system
1. Common components.
a. As near identical configuration per node as possible.
b. A method of interconnecting nodes.
2. A job scheduler.
a. Slurm workload manager
b. Univa grid engine
c. ...and others or ways to parallelise across nodes.
3. CPU / RAM / Interconnect / Storage
Is that enough?
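To make the job scheduler's role concrete, here is a minimal sketch of first-fit placement in arrival order across near-identical nodes. It is a toy model, not how Slurm or Univa Grid Engine work internally; the names `Node` and `schedule_first_fit`, the node names and the core counts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One compute node with a pool of free cores (hypothetical model)."""
    name: str
    free_cores: int
    jobs: list = field(default_factory=list)


def schedule_first_fit(jobs, nodes):
    """Place (job_name, cores) requests in arrival order on the first node that fits.

    Returns (placed, waiting): a dict of job -> node name, and a list of
    jobs that no node could take right now.
    """
    placed, waiting = {}, []
    for job_name, cores in jobs:
        target = next((n for n in nodes if n.free_cores >= cores), None)
        if target is None:
            waiting.append(job_name)
            continue
        target.free_cores -= cores
        target.jobs.append(job_name)
        placed[job_name] = target.name
    return placed, waiting


# Two identically configured nodes, as the slide recommends.
nodes = [Node("node01", 64), Node("node02", 64)]
placed, waiting = schedule_first_fit(
    [("a", 48), ("b", 32), ("c", 16), ("d", 64)], nodes
)
print(placed, waiting)
```

Real schedulers add priorities, backfill, fair-share and multi-node jobs on top of this basic placement decision.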
Components
1. Core volume/density.
a. We used to count the number of simultaneous processes by the number of physical CPUs.
i. In each node we look at number of CPUs
ii. The number of cores
iii. The number of threads
1. Is threading intentionally disabled?
iv. Is NUMA supported?
v. Whether those CPUs are cache-coherent.
2. Levels of Cache
L0 - Macro-op cache
L1 - for each core
L2 - for each cluster of cores
L3 - for each cluster of CPUs
L1 is typically split into separate Instruction and Data caches; L2 and L3 are typically unified.
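The counting questions above (CPUs, cores, threads, and whether threading is disabled) reduce to simple arithmetic. A minimal sketch follows; the class `NodeTopology` and its fields are hypothetical, and the example figures (two sockets, 32 cores each, four threads per core) are illustrative assumptions rather than the specification of any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeTopology:
    """Per-node processor layout (hypothetical helper for illustration)."""
    sockets: int
    cores_per_socket: int
    threads_per_core: int
    smt_enabled: bool = True  # threading may be intentionally disabled

    @property
    def physical_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def logical_cpus(self) -> int:
        # With threading disabled, each core runs a single hardware thread.
        threads = self.threads_per_core if self.smt_enabled else 1
        return self.physical_cores * threads


# Illustrative dual-socket node; the numbers are assumptions.
node = NodeTopology(sockets=2, cores_per_socket=32, threads_per_core=4)
no_smt = NodeTopology(sockets=2, cores_per_socket=32, threads_per_core=4,
                      smt_enabled=False)
print(node.physical_cores, node.logical_cpus, no_smt.logical_cpus)
```

The scheduler sees `logical_cpus` as the upper bound on simultaneous processes per node; NUMA and cache coherency then decide how well those processes share memory.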
Chips
● Arm v8.0-A (Advanced SIMD/Neon, 32 x 128-bit registers)
○ Ampere eMag 8180
○ Cavium ThunderX
○ Qualcomm Kryo
● Arm v8.1-A
○ Marvell ThunderX2 (28core variant) - Astra Supercomputer (dual-socket)
○ Marvell ThunderX2 (32core variant) - Isambard Supercomputer (dual-socket)
● Arm v8.2-A
○ Arm Neoverse N1
○ Fujitsu A64FX (+SVE) - Fugaku Supercomputer (single-socket)
○ Huawei Kunpeng 920
○ NVIDIA Carmel
○ Ampere Altra (v8.2+)
● Arm v8.3-A (SIMD Complex Number rotation support and Nested Virtualisation support)
○ Marvell ThunderX3 (v8.3+) 2020
○ Huawei Kunpeng 930 (almost v8.4 + SVE) 2021
https://en.wikipedia.org/wiki/ARM_architecture
Chips
● Arm v8.6-A (Neoverse N2 ‘Zeus’ to be used in the European Processor Initiative)
○ General Matrix Multiply (GEMM)
○ Bfloat16 format support
○ SIMD matrix manipulation instructions, BFDOT, BFMMLA, BFMLAL and BFCVT
○ Enhancements for virtualization, system management and security
● Arm SVE2
○ Fine-grained data-level parallelism
Support for v8.6-A and SVE2 is planned for GCC 10 and LLVM/Clang 9
Announced April 2019
https://en.wikipedia.org/wiki/ARM_architecture
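As background for the Bfloat16 bullet: bfloat16 keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of an IEEE 754 float32. The sketch below converts in software using round-to-nearest-even, which is an assumption made for illustration; it does not claim to reproduce the exact behaviour of the BFCVT instruction. The function names are hypothetical.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (round-to-nearest-even assumed)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # NaN: keep it a NaN by forcing a mantissa bit that survives truncation.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add just under half an output ULP, plus one more if the kept LSB is odd.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float32(b: int) -> float:
    """Widen a bfloat16 bit pattern back to float32 by zero-filling the low bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x


b = float32_to_bfloat16_bits(3.14159)
print(hex(b), bfloat16_bits_to_float32(b))
```

The round trip shows the trade-off: the dynamic range of float32 is preserved, but only about two to three decimal digits of precision remain.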
RISC, CISC, ACCELERATOR
● The ARM ISA is a RISC implementation
○ Do simple operations highly efficiently.
○ Most operations are designed to complete in a single clock cycle, which enables pipelining.
● A CISC implementation
○ Do simple instructions like RISC, but also offer complex instructions that take more than one clock cycle. Pipelining is more cumbersome.
● Accelerators
○ Do bespoke actions as quickly as possible, even asynchronously.
● The challenge:
○ Can an ARM ISA extended with accelerator-style operations be as effective as a CISC + plug-in Accelerator?
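Why single-cycle operations help pipelining can be seen with the idealized textbook model: once a k-stage pipeline is full, it retires one instruction per cycle. The sketch below uses that model only; the function names and the stage and stall counts are hypothetical and do not describe any specific Arm or x86 core.

```python
def unpipelined_cycles(n_instructions: int, stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * stages


def pipelined_cycles(n_instructions: int, stages: int, stall_cycles: int = 0) -> int:
    """Ideal pipeline: fill it once, then retire one instruction per cycle.

    stall_cycles models extra cycles lost to multi-cycle instructions or hazards.
    """
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1) + stall_cycles


n, k = 1000, 5
print(unpipelined_cycles(n, k))                    # no overlap between instructions
print(pipelined_cycles(n, k))                      # uniform single-cycle operations
print(pipelined_cycles(n, k, stall_cycles=300))    # some multi-cycle operations
```

The `stall_cycles` term is where complex multi-cycle instructions show up in this model: the more of them there are, the further the pipeline drifts from one instruction per cycle.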
Interconnects
● Between up to 128 cores there is the Arm CMN-600 Coherent Mesh Network, within a single chassis
● Between chassis there are:
○ PCIe
○ CCIX
○ CXL?
○ Aries
○ Tofu
● Network options
○ InfiniBand - Low latency
○ Ethernet
Adaptive Compute Acceleration
https://www.xilinx.com/products/silicon-devices/acap/versal-premium.html
Resilience
● ECC Memory
● Dual power-supplies
● Core fault sensing
● ...Containers?
Blending Containers
● Containers are packaged environments that make applications easy to run by bundling their dependencies.
● Multiple containers can work together as building blocks of a larger solution.
● Subject to operational requirements, containers can be built to run on a variety of platforms.
○ From SBC to HPC!
● With the right scheduler and orchestration tool, jobs become:
○ Auto-built/tested
○ Parallelised
○ Flexible
○ Scalable
○ On-demand
Storage is still required...
● DRAM is volatile
● Virtual disks are ephemeral
● Diskless nodes
● Persistent storage is still needed:
○ File systems
■ ext4, LVM, XFS, ZFS
○ Parallel file systems
■ Lustre
○ Distributed storage
■ Ceph
○ Media
■ Conventional disks
■ SSD, NVMe
Applications
What does HPC enable...
● 292 Libraries/Applications tested for AArch64 - https://gitlab.com/arm-hpc/packages/-/wikis/home
● Weather prediction
○ Although Scalable Probabilistic Approximation might be more efficient… https://advances.sciencemag.org/content/6/5/eaaw0961
● Molecular Dynamics
○ GROMACS supports SIMD NEON operations
○ SIMD algorithms for ARM SVE scheduled for 2021: https://redmine.gromacs.org/issues/2806
● AI
All things Cloud...
● IDC - Worldwide Server Market Revenue Declined 11.6% Year Over Year in the Second Quarter of 2019 https://www.idc.com/getdoc.jsp?containerId=prUS45482519
● The COVID-19 pandemic caused stock market falls of 20% (March 2020).
https://www.wired.com/story/covid-19-spreads-listen-stock-market/
● Working remotely is now the norm.
● Scalable on-demand services bring Serverless Computing.
The Linaro Datacenter & Cloud Group (LDCG)
● Common development center for the Arm Server & Infrastructure ecosystem
● Eliminates fragmentation, reduces cost and accelerates time to market
● Members can focus on innovation and differentiated value-add
● Working on core open-source software for ARM servers
○ Server architecture – UEFI/ACPI/ServerReady
○ ARMv8 enablement & optimization
○ Big Data, BigTop, Hadoop and Spark
○ Cloud Infrastructure such as Kubernetes, OpenStack and Ceph
Linaro Developer Cloud
Enterprise-class Arm Powered servers hosted in the UK are available for development, test, CI and cloud deployments for VMs and containers.
www.linaro.cloud
LDCG High Performance Computing (HPC) SIG
Collaborative project building on the work of the Linaro Datacenter & Cloud Group
Lower deployment & management barriers
Leverage the Linaro Developer Cloud and other services to develop cost-effective Cloud-integrated HPC development frameworks and generate reference implementations to accelerate
Member-driven with Advisory Board
Members determine work completed by engineering resources while the advisory board provides subject matter expertise on HPC requirements, and guidance and feedback on the ongoing HPC SIG strategic direction and roadmap
Driving datacenter-class, open-source HPC development on Arm
Identify and adopt standards to make HPC deployment on Arm a commercial imperative. Develop real-world use cases that reap the benefits of Arm while ensuring interoperability, modularization and orchestration
Functions-as-a-Service
● Linaro HPC hardware being reconfigured towards a scalable environment.
○ A combination of OpenStack, K8S and OpenHPC.
○ A testbed to verify combinations of heterogeneous ingredients for the optimal recipes.
● Service Consumers
○ Send the service request and receive the service answer.
○ The service consumer will be CPU-, GPU-, ISA- and accelerator-agnostic!
If the equipment is billed as pay-per-use, then it is our challenge to ensure that AArch64 solutions match a significant number of requests.
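The idea of an architecture-agnostic service consumer can be sketched as a small dispatcher: the consumer sends a request and receives an answer, and only the service knows which backend ran it. Everything here (the class `FunctionService`, its methods, and the backend labels) is hypothetical and is not part of OpenStack, Kubernetes or OpenHPC.

```python
from typing import Callable, Dict, Optional, Tuple

Backend = Callable[[bytes], bytes]


class FunctionService:
    """Toy function-as-a-service front end (illustrative assumption only)."""

    def __init__(self) -> None:
        # Insertion order doubles as the default placement preference.
        self._backends: Dict[str, Backend] = {}

    def register(self, arch: str, backend: Backend) -> None:
        self._backends[arch] = backend

    def invoke(self, payload: bytes,
               preferred: Optional[str] = None) -> Tuple[str, bytes]:
        """Run the request on the preferred backend if present, else the first one."""
        if not self._backends:
            raise RuntimeError("no backends registered")
        arch = preferred if preferred in self._backends else next(iter(self._backends))
        return arch, self._backends[arch](payload)


service = FunctionService()
service.register("aarch64", lambda p: p.upper())
service.register("x86_64", lambda p: p.upper())

# The consumer only sees the answer; the placement is the operator's concern.
arch, answer = service.invoke(b"hello")
print(arch, answer)
```

Under pay-per-use billing, the placement policy inside `invoke` is where AArch64 would have to win requests on cost or performance.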
Thank you
Continuing to accelerate deployment of your
Arm-based solutions through collaboration
hpc@linaro.org