QNIBTerminal plus InfiniBand 
Containerized MPI Workloads 
insideHPC Edition 
Slides slightly modified compared to the HPC Advisory Council presentation 
Christian Kniep 
2014-11-05
Agenda 
• Docker in a Nutshell 
• QNIBTerminal 
• Testbed 
• MPI Benchmark 
• HPCG-Results 
• Future Work 
• Conclusion 
Docker in a Nutshell 
• (chroot on steroids)² 
• Builds on top of Linux Containers (LXC) 
• Kernel namespaces (isolation) 
• cgroups (resource mgmt) 
• intuitive build system 
• public repositories 
• Red Hat backing
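The two kernel features named on this slide can be inspected from inside any Linux process. The following is a minimal sketch, not part of the original deck: it assumes a Linux host with /proc mounted, and the helper names parse_ns_link and parse_cgroup_file are hypothetical. It reads the namespace links under /proc/self/ns and the cgroup membership in /proc/self/cgroup.

```python
import os
import re


def parse_ns_link(target):
    """Split a /proc/<pid>/ns symlink target such as 'net:[4026531992]'."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def parse_cgroup_file(text):
    """Parse /proc/<pid>/cgroup lines of the form 'id:controllers:path'."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        names = controllers.split(",") if controllers else []
        entries.append((int(hierarchy_id), names, path))
    return entries


if __name__ == "__main__":
    ns_dir = "/proc/self/ns"  # only present on Linux
    if os.path.isdir(ns_dir):
        for name in sorted(os.listdir(ns_dir)):
            try:
                print(parse_ns_link(os.readlink(os.path.join(ns_dir, name))))
            except (OSError, ValueError) as error:
                print(f"skipping {name}: {error}")
    if os.path.exists("/proc/self/cgroup"):
        with open("/proc/self/cgroup") as handle:
            for entry in parse_cgroup_file(handle.read()):
                print(entry)
```

Run once on the host and once inside a container: differing namespace inode numbers indicate separate namespaces, and the cgroup paths show which resource groups the process belongs to.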
Traditional vs. Lightweight Layers 
[Figure: two layer stacks side by side, each marked with an IB label. 
Traditional Virtualisation: SERVER, HOST KERNEL, HYPERVISOR, then per guest a KERNEL, a Userland (OS) and a SERVICE. 
Containerisation: SERVER, HOST KERNEL, then per container a Userland (OS) and a SERVICE.]
QNIBTerminal 
Motivation 
[Figures, three slides: Plain Metrics; Plain Log Events; Overlap Metrics/Log Events]
QNIBTerminal 
Overview 
One Node Setup 
• All network traffic over bridge 
• Crippled MPI workload 
[Figure: containers running on the single node, under the group labels shown on the slide. 
Log/Events: elk (elasticsearch, logstash, kibana) 
Services: dns (helixdns, etcd), haproxy, slurmctld 
Compute: compute0 ... compute<N> (slurmd) 
Performance: grafana, graphite-api, graphite-web, carbon]
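To illustrate how a metric could reach the carbon container in this stack, here is a small sketch using Graphite's plaintext protocol (one line per sample: path, value, Unix timestamp). It is not taken from the deck. The host name "carbon", the metric path and both function names are assumptions. Port 2003 is carbon's default plaintext listener.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one Graphite plaintext line: path, value, timestamp, newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="carbon", port=2003, timeout=2.0):
    """Send one plaintext line to a carbon listener over TCP.

    The host name "carbon" is an assumed container name, not a documented one.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric name; printing only, so no listener is required.
    print(format_metric("compute0.load.shortterm", 0.42), end="")
```

Calling send_metric(format_metric(...)) only succeeds where a carbon listener is reachable under the assumed name, so the example prints the line instead of sending it.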
Testbed 
• 8 nodes (CentOS 7, 2x 4-core Xeon, 32 GB, Mellanox ConnectX-2) 
• 3 containers on top (CentOS 6, CentOS 7, Ubuntu 12) 
• SLURM Resource Scheduler 
• 1 native partition 
• 3 container partitions 
• Multiple Open MPI versions installed 
• gcc versions
MPI Benchmark 
• osu-micro-benchmarks-4.4.1 
• osu_alltoall with two tasks on two hosts 
$ mpirun -np 2 -H venus001,venus002 $(pwd)/osu_alltoall 
# OSU MPI All-to-All Personalized Exchange Latency Test v4.4.1 
# Size Avg Latency(us) 
1 1.83 
2 1.82 
4 1.74 
8 1.63 
16 1.62 
32 1.68 
64 1.80 
128 2.77 
256 3.11 
512 3.51 
Note: the MPI benchmark was not in the original HPC Advisory Council presentation.
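The output above can be reduced to a single number per run. The sketch below is an illustration rather than the author's tooling: it parses the two-column osu_alltoall output and averages the latencies for message sizes from 1 to 64 bytes, which is one plausible reading of the "avg(1B->64B)" label used in the Open MPI comparison. The function names are hypothetical.

```python
OSU_OUTPUT = """\
# OSU MPI All-to-All Personalized Exchange Latency Test v4.4.1
# Size Avg Latency(us)
1 1.83
2 1.82
4 1.74
8 1.63
16 1.62
32 1.68
64 1.80
128 2.77
256 3.11
512 3.51
"""


def parse_osu_latency(text):
    """Return {message_size_bytes: avg_latency_us} from OSU benchmark output."""
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '#' header lines
        size, latency = line.split()[:2]
        results[int(size)] = float(latency)
    return results


def mean_latency(results, min_size=1, max_size=64):
    """Average the latencies whose message size lies in [min_size, max_size]."""
    selected = [lat for size, lat in results.items() if min_size <= size <= max_size]
    if not selected:
        raise ValueError("no samples in the requested size range")
    return sum(selected) / len(selected)


if __name__ == "__main__":
    latencies = parse_osu_latency(OSU_OUTPUT)
    print(f"avg(1B->64B) = {mean_latency(latencies):.2f} us")
```

For the run shown on this slide the seven samples from 1 to 64 bytes sum to 12.12 us, so the script prints an average of 1.73 us.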
MPI Benchmark 
distribution’s results [2 tasks @ 2 nodes] 
[Figure: latency [us] (axis 0 to 5) over Message Size (axis 4 to 1024, labelled KB) for native, cos7, cos6, u12]
MPI Benchmark 
Open MPI comparison [2 tasks @ 2 nodes, avg(1B->64B)] 
[Figure: latency [us] (axis 0 to 2.8) per distribution (native, cos7, cos6, u12) for Open MPI 1.5.4, 1.6.4 and 1.8.3. Labels on the chart: native oMPI 1.6.4, cos7 oMPI 1.6.4, cos6 oMPI 1.5.4, u12 oMPI 1.5.4]
HPCG Benchmark 
• mimics thermodynamic application workload 
• Linpack corrective / successor in the long-term?
HPCG Benchmark 
distribution’s results 
[Figure: GFLOP/s (axis 3 to 6) for native, cos7, cos6, u12] 

          OS             Open MPI   gcc 
native    CentOS 7.0     1.6.4      4.8.2 
cos7      CentOS 7.0     1.6.4      4.8.2 
cos6      CentOS 6.5     1.5.4      4.4.7 
u12       Ubuntu 12.04   1.5.4      4.6.3 

HPCG Benchmark 
Open MPI comparison 
[Figure: GFLOP/s (axis 3 to 6) per distribution (native, cos7, cos6, u12) for Open MPI 1.5.4, 1.6.4 and 1.8.4]
Future Work 
• Compare with tuned bare-metal 
• Tune Docker installation 
• Use of SR-IOV (Keynote earlier today) 
• Compare different frameworks to orchestrate 
• Security evaluations 
• Benchmark real-world applications
Conclusion 
• Abstraction bare-metal / application works fine 
• Bare-metal kernel provides access to IB 
• Container in charge from MPI upwards 
• Out-of-the-box: container beats bare-metal 
• Continuous testing/deployment of containerized workloads 
• Bunch of tooling within docker ecosystem 
• Low performance overhead
La Fin 
• Contact 
• @CQnib / @qnibinc (@_qnib on the final slide) 
• christian@qnib.org 
• http://qnib.org 
• Paper: http://doc.qnib.org/ 
• Interested? 
• Docker Pitch today 
• Internal Evaluations 
• Workshops / Talks 
• Questions? 
Photo: https://www.flickr.com/photos/dharmabum1964/3108162671
