Feedback on
Big Compute & HPC
on Windows Azure
Antoine Poliakov
HPC Consultant
ANEO
apoliakov@aneo.fr
http://blog.aneo.eu

Innovation Recherche
Introduction

HPC : a challenge for the cloud
• Cloud: on-demand access, through a telecommunications network, to shared and user-configurable IT resources
• HPC (High Performance Computing): a branch of computer science concerned with maximizing software efficiency, in particular in terms of execution speed
  – Raw computing power doubles every 1.5 - 2 years
  – Network throughput doubles every 2 - 3 years
  – The compute/network gap doubles every 5 years
• HPC in the cloud makes computing power accessible to all (SMEs, research labs, etc.)
  → Fosters innovation
• Our question: can the cloud offer sufficient performance for HPC workloads?
  – CPU: 100% native speed
  – RAM: 99% native speed
  – Network: ???

#mstechdays
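The three doubling figures on this slide can be related with a little arithmetic. The sketch below assumes pure exponential growth for both compute power and network throughput, which is a simplification and not something the slide states; the function name is illustrative. Under that assumption the compute/network ratio doubles every 1 / (1/Tc - 1/Tn) years.

```python
import math

def gap_doubling_time(compute_doubling_years, network_doubling_years):
    """Years for the compute/network ratio to double, assuming exponential growth."""
    rate_gap = 1.0 / compute_doubling_years - 1.0 / network_doubling_years
    if rate_gap <= 0:
        return math.inf  # the network keeps up: the gap never doubles
    return 1.0 / rate_gap

# Midpoints of the ranges quoted on the slide (1.5 - 2 years and 2 - 3 years)
midpoint = gap_doubling_time(1.75, 2.5)
print(f"gap doubles every {midpoint:.1f} years")
```

With the midpoints of the quoted ranges, the result is a little under 6 years, the same order of magnitude as the 5 years given on the slide.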

Introduction

3 ingredients yield an answer through experimentation
• Technology: HPC-oriented cloud
• Use case: HPC software
• Experiments
→ State of the art of HPC in the cloud

Introduction

Experimenting on HPC in the cloud: our approach
Identify technologies and partners
• HPC software use case
• Efficient cloud computing service

Port the HPC application code: cluster → cloud
• Skills improvement
• Feedback on the technologies

Experiment and measure performance
• Scaling
• Data transfers

Introduction

A collaborative project with 3 complementary actors

Consulting firm: organization and technologies
→ HPC Practice: fast/massive information processing for finance and industry
Goals
• Identify the most relevant use cases for our clients
• Estimate the complexity of porting and deploying an app
• Evaluate whether the solution is production-ready

Established HPC research teams: distributed software & big data, machine learning and interactive systems
Goals
• Is the cloud ready for scientific computing?
• Specificities of deploying in the cloud?
• Performance

Windows Azure provides a cloud solution aimed at HPC workloads: Azure Big Compute
Goals
• Pre-release feedback
• Inside view of an HPC cluster → cloud transition

Introduction

Dedicated and competent teams: thank you all!

Consulting: ported and deployed the application in the cloud, led the benchmarks
Research: use case (distributed audio segmentation), analysis of the experiments
Provider: created the technical solution, made significant computational power available

• Constantinos Makassikis, HPC Consultant
• Antoine Poliakov, HPC Consultant
• Wilfried Kirschenmann, HPC Consultant
• Kévin Dehlinger, Computer scientist intern, CNAM
• Stéphane Vialle, Professor, Computer science
• Stéphane Rossignol, Assistant Professor, Signal processing
• Xavier Pillons, Principal Program Manager, Windows Azure CAT

Presentation contents
1. Technical context
2. Feedback on porting the application
3. Optimizations
4. Results

1. TECHNICAL CONTEXT
a. Azure Big Compute
b. ParSon

Azure Big Compute

Azure Big Compute = New Azure nodes + HPC Pack
New nodes: A8 and A9
• 2x8 cores Sandy Bridge E5-2670 @ 2.6 GHz, 112 GB DDR3 @ 1.6 GHz
• InfiniBand (Network Direct @ 40 Gbit/s): RDMA via MS-MPI @ 3.5 Gb/s, 3 µs
• IP over Ethernet @ 10 Gbit/s; HDD 2 TB @ 250 MB/s
• Azure hypervisor

HPC Pack
• Task scheduler middleware: Cluster Manager + SDK
• Tested with 50k cores in Azure
• Free Extension Pack : any Windows Server install can be a node
Azure Big Compute

HPC Pack: on-premise cluster
• Active Directory, manager and nodes in a privately managed infrastructure
• Cluster dimensioned w.r.t. maximal workload
• Administration: hardware + software
[Diagram: Active Directory (AD), manager (M) and compute nodes (N) in a private infrastructure]

Azure Big Compute

HPC Pack: in the Azure Big Compute cloud
• Active Directory and manager in the cloud (VMs)
• Node allocation and pricing on demand
• Admin: software only
[Diagram: Active Directory (AD) and manager (M) as IaaS VMs, compute nodes (N) as PaaS nodes, accessed through remote desktop / CLI]

Azure Big Compute

HPC Pack: hybrid deployment
• Active Directory and manager on premise
• Nodes both in the datacenter and in the cloud
• Local dimensioning w.r.t. average load; dynamic cloud dimensioning absorbs peaks
• Admin: software + hardware
[Diagram: on-premise Active Directory (AD), manager (M) and nodes (N), connected through a VPN to compute nodes (N) in the cloud]

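The hybrid sizing rule (local cluster sized for the average load, cloud nodes absorbing the peaks) can be illustrated with a few lines of arithmetic. This is a minimal sketch: the load figures are invented for illustration, and the function name is hypothetical rather than part of HPC Pack.

```python
import math

def size_hybrid_cluster(load_in_nodes):
    """Size the local cluster for the average load; the cloud covers the rest."""
    local_nodes = math.ceil(sum(load_in_nodes) / len(load_in_nodes))
    # Cloud nodes needed at each time step, on top of the local cluster
    cloud_burst = [max(0, load - local_nodes) for load in load_in_nodes]
    return local_nodes, cloud_burst

# Hypothetical demand, in nodes, over six time slots
hourly_load = [4, 6, 5, 20, 7, 6]
local_nodes, cloud_burst = size_hybrid_cluster(hourly_load)
print(local_nodes, cloud_burst)
```

In this invented example a purely on-premise cluster would need 20 nodes to cover the peak, whereas the hybrid setup keeps 8 nodes locally and rents 12 cloud nodes for a single time slot.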
ParSon

ParSon: scientific software for audio segmentation
• ParSon = audio segmentation algorithm: voice / music
  1. Supervised training on known audio samples to calibrate the classifier
  2. Classification based on spectral analysis (FFT) on sliding windows
[Diagram: digital audio → ParSon → segmentation and classification (voice / music)]

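Spectral analysis on sliding windows can be sketched in a few lines. The example below is not ParSon's code: it uses a naive DFT from the standard library instead of an FFT library, a synthetic signal instead of real audio, and it only reports the dominant frequency bin of each window, whereas a real classifier would use richer features. All function names are illustrative.

```python
import cmath
import math

def sliding_windows(samples, size, hop):
    """Yield successive windows of `size` samples, advancing by `hop` samples."""
    for start in range(0, len(samples) - size + 1, hop):
        yield samples[start:start + size]

def magnitude_spectrum(window):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (an FFT computes the same values faster)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(window)))
        for k in range(n // 2 + 1)
    ]

def dominant_bin(window):
    """Index of the frequency bin with the largest magnitude."""
    spectrum = magnitude_spectrum(window)
    return max(range(len(spectrum)), key=spectrum.__getitem__)

# Synthetic signal: 4 cycles per 32-sample window, then 8 cycles per window
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(64)] + \
         [math.sin(2 * math.pi * 8 * t / 32) for t in range(64)]
peaks = [dominant_bin(w) for w in sliding_windows(signal, size=32, hop=32)]
print(peaks)
```

The change of dominant bin between the second and third windows is the kind of spectral change a segmentation step would look for.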
ParSon

ParSon is distributed with OpenMP + MPI
Workflow on the Linux cluster (data and control flows):
1. Upload input files to the NAS
2. OAR reserves N computers
3. Input deployment on the reserved computers
4. MPI exec
5. Tasks with heavy intercommunication
6. Get outputs

ParSon

Performance is limited by data transfers
[Figure: best runtime (s) versus number of nodes (1 to 256), log scales. Curves: nodes read from the NAS over the network (cold) and nodes read locally. An "IO bound" region is marked.]

2. PORTING THE APPLICATION
a. Porting C++ code: Linux → Windows
b. Porting distribution strategy: Cluster → HPC Cluster Manager
c. Porting and adapting deployment scripts

Porting

Standards conformance = easy Linux → Windows porting
• ParSon and Visual C++ conform to the C++ standard → few code changes
• Dependencies are the standard libraries and cross-platform scientific libraries: libsnd, fftw
• Thanks to MS-MPI, inter-process communication code doesn't change
• Visual Studio natively supports OpenMP
• The only task left was translating the build files: Makefiles → Visual C++ projects

Porting

ParSon in the cluster
Workflow on the Linux cluster (data and control flows):
1. Upload input file to the NAS
2. OAR reserves N computers
3. Input deployment on the reserved computers
4. MPI exec
5. Run and inter-node communication
6. Get output

Porting

ParSon in the Azure cloud
Workflow with PaaS Big Compute (data and control flows):
1. Upload input file to Azure Storage (HPC Pack SDK)
2. HPC Cluster Manager reserves N nodes
3. Input deployment on the provisioned A9 nodes
4. MPI exec
5. Run and inter-node communication
6. Get output
[Diagram: HPC Cluster Manager and AD domain controller (IaaS), provisioned A9 nodes (PaaS), Azure Storage]

Porting

Deployment within Azure
At every software update: package + send to the cloud
1. Send to the manager
  – Either with Azure Storage:
    Set-AzureStorageBlobContent → Get-AzureStorageBlobContent
    hpcpack create ; hpcpack upload → hpcpack download
  – Or with a normal transfer through an internet-accessible file server: FileZilla, etc.
2. Packaging script: mkdir, copy, etc. ; hpcpack create
3. Send to Azure Storage: hpcpack upload
At every node provisioning: local copy
1. Remotely execute on nodes from the manager with clusrun
2. hpcpack download
3. powershell -command "Set-ExecutionPolicy RemoteSigned"
   Invoke-Command -FilePath … -Credential …
   Start-Process powershell -Verb runAs -ArgumentList …
4. Installation: %deployedPath%deployScript.ps1

Porting

This first working setup has some limitations
• Transferring the input file takes longer than the sequential computation on a single thread
• On many cores, computation time is negligible compared to transfers
• WAV format headers and ParSon code limit input size to 4 GB
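The 4 GB limit comes from 32-bit size fields: the RIFF/WAV header stores chunk sizes as 32-bit unsigned integers, and a 32-bit counter in application code wraps around in the same way. The sketch below works through the arithmetic. The CD-quality parameters (44.1 kHz, stereo, 16-bit) are an assumption for illustration, since the talk does not state the audio format used, and the function names are illustrative.

```python
MAX_U32 = 2**32 - 1  # largest value a 32-bit unsigned size field can hold

def max_wav_duration_s(sample_rate_hz, channels, bytes_per_sample):
    """Longest audio duration whose data size still fits in a 32-bit field."""
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return MAX_U32 / bytes_per_second

def wrap_u32(value):
    """Value a 32-bit unsigned counter would hold after overflowing."""
    return value & 0xFFFFFFFF

# Assumed format: 44.1 kHz, 2 channels, 2 bytes per sample
cd_quality_hours = max_wav_duration_s(44_100, 2, 2) / 3600
# A 5 GiB byte offset stored in a 32-bit counter silently becomes 1 GiB
wrapped = wrap_u32(5 * 2**30)
print(f"{cd_quality_hours:.2f} h, wrapped offset = {wrapped}")
```

Under these assumptions a single WAV file holds a little under 7 hours of audio, which is why larger inputs need both another container format and 64-bit counters.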
3. OPTIMIZATIONS

Optimizations

Methodology: remove the bottleneck
The identified bottleneck is the input file transfer
1. Disk write throughput: 300 Mb/s
   → We use a RAMFS
2. Azure Storage access: QoS 1.6 Gb/s
   → Download only once from the storage account, then broadcast through InfiniBand
3. Large input files: 60 GB
   → FLAC c8 lossless compression halves the size and is not limited to 4 GB
   → Declare all counters as 64-bit integers in the C++ code
Optimizations

Accelerating local data access with a RAM filesystem
• RAMFS = filesystem stored in a RAM block
  – Very fast
  – Limited capacity, non-persistent
• ImDisk
  – Lightweight: driver + service + command line
  – Open-source but signed for Win64
• Scripted silent install:
    hpcpack create …
    rundll32 setupapi.dll,InstallHinfSection DefaultInstall 128 disk.inf
    Start-Service -inputobject $(get-service -Name imdisk)
    imdisk.exe -a -t vm -s 30G -m F: -o rw
    format F: /fs:ntfs /x /q /Y
    $acl = Get-Acl F:
    $acl.AddAccessRule(…FileSystemAccessRule("Everyone","Write", …))
    Set-Acl F: $acl
• Run at every node provisioning

Optimizations

Accelerating input file deployment
• All standard transfer systems go through the Ethernet interface
  – Azure Storage access via Azure and HPC Pack SDKs
  – Windows share or CIFS network drive
  – Standard file transfer protocols: FTP, NFS, etc.
• The simplest way to leverage InfiniBand is through MPI
  1. On one node: download the input file: Azure → RAMFS
  2. mpiexec broadcast.exe: 1 process per node
     • We developed a command-line utility in C++ / MPI
     • If id = 0, it reads the RAMFS in 4 MB blocks and sends them to the other nodes through InfiniBand: MPI_Bcast
     • If id ≠ 0, it receives the data blocks and saves them on the RAMFS
     • Uses the Win32 API: faster than standard library abstractions
  3. Input data is in the RAM of all nodes, accessible as a file from the application
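The block-by-block structure of the broadcast utility described above can be sketched without MPI. The example below is a single-process stand-in, not the authors' C++ / MPI tool: the inner loop that hands each block to every receiver takes the place of MPI_Bcast, ordinary temporary files take the place of the RAMFS, and all function and file names are hypothetical.

```python
import os
import tempfile

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB blocks, as on the slide

def read_blocks(path, block_size=BLOCK_SIZE):
    """Root side (id = 0): yield the input file block by block."""
    with open(path, "rb") as src:
        while True:
            block = src.read(block_size)
            if not block:
                break
            yield block

def broadcast_file(src_path, dst_paths, block_size=BLOCK_SIZE):
    """Copy src_path to every destination, one block at a time.

    The inner loop stands in for MPI_Bcast: each block read by the root
    is handed to every receiver, which appends it to its own local file.
    """
    receivers = [open(p, "wb") for p in dst_paths]
    blocks = 0
    try:
        for block in read_blocks(src_path, block_size):
            for dst in receivers:
                dst.write(block)
            blocks += 1
    finally:
        for dst in receivers:
            dst.close()
    return blocks

# Demonstration with a tiny block size so that several blocks are exchanged
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "input.bin")
with open(source, "wb") as f:
    f.write(bytes(range(256)) * 10)  # 2560 bytes of sample data
copies = [os.path.join(workdir, f"node{i}.bin") for i in (1, 2, 3)]
n_blocks = broadcast_file(source, copies, block_size=1000)
print(n_blocks, "blocks sent")
```

Working with fixed-size blocks keeps memory use bounded regardless of the input size, which matters when the file itself already occupies a large share of each node's RAM.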
4. RESULTS

Results

Computations scale well, especially for bigger files
[Figure: computation time scaling (log-log plot); computation time (s, log) versus number of cores (log)]
[Figure: computation efficiency for different input sizes; real speedup / ideal speedup versus number of cores (log)]
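The efficiency plotted on this slide is the ratio of real speedup to ideal speedup. A minimal sketch of that calculation follows; the timings are invented for illustration and are not measurements from the talk, and the function names are illustrative.

```python
def speedup(t_one_core_s, t_n_cores_s):
    """Real speedup: how many times faster the run is than on one core."""
    return t_one_core_s / t_n_cores_s

def efficiency(t_one_core_s, t_n_cores_s, n_cores):
    """Real speedup divided by the ideal speedup, which equals n_cores."""
    return speedup(t_one_core_s, t_n_cores_s) / n_cores

# Hypothetical timings in seconds, keyed by number of cores
timings = {1: 1000.0, 16: 70.0, 256: 8.0}
eff = {n: efficiency(timings[1], t, n) for n, t in timings.items()}
for n, value in eff.items():
    print(f"{n:>3} cores: efficiency {value:.2f}")
```

An efficiency close to 1 means near-ideal scaling; values well below 1 at high core counts indicate that fixed costs such as communication dominate.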

Results

Input file transfer makes global scaling worse
[Figure: time decomposition for an hour of input audio; time (s, log) versus number of cores (log), with raw compute shown separately]
[Figure: efficiency for compute only and including transfers; real speedup / ideal speedup versus number of cores (log)]

Results

Consistent storage throughput (220 Mb/s), latency may be high
Broadcast constant @ 700 Mb/s
[Figure: Azure Storage download performance; download time (min) versus file size (GB)]
[Figure: broadcast time scaling; broadcast time (s, log) versus number of machines]

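With constant throughput, transfer time is simply size divided by throughput. The sketch below applies this to the figures quoted in the talk. It assumes that the slide's "Mb/s" means megabytes per second (the deck writes bit rates as "Gbit/s" elsewhere), which is an assumption, and it ignores latency; the function name is illustrative.

```python
def transfer_time_s(size_gb, throughput_mb_per_s):
    """Transfer time in seconds at constant throughput (1 GB = 1000 MB)."""
    return size_gb * 1000.0 / throughput_mb_per_s

# 60 GB is the large input size quoted on the optimizations slide
download_s = transfer_time_s(60, 220)   # one download from Azure Storage
broadcast_s = transfer_time_s(60, 700)  # one broadcast over InfiniBand
print(f"download {download_s:.0f} s, broadcast {broadcast_s:.0f} s")
```

Under these assumptions the single download dominates the broadcast, which is consistent with downloading only once and letting the broadcast serve all the other nodes.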
5. CONCLUSION

Our feedback on the Big Compute technology
• HPC standards conformance: C++, OpenMP, MPI
  – Ported in 10 work days
• Solid performance
  – Compute: CPU, RAM
  – Network: InfiniBand between nodes
• Reactive support
  – Community, Microsoft
• Intuitive user interface
  – manage.windowsazure.com
  – HPC Cluster Manager
• Everything is scriptable & programmable
• Cloud is more flexible than cluster
• Unified management of cloud and on-premise

• Data transfers
  – Azure storage latency sometimes high
  – Azure storage limited QoS → users must implement multiple account striping
  – HDDs are slow (for HPC), even on A9
• Nodes administration
  – Nodes ↔ Manager transfers must go through Azure storage: less convenient than conventional remote file systems
• Provisioning time must be taken into account (~7 min)

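Multiple account striping, mentioned above as a workaround for the per-account QoS limit, means spreading the blocks of a file over several storage accounts so that transfers can proceed in parallel. The sketch below shows only the placement logic, using a simple round-robin rule; the account names are hypothetical, the function is illustrative, and no Azure API is called.

```python
from itertools import cycle

def stripe_blocks(block_ids, accounts):
    """Assign blocks to storage accounts in round-robin order."""
    placement = {account: [] for account in accounts}
    for block_id, account in zip(block_ids, cycle(accounts)):
        placement[account].append(block_id)
    return placement

accounts = ["account-a", "account-b", "account-c"]  # hypothetical account names
placement = stripe_blocks(range(8), accounts)
for account, blocks in placement.items():
    print(account, blocks)
```

Reassembling the file then requires reading the blocks back in block-id order, so the placement rule has to be known to the reader as well.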
Azure Big Compute for research and business
Predictable, pay-for-what-you-use cost model
Modern design, extensive documentation, efficient support
Decreased need for administration, but still needed on the software side

For research
• Access to compute without any barrier: paperwork, finance, etc.
• Start your workload in minutes
  – For squeezing in a few more before the (extended) deadline for that conference
• Well suited to researchers in distributed computing
  – Parametric experiments

For business
• A supercomputer for all, without investment
• Elastic scaling: on-demand sizing
• Interoperable with Windows clusters
  – Cloud absorbs peaks
  – Best of both worlds
• Datacenters in the EU: Ireland + Netherlands

Thank you for your attention
• Antoine Poliakov: apoliakov@aneo.fr
• Stéphane Vialle: stephane.vialle@supelec.fr
• ANEO: http://aneo.eu, http://blog.aneo.eu
• Meet us at the TechDays! ANEO booth, Thursday 11:30 - 13:00, "Au cœur du SI" > Modern infrastructure with Azure

Thanks
All our thanks to Microsoft for lending us the nodes

Any questions? Don't hesitate!
Innovation Recherche
Digital is
business

Contenu connexe

Tendances

Streaming Sensor Data Slides_Virender
Streaming Sensor Data Slides_VirenderStreaming Sensor Data Slides_Virender
Streaming Sensor Data Slides_Virender
vithakur
 

Tendances (20)

Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache Kafka
 
Capital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream ProcessingCapital One Delivers Risk Insights in Real Time with Stream Processing
Capital One Delivers Risk Insights in Real Time with Stream Processing
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEA
 
Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matter
 
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...
8 Lessons Learned from Using Kafka in 1500 microservices - confluent streamin...
 
Streaming Sensor Data Slides_Virender
Streaming Sensor Data Slides_VirenderStreaming Sensor Data Slides_Virender
Streaming Sensor Data Slides_Virender
 
dotScale 2017 Keynote: The Rise of Real Time by Neha Narkhede
dotScale 2017 Keynote: The Rise of Real Time by Neha NarkhededotScale 2017 Keynote: The Rise of Real Time by Neha Narkhede
dotScale 2017 Keynote: The Rise of Real Time by Neha Narkhede
 
Hands On With Spark: Creating A Fast Data Pipeline With Structured Streaming ...
Hands On With Spark: Creating A Fast Data Pipeline With Structured Streaming ...Hands On With Spark: Creating A Fast Data Pipeline With Structured Streaming ...
Hands On With Spark: Creating A Fast Data Pipeline With Structured Streaming ...
 
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
Event Sourcing, Stream Processing and Serverless (Benjamin Stopford, Confluen...
 
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30...
 
Real-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational DatabasesReal-Time Data Pipelines with Kafka, Spark, and Operational Databases
Real-Time Data Pipelines with Kafka, Spark, and Operational Databases
 
KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!KSQL - Stream Processing simplified!
KSQL - Stream Processing simplified!
 
Kafka for data scientists
Kafka for data scientistsKafka for data scientists
Kafka for data scientists
 
Kafka On YARN (KOYA): An Open Source Initiative to integrate Kafka & YARN
Kafka On YARN (KOYA): An Open Source Initiative to integrate Kafka & YARNKafka On YARN (KOYA): An Open Source Initiative to integrate Kafka & YARN
Kafka On YARN (KOYA): An Open Source Initiative to integrate Kafka & YARN
 
Hadoop made fast - Why Virtual Reality Needed Stream Processing to Survive
Hadoop made fast - Why Virtual Reality Needed Stream Processing to SurviveHadoop made fast - Why Virtual Reality Needed Stream Processing to Survive
Hadoop made fast - Why Virtual Reality Needed Stream Processing to Survive
 
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
Kafka Summit NYC 2017 - Cloud Native Data Streaming Microservices with Spring...
 
How Disney+ uses fast data ubiquity to improve the customer experience
 How Disney+ uses fast data ubiquity to improve the customer experience  How Disney+ uses fast data ubiquity to improve the customer experience
How Disney+ uses fast data ubiquity to improve the customer experience
 
APAC Kafka Summit - Best Of
APAC Kafka Summit - Best Of APAC Kafka Summit - Best Of
APAC Kafka Summit - Best Of
 
Simplify Governance of Streaming Data
Simplify Governance of Streaming Data Simplify Governance of Streaming Data
Simplify Governance of Streaming Data
 

Similaire à Feedback on Big Compute & HPC on Windows Azure

The Computing Continuum.pdf
The Computing Continuum.pdfThe Computing Continuum.pdf
The Computing Continuum.pdf
Förderverein Technische Fakultät
 

Similaire à Feedback on Big Compute & HPC on Windows Azure (20)

Feedback on Big Compute & HPC on Windows Azure
Feedback on Big Compute & HPC on Windows AzureFeedback on Big Compute & HPC on Windows Azure
Feedback on Big Compute & HPC on Windows Azure
 
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
 
uCluster
uClusteruCluster
uCluster
 
Exploring the Programming Models for the LUMI Supercomputer
Exploring the Programming Models for the LUMI Supercomputer Exploring the Programming Models for the LUMI Supercomputer
Exploring the Programming Models for the LUMI Supercomputer
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
Dockerizing Aurea - Docker Con EU 2017
Dockerizing Aurea - Docker Con EU 2017Dockerizing Aurea - Docker Con EU 2017
Dockerizing Aurea - Docker Con EU 2017
 
Building Embedded Linux Full Tutorial for ARM
Building Embedded Linux Full Tutorial for ARMBuilding Embedded Linux Full Tutorial for ARM
Building Embedded Linux Full Tutorial for ARM
 
Introduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AIIntroduction to HPC & Supercomputing in AI
Introduction to HPC & Supercomputing in AI
 
The Computing Continuum.pdf
The Computing Continuum.pdfThe Computing Continuum.pdf
The Computing Continuum.pdf
 
SoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based NetworkingSoC Solutions Enabling Server-Based Networking
SoC Solutions Enabling Server-Based Networking
 
Performance Optimization of SPH Algorithms for Multi/Many-Core Architectures
Performance Optimization of SPH Algorithms for Multi/Many-Core ArchitecturesPerformance Optimization of SPH Algorithms for Multi/Many-Core Architectures
Performance Optimization of SPH Algorithms for Multi/Many-Core Architectures
 
Unleash the Power of Open Networking
Unleash the Power of Open NetworkingUnleash the Power of Open Networking
Unleash the Power of Open Networking
 
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
Software Network Data Plane - Satisfying the need for speed - FD.io - VPP and...
 
Accelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing TechnologiesAccelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing Technologies
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
ERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projectsERTS 2008 - Using Linux for industrial projects
ERTS 2008 - Using Linux for industrial projects
 
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...Seminar Accelerating Business Using Microservices Architecture in Digital Age...
Seminar Accelerating Business Using Microservices Architecture in Digital Age...
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Dernier (20)

Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 

Feedback on Big Compute & HPC on Windows Azure

  • 1.
  • 2. Feedback on Big Compute & HPC on Windows Azure Antoine Poliakov HPC Consultant ANEO apoliakov@aneo.fr http://blog.aneo.eu Innovation Recherche
  • 3. Introduction HPC : a challenge for the cloud • Cloud : on-demand access through a telecommunications network to shared and userconfigurable IT resources • HPC (High Performance Computing) : a branch of computer science conercned with maximizing software efficiency, in particular in terms of execution speed – – – Raw computing power doubles every 1.5 - 2 years Network throughput doubles every 2 - 3 years The compute/network gap doubles every 5 years • HPC in the cloud allows makes computing power accessible to all (SME, research labs, etc.)  Fosters innovation • Our question : can the cloud offer sufficient performances for HPC workloads ? – – – #mstechdays CPU : 100% native speed RAM: 99% native speed Network ??? #3 Innovation Recherche
  • 4. Introduction 3 ingredients yield an answer through experimentation Technology HPC oriented cloud Use-case HPC software Experiments State of the art of HPC in the cloud #mstechdays #4 Innovation Recherche
  • 5. Introduction Experimenting on HPC in the cloud : our approach Identify technologies and partners • HPC software use-case • Efficient cloud computing service Port the applicative HPC code : cluster  cloud • Skills improvements • Feedback on the technologies Experiment and measure performances • Scaling • Data transfers #mstechdays #5 Innovation Recherche
  • 6. Introduction A collaborative project with 3 complementary actors Consulting firm: organization and technologies  HPC Practice: fast/massive information processing for finance and industries Established HPC research teams: Distributed software & big data Machine learning and interactive systems Windows Azure provides a cloud solution aimed at HPC workloads: Azure Big Compute Goals Identify most relevent use-cases for our clients Estimate the complexity of porting and deploying an app Evaluate if the solution is production-ready Goals Is the cloud ready for scientific computing ? Specificities of deploying in the cloud ? Performances Goals Pre-release feedback Inside view of a HPC cluster  cloud transition #mstechdays #6 Innovation Recherche
  • 7. Introduction Dedicated and competant teams: thank you all! Consulting Ported and deployed the application in the cloud Led the benchmarks Constantinos Makassikis HPC Consultant Research Use-case: distributed audio segmentation Experiments analysis Stéphane Vialle Professor, Computer science Antoine Poliakov HPC Consultant Stéphane Rossignol Assistant Professor, Signal processing Wilfried Kirschenmann HPC Consultant Kévin Dehlinger Computer scientist intern CNAM #mstechdays #7 Provider Created the technical solution Made available notable computational power Innovation Recherche Xavier Pillons Principal Program Manager, Windows Azure CAT
  • 8. Presentation contents 1. Technical context 2. Feedback on porting the application 3. Optimizations 4. Results #mstechdays #8 Innovation Recherche
  • 9. 1. TECHNICAL CONTEXT a. Azure Big Compute b. ParSon #mstechdays #9 Innovation Recherche
  • 10. Azure Big Compute Azure Big Compute = New Azure nodes + HPC Pack New nodes: A8 and A9 • • • • 2x8 snb E5-2670 @2.6Ghz, 112Gb DDR3 @1.6Ghz InfiniBand (network direct @40Gbit/s): RDMA via MS-MPI @3.5Gb/s, 3µs IP over Ethernet @10Gbit/s ; HDD 2Tb @250Mo/s Azure hypervisor HPC Pack • Task scheduler middleware: Cluster Manager + SDK • Tested with 50k cores in Azure • Free Extension Pack : any Windows Server install can be a node #mstechdays #10 Innovation Recherche
  • 11. Azure Big Compute HPC Pack : on permise cluster • • #mstechdays N N N N N N N N Administration : hardware + software N N M N Cluster dimensioned w.r.t. maximal workload • AD Active Directory, Manager and nodes in a privately managed infrastructure N #11 Innovation Recherche
  • 12. Azure Big Compute HPC Pack : in the Azure Big Compute cloud • Active Directory and manager in the cloud (VMs) • Nodes allocation and pricing on demand • Admin : software only PaaS nodes IaaS VM Remote desktop/CLI #mstechdays #12 M Innovation Recherche N N N N N AD N N N N N N N
  • 13. Azure Big Compute HPC Pack : hybrid deployment • Active Directory and manager on premise • Nodes both in the datacenter and in the cloud • Local dimensioning w.r.t. average load Dynamic cloud dimensioning: absorbs peaks • Admin: software + hardware N N N N N N N N N N #13 VPN M Innovation Recherche N N N N N AD N N N N #mstechdays N N N N N
  • 14. ParSon
    ParSon: an audio segmentation scientific software
      • ParSon = audio segmentation algorithm: voice / music
        1. Supervised training on known audio samples to calibrate the classifier
        2. Classification based on spectral analysis (FFT) on sliding windows
    [Diagram: digital audio → ParSon → segmentation and classification (voice, music)]
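The second step on this slide, spectral analysis on sliding windows, can be illustrated with a minimal Python sketch. This is not ParSon's code: the window size, the hop, the naive DFT and the helper names (`sliding_windows`, `magnitude_spectrum`, `dominant_bins`) are all assumptions chosen for readability, and a real implementation would rely on an FFT library such as FFTW rather than an O(n²) transform.

```python
import cmath
import math


def sliding_windows(samples, size, hop):
    """Yield consecutive windows of `size` samples, advancing by `hop` samples."""
    for start in range(0, len(samples) - size + 1, hop):
        yield samples[start:start + size]


def magnitude_spectrum(window):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (illustration only)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(window)))
        for k in range(n // 2 + 1)
    ]


def dominant_bins(samples, size=64, hop=32):
    """Return, for each window, the index of the strongest frequency bin."""
    result = []
    for window in sliding_windows(samples, size, hop):
        spectrum = magnitude_spectrum(window)
        result.append(max(range(len(spectrum)), key=spectrum.__getitem__))
    return result


# Hypothetical input: a pure tone with exactly 5 cycles per 64-sample window.
tone = [math.sin(2 * math.pi * 5 * t / 64) for t in range(256)]
print(dominant_bins(tone))
```

A classifier would consume richer per-window features than a single dominant bin; the point here is only the window-then-transform structure.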
  • 15. ParSon
    ParSon is distributed with OpenMP + MPI
    [Diagram: workflow on a Linux cluster, with data and control flows]
      1. Upload input files (NAS)
      2. Reserves N computers (OAR)
      3. Input deployment (reserved computers)
      4. MPI Exec
      5. Tasks with heavy intercommunications
      6. Get outputs
  • 16. ParSon
    Performance is limited by data transfers
    [Figure: best runtime (s, log scale from 8 to 2048) vs. number of nodes (log scale from 1 to 256); curves: nodes read from the NAS over the network, cold cache; nodes read locally, cold cache; annotation: IO bound]
  • 17. 2. PORTING THE APPLICATION a. Porting C++ code: Linux → Windows b. Porting distribution strategy: Cluster → HPC Cluster Manager c. Porting and adapting deployment scripts
  • 18. Porting
    Standards conformance = easy Linux → Windows porting
      • ParSon and Visual C++ conform to the C++ standard → few code changes
      • Dependencies are the standard libraries and cross-platform scientific libraries: libsnd, FFTW
      • Thanks to MS-MPI, the inter-process communication code doesn't change
      • Visual Studio natively supports OpenMP
      • The only task left was translating build files: Makefiles → Visual C++ projects
  • 19. Porting
    ParSon in the cluster
    [Diagram: workflow on a Linux cluster, with data and control flows]
      1. Upload input file (NAS)
      2. Reserves N computers (OAR)
      3. Input deployment (reserved computers)
      4. MPI Exec
      5. Run and inter-com.
      6. Get output
  • 20. Porting
    ParSon in the Azure cloud
    [Diagram: workflow in Azure, with data and control flows; components: HPC Pack SDK, Azure Storage, HPC Cluster Manager, AD domain controller, provisioned A9 nodes (PaaS Big Compute); labels: IaaS, PaaS]
      1. Upload input file (Azure Storage)
      2. Reserves N nodes (HPC Cluster Manager)
      3. Input deployment (provisioned A9 nodes)
      4. MPI Exec
      5. Run and inter-com.
      6. Get output
  • 21. Porting
    Deployment within Azure
    At every software update: package + send to the cloud
      1. Send to manager
        – Either with Azure Storage: Set-AzureStorageBlobContent → Get-AzureStorageBlobContent, or hpcpack create ; hpcpack upload → hpcpack download
        – Or with a normal transfer through an internet-accessible file server: FileZilla, etc.
      2. Packaging script: mkdir, copy, etc. ; hpcpack create
      3. Send to Azure storage: hpcpack upload
    At every node provisioning: local copy
      1. Remotely execute on nodes from the manager with clusrun
      2. hpcpack download
      3. powershell -command "Set-ExecutionPolicy RemoteSigned" ; Invoke-Command -FilePath … -Credential … ; Start-Process powershell -Verb runAs -ArgumentList …
      4. Installation: %deployedPath%deployScript.ps1
  • 22. Porting
    This first working setup has some limitations
      • Transferring the input file takes longer than the sequential computation on a single thread
      • On many cores, computation time is negligible compared to transfers
      • WAV format headers and ParSon code limit input size to 4 GB
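The 4 GB ceiling follows from 32-bit size fields: the WAV (RIFF) header stores sizes as 32-bit unsigned integers, and a 32-bit counter in application code wraps around at the same value. The sketch below is a Python illustration of that arithmetic, not ParSon code; the helper names and the sample format (44.1 kHz, stereo, 16-bit) are assumptions, and the duration is an upper bound that ignores header bytes.

```python
import struct

UINT32_MAX = 2**32 - 1  # largest value a 32-bit unsigned field can hold


def fits_in_wav_size_field(num_bytes):
    """Return True if num_bytes can be packed as a 32-bit little-endian unsigned int."""
    try:
        struct.pack("<I", num_bytes)
        return True
    except struct.error:
        return False


def wrapped_32bit(count):
    """Value an unsigned 32-bit counter would hold after counting `count` items."""
    return count % 2**32


def max_wav_seconds(sample_rate, channels, bytes_per_sample):
    """Whole seconds of audio whose data size still fits the 32-bit field."""
    return UINT32_MAX // (sample_rate * channels * bytes_per_sample)


print(fits_in_wav_size_field(UINT32_MAX))      # fits
print(fits_in_wav_size_field(UINT32_MAX + 1))  # one byte too many
print(max_wav_seconds(44_100, 2, 2))           # assumed 16-bit stereo at 44.1 kHz
print(wrapped_32bit(60 * 10**9))               # a 60e9 count no longer matches
```

Widening every size and offset to 64 bits removes the wrap-around in the application, but the container format needs a size field that can hold the larger value as well.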
  • 24. Optimizations
    Methodology: remove the bottleneck
    The identified bottleneck is the input file transfer
      1. Disk write throughput: 300 Mb/s → we use a RAMFS
      2. Azure Storage access: QoS 1.6 Gb/s → download only once from the storage account, then broadcast through InfiniBand
      3. Large input files: 60 GB → FLAC lossless compression (level 8) halves the size and is not limited to 4 GB → declare all counters as 64-bit integers in the C++ code
  • 25. Optimizations
    Accelerating local data access with a RAM filesystem
      • RAMFS = filesystem stored in a RAM block
        – Very fast
        – Limited capacity, non persistent
      • ImDisk
        – Lightweight: driver + service + command line
        – Open-source but signed for Win64
      • Scripted silent install:
        – hpcpack create …
        – rundll32 setupapi.dll,InstallHinfSection DefaultInstall 128 disk.inf
        – Start-Service -inputobject $(get-service -Name imdisk)
        – imdisk.exe -a -t vm -s 30G -m F: -o rw
        – format F: /fs:ntfs /x /q /Y
        – $acl = Get-Acl F: ; $acl.AddAccessRule(…FileSystemAccessRule("Everyone","Write", …)) ; Set-Acl F: $acl
      • Run at every node provisioning
  • 26. Optimizations
    Accelerating input file deployment
      • All standard transfer systems go through the Ethernet interface
        – Azure Storage access via Azure and HPC Pack SDKs
        – Windows share or CIFS network drive
        – Standard file transfer protocols: FTP, NFS, etc.
      • The simplest way to leverage InfiniBand is through MPI
        1. On one node, download the input file: Azure → RAMFS
        2. mpiexec broadcast.exe: 1 process per node
          • We developed a command line utility in C++ / MPI
          • If id = 0, it reads the RAMFS in 4 MB blocks and sends them to the other nodes through InfiniBand: MPI_Bcast
          • If id ≠ 0, it receives data blocks and saves them on the RAMFS
          • Uses the Win32 API: faster than standard library abstractions
        3. Input data is in the RAM of all nodes, accessible as a file from the application
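The block loop of the broadcast utility can be sketched as follows. This is a single-process Python simulation written for illustration, not the authors' C++ / MPI tool: the function names are hypothetical, the receivers are in-memory streams standing in for the RAMFS files on other nodes, and the inner loop over receivers takes the place of what would be a single `MPI_Bcast` call per block.

```python
import io

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB blocks, as stated on the slide


def read_blocks(stream, block_size=BLOCK_SIZE):
    """Yield successive blocks from a binary stream until it is exhausted."""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield block


def broadcast_file(source, receivers, block_size=BLOCK_SIZE):
    """Read each block once and hand it to every receiver; return the block count.

    In an MPI program the loop over receivers would be one collective call
    (MPI_Bcast) executed by all ranks; here it is simulated in one process.
    """
    blocks_sent = 0
    for block in read_blocks(source, block_size):
        for receiver in receivers:
            receiver.write(block)
        blocks_sent += 1
    return blocks_sent


# Hypothetical data: 10,240 bytes sent in 4,096-byte blocks to three "nodes".
payload = bytes(range(256)) * 40
nodes = [io.BytesIO() for _ in range(3)]
sent = broadcast_file(io.BytesIO(payload), nodes, block_size=4096)
print(sent, all(node.getvalue() == payload for node in nodes))
```

Streaming fixed-size blocks keeps memory use bounded by the block size instead of the file size, which matters when the input runs to tens of gigabytes.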
  • 28. Results
    Computations scale well, especially for bigger files
    [Figure, panel "Computation time scaling (log-log plot)": computation time (sec, log) vs. number of cores (log)]
    [Figure, panel "Computation efficiency for different input sizes": real speedup / ideal speedup vs. number of cores (log)]
  • 29. Results
    Input file transfer makes global scaling worse
    [Figure, panel "Time decomposition, for an hour of input audio": time (sec, log) vs. number of cores (log), including a raw compute series]
    [Figure, panel "Efficiency for compute only and including transfers": real speedup / ideal speedup vs. number of cores (log)]
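The efficiency metric plotted on these slides, real speedup divided by ideal speedup, can be written out in a few lines of Python. The timings below are hypothetical values chosen only to show how a fixed transfer time drags efficiency down; they are not measurements from the presentation.

```python
def speedup(t_one_core, t_parallel):
    """Real speedup: single-core time divided by parallel time."""
    return t_one_core / t_parallel


def efficiency(t_one_core, t_parallel, cores):
    """Real speedup divided by ideal speedup (the number of cores)."""
    return speedup(t_one_core, t_parallel) / cores


# Hypothetical timings in seconds (not taken from the slides).
compute_1_core = 1600.0
compute_16_cores = 110.0
transfer = 50.0  # assumed constant, whatever the number of cores

compute_only = efficiency(compute_1_core, compute_16_cores, 16)
with_transfer = efficiency(compute_1_core + transfer,
                           compute_16_cores + transfer, 16)
print(f"compute only: {compute_only:.2f}, including transfers: {with_transfer:.2f}")
```

Because the transfer term does not shrink as cores are added, it takes a growing share of the total time, which is the effect the slide title describes.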
  • 30. Results
    Consistent storage throughput (220 Mb/s), latency may be high; broadcast constant @ 700 Mb/s
    [Figure, panel "Azure storage download performances": download time (min) vs. file size (Gb)]
    [Figure, panel "Broadcast time scaling": broadcast time (sec, log) vs. number of machines]
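A back-of-the-envelope model shows why downloading once and then broadcasting pays off as the node count grows. The Python sketch below is only a rough model under stated assumptions: file sizes and throughputs are taken to share the same base unit (so 220 Mb/s is 0.22 Gb/s), the storage account QoS is assumed to be split evenly between concurrent downloads, and latency and provisioning time are ignored. The function names are hypothetical; the figures are the ones quoted in this presentation (60 Gb file, 220 Mb/s per download, 1.6 Gb/s account QoS, 700 Mb/s broadcast).

```python
def direct_download_time(size, nodes, per_client, account_qos):
    """Every node downloads the whole file; the account QoS is shared evenly."""
    rate = min(per_client, account_qos / nodes)
    return size / rate


def download_then_broadcast_time(size, per_client, broadcast):
    """One node downloads once, then the file is broadcast to all nodes.

    The broadcast throughput is assumed independent of the node count.
    """
    return size / per_client + size / broadcast


SIZE = 60.0        # file size, Gb
PER_CLIENT = 0.22  # observed storage throughput, Gb/s
ACCOUNT_QOS = 1.6  # storage account limit, Gb/s
BROADCAST = 0.7    # observed broadcast throughput, Gb/s

for n in (1, 8, 64):
    direct = direct_download_time(SIZE, n, PER_CLIENT, ACCOUNT_QOS)
    relayed = download_then_broadcast_time(SIZE, PER_CLIENT, BROADCAST)
    print(f"{n:3d} nodes: direct {direct:7.1f} s, download + broadcast {relayed:7.1f} s")
```

Under this model the direct approach is faster for a handful of nodes and degrades linearly once the shared QoS becomes the limit, while the relayed approach stays flat.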
  • 32. Our feedback on the Big Compute technology
    Strengths
      • HPC standards conformance: C++, OpenMP, MPI
        – Ported in 10 work days
      • Solid performances
        – Compute: CPU, RAM
        – Network: InfiniBand between nodes
      • Reactive support
        – Community, Microsoft
      • Intuitive user interface
        – manage.windowsazure.com
        – HPC Cluster Manager
      • Everything is scriptable & programmable
      • Cloud is more flexible than cluster
      • Unified management of cloud and on-premise
    Weaknesses
      • Data transfers
        – Azure storage latency sometimes high
        – Azure storage limited QoS → users must implement multiple account striping
        – HDDs are slow (for HPC), even on A9
      • Nodes administration
        – Nodes ↔ Manager transfers must go through Azure storage: less convenient than conventional remote file systems
      • Provisioning time must be taken into account (~7 min)
  • 33. Azure Big Compute for research and business
      • Predictable, pay-what-you-use cost model
      • Modern design, extensive documentation, efficient support
      • Decreased need for administration, but still needed on the software side
    For research
      • Access to compute without any barrier: paperwork, finance, etc.
      • Start your workload in minutes
        – For squeezing a few more before the (extended) deadline for that conference
      • Well suited to researchers in distributed computing
        – Parametric experiments
    For business
      • A super computer for all, without investment
      • Elastic scaling: on-demand sizing
      • Interoperable with Windows clusters
        – Cloud absorbs peaks
        – Best of both worlds
      • Datacenters in the EU: Ireland + Netherlands
  • 34. Thank you for your attention
      • Antoine Poliakov apoliakov@aneo.fr
      • Stéphane Vialle stephane.vialle@supelec.fr
      • ANEO http://aneo.eu http://blog.aneo.eu
      • Meet us at TechDays! ANEO booth, Thursday 11:30 - 13:00, At the heart of the IS > Modern infrastructure with Azure
    Thanks: all our thanks to Microsoft for lending us the nodes
    Any questions? Don't hesitate!