Is the cloud relevant for high performance workloads? We answer by sharing our experience: HPC consultants at ANEO have ported and optimized a distributed scientific application developed at Supelec, moving it from a Linux cluster to Microsoft's new cloud technology, Big Compute (nodes interconnected with InfiniBand).
2. Feedback on Big Compute & HPC on Windows Azure
Antoine Poliakov, HPC Consultant, ANEO
apoliakov@aneo.fr
http://blog.aneo.eu
3. Introduction
HPC: a challenge for the cloud
• Cloud: on-demand access, through a telecommunications network, to shared and user-configurable IT resources
• HPC (High Performance Computing): a branch of computer science concerned with maximizing software efficiency, in particular execution speed
– Raw computing power doubles every 1.5-2 years
– Network throughput doubles every 2-3 years
– The compute/network gap doubles every 5 years (quick check below)
• HPC in the cloud makes computing power accessible to all (SMEs, research labs, etc.) and fosters innovation
• Our question: can the cloud offer sufficient performance for HPC workloads?
– CPU: 100% native speed
– RAM: 99% native speed
– Network: ???
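The 5-year gap figure follows from the two doubling times; a quick check, assuming the midpoints of the ranges above:

% Compute grows as 2^{t/1.75}, network as 2^{t/2.5} (midpoint doubling times).
% Their ratio has itself doubled when:
\frac{2^{t/1.75}}{2^{t/2.5}} = 2
\iff t \left(\tfrac{1}{1.75} - \tfrac{1}{2.5}\right) = 1
\iff t = \frac{1}{1/1.75 - 1/2.5} \approx 5.8\ \text{years}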
4. Introduction
Three ingredients yield an answer through experimentation:
• Technology: an HPC-oriented cloud
• Use-case: an HPC software application
• Experiments: a state of the art of HPC in the cloud
5. Introduction
Experimenting on HPC in the cloud: our approach
Identify technologies and partners
• HPC software use-case
• Efficient cloud computing service
Port the HPC application code: cluster → cloud
• Skills improvement
• Feedback on the technologies
Experiment and measure performance
• Scaling
• Data transfers
6. Introduction
A collaborative project with 3 complementary actors

ANEO: consulting firm covering organization and technologies; its HPC practice focuses on fast, massive information processing for finance and industry.
Goals:
• Identify the most relevant use-cases for our clients
• Estimate the complexity of porting and deploying an app
• Evaluate whether the solution is production-ready

Supelec: established HPC research teams covering distributed software & big data, machine learning and interactive systems.
Goals:
• Is the cloud ready for scientific computing?
• What is specific to deploying in the cloud?
• Performance

Microsoft: Windows Azure provides a cloud solution aimed at HPC workloads, Azure Big Compute.
Goals:
• Pre-release feedback
• An inside view of an HPC cluster → cloud transition
7. Introduction
Dedicated and competent teams: thank you all!

Consulting: ported and deployed the application in the cloud; led the benchmarks.
• Constantinos Makassikis, HPC Consultant
• Antoine Poliakov, HPC Consultant
• Wilfried Kirschenmann, HPC Consultant

Research: provided the use-case (distributed audio segmentation); analyzed the experiments.
• Stéphane Vialle, Professor, Computer science
• Stéphane Rossignol, Assistant Professor, Signal processing
• Kévin Dehlinger, Computer science intern, CNAM

Provider: created the technical solution; made available notable computational power.
• Xavier Pillons, Principal Program Manager, Windows Azure CAT
8. Presentation contents
1. Technical context
2. Feedback on porting the application
3. Optimizations
4. Results
10. Azure Big Compute
Azure Big Compute = new Azure nodes + HPC Pack
New nodes: A8 and A9
• 2×8-core Sandy Bridge E5-2670 @ 2.6 GHz, 112 GB DDR3 @ 1.6 GHz
• InfiniBand (Network Direct @ 40 Gbit/s): RDMA via MS-MPI @ 3.5 GB/s, 3 µs latency
• IP over Ethernet @ 10 Gbit/s; 2 TB HDD @ 250 MB/s
• Azure hypervisor
HPC Pack
• Task scheduler middleware: Cluster Manager + SDK
• Tested with 50k cores in Azure
• Free Extension Pack: any Windows Server install can be a node
11. Azure Big Compute
HPC Pack: on-premises cluster
• Active Directory, manager and nodes in a privately managed infrastructure
• Administration: hardware + software
• Cluster dimensioned w.r.t. maximal workload
[Diagram: manager (M), Active Directory (AD) and compute nodes (N), all on premises]
12. Azure Big Compute
HPC Pack: in the Azure Big Compute cloud
• Active Directory and manager in the cloud (IaaS VMs)
• Node allocation and pricing on demand (PaaS nodes)
• Administration: software only
• Access via remote desktop/CLI
[Diagram: manager (M) and Active Directory (AD) as IaaS VMs, compute nodes (N) as PaaS nodes]
13. Azure Big Compute
HPC Pack: hybrid deployment
• Active Directory and manager on premises
• Nodes both in the datacenter and in the cloud
• Local dimensioning w.r.t. average load; dynamic cloud dimensioning absorbs peaks
• Administration: software + hardware
[Diagram: manager (M) and Active Directory (AD) on premises, with local nodes (N) and cloud nodes (N) linked by a VPN]
14. ParSon
ParSon: an audio segmentation scientific application
• ParSon = an audio segmentation algorithm: voice / music
1. Supervised training on known audio samples to calibrate the classifier
2. Classification based on spectral analysis (FFT) over sliding windows (sketched below)
[Diagram: digital audio → ParSon → segmentation and classification into voice and music]
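To make step 2 concrete, here is a minimal sketch of sliding-window spectral analysis with FFTW (a library ParSon depends on). The window size, hop and Hann weighting are illustrative assumptions, not ParSon's actual parameters.

#include <fftw3.h>
#include <cmath>
#include <vector>

// One magnitude spectrum per window of the signal (assumed parameters).
std::vector<std::vector<double>> slidingFFT(const std::vector<double>& signal,
                                            size_t window = 1024, size_t hop = 512) {
    std::vector<double> in(window);
    fftw_complex* out = fftw_alloc_complex(window / 2 + 1);
    fftw_plan plan = fftw_plan_dft_r2c_1d((int)window, in.data(), out, FFTW_ESTIMATE);
    const double PI = 3.14159265358979323846;
    std::vector<std::vector<double>> spectra;
    for (size_t start = 0; start + window <= signal.size(); start += hop) {
        for (size_t i = 0; i < window; ++i)   // Hann window limits spectral leakage
            in[i] = signal[start + i] * 0.5 * (1.0 - std::cos(2.0 * PI * i / (window - 1)));
        fftw_execute(plan);                   // real-to-complex FFT of this window
        std::vector<double> mag(window / 2 + 1);
        for (size_t k = 0; k < mag.size(); ++k)
            mag[k] = std::hypot(out[k][0], out[k][1]);
        spectra.push_back(std::move(mag));    // spectral features feeding the classifier
    }
    fftw_destroy_plan(plan);
    fftw_free(out);
    return spectra;
}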
15. ParSon
ParSon is distributed with OpenMP + MPI
1. Upload input files to the NAS
2. OAR reserves N computers
3. Input deployment to the reserved computers
4. MPI Exec
5. Tasks run, with heavy inter-communication
6. Get outputs
[Diagram: Linux cluster with NAS, OAR scheduler and reserved computers; data and control flows]
16. ParSon
Performance is limited by data transfers
[Plot: best runtime (s) vs. number of nodes, log-log (8-2048 s, 1-256 nodes). When nodes read from the NAS over the network with a cold cache, runtime becomes IO bound; when nodes read locally, scaling is much better.]
17. 2. PORTING THE APPLICATION
a. Porting C++ code: Linux → Windows
b. Porting the distribution strategy: cluster → HPC Cluster Manager
c. Porting and adapting deployment scripts
18. Standards conformance = easy Linux → Windows porting
• ParSon and Visual C++ conform to the C++ standard → few code changes
• Dependencies are the standard libraries and cross-platform scientific libraries: libsnd, fftw
• Thanks to MS-MPI, inter-process communication code doesn't change
• Visual Studio natively supports OpenMP
• The only task left was translating build files: Makefiles → Visual C++ projects
19. Porting
ParSon on the cluster
1. Upload input file to the NAS
2. OAR reserves N computers
3. Input deployment to the reserved computers
4. MPI Exec
5. Run and inter-communicate
6. Get output
[Diagram: Linux cluster with NAS, OAR and reserved computers; data and control flows]
20. Porting
ParSon in the Azure cloud
1. Upload input file to Azure Storage (via the HPC Pack SDK)
2. HPC Cluster Manager reserves N nodes
3. Input deployment to the provisioned A9 nodes
4. MPI Exec
5. Run and inter-communicate
6. Get output
[Diagram: HPC Cluster Manager and AD domain controller as IaaS VMs; provisioned A9 nodes as PaaS Big Compute; Azure Storage in between; data and control flows]
21. Porting
Deployment within Azure
At every software update: package + send to the cloud
1. Send to the manager
– Either with Azure Storage: Set-AzureStorageBlobContent / Get-AzureStorageBlobContent, or hpcpack create; hpcpack upload / hpcpack download
– Or with a normal transfer via an internet-accessible fileserver: FileZilla, etc.
2. Packaging script: mkdir, copy, etc.; hpcpack create
3. Send to Azure storage: hpcpack upload
At every node provisioning: local copy
1. Remote execution on the nodes from the manager with clusrun
2. hpcpack download
3. powershell -command "Set-ExecutionPolicy RemoteSigned"
   Invoke-Command -FilePath … -Credential …
   Start-Process powershell -Verb runAs -ArgumentList …
4. Installation: %deployedPath%deployScript.ps1
22. Porting
This first working setup has some limitations
• Transferring the input file takes longer than the sequential computation on a single thread
• On many cores, computation time is negligible compared to transfers
• WAV format headers and the ParSon code limit input size to 4 GB
24. Optimizations
Methodology: suppress the bottleneck
The identified bottleneck is the input file transfer:
1. Disk write throughput: 300 MB/s
→ We use a RAMFS
2. Azure Storage access: QoS 1.6 Gb/s
→ Download only once from the storage account, then broadcast through InfiniBand (see slide 26)
3. Large input files: 60 GB
→ FLAC level-8 lossless compression halves the size and is not limited to 4 GB
→ Declare all counters as 64-bit ints in the C++ code (see the sketch below)
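A minimal illustration of the counter issue (assumed variable names, not ParSon's code): a 32-bit byte offset wraps past 4 GB, so indexing a 60 GB input needs 64-bit counters.

#include <cstdint>

int main() {
    uint32_t off32 = 0xFFFFFFFFu;         // largest offset a 32-bit counter holds: ~4 GB
    off32 += 1;                           // wraps to 0: bytes past 4 GB are unaddressable

    const int64_t fileSize = 60LL << 30;  // 60 GB input, as on this slide
    const int64_t block    = 4 << 20;     // 4 MB blocks
    int64_t processed = 0;
    for (int64_t off = 0; off < fileSize; off += block)
        processed += block;               // 64-bit offsets cover the whole file
    return processed == fileSize ? 0 : 1;
}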
25. Optimizations
Accelerating local data access with a RAM filesystem
• RAMFS = a filesystem stored in a RAM block
– Very fast
– Limited capacity, non-persistent
• ImDisk
– Lightweight: driver + service + command line
– Open-source, but signed for Win64
• Scripted silent install, run at every node provisioning:
hpcpack create …
rundll32 setupapi.dll,InstallHinfSection DefaultInstall 128 disk.inf
Start-Service -InputObject $(Get-Service -Name imdisk)
imdisk.exe -a -t vm -s 30G -m F: -o rw
format F: /fs:ntfs /x /q /Y
$acl = Get-Acl F:
$acl.AddAccessRule(…FileSystemAccessRule("Everyone","Write", …))
Set-Acl F: $acl
26. Optimizations
Accelerating input file deployment
• All standard transfer systems go through the Ethernet interface
– Azure Storage access via the Azure and HPC Pack SDKs
– Windows share or CIFS network drive
– Standard file transfer protocols: FTP, NFS, etc.
• The simplest way to leverage InfiniBand is through MPI
1. On one node, download the input file: Azure → RAMFS
2. mpiexec broadcast.exe: 1 process per node (a sketch follows below)
– We developed a command-line utility in C++ / MPI
– If id = 0, it reads the RAMFS in 4 MB blocks and sends them to the other nodes through InfiniBand: MPI_Bcast
– If id ≠ 0, it receives the data blocks and saves them to the RAMFS
– It uses the Win32 API: faster than standard library abstractions
3. The input data is in the RAM of all nodes, accessible as a file from the application
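A minimal sketch of such a broadcast utility, under stated assumptions: portable C stdio instead of the Win32 API the real tool uses, and an assumed RAMFS path and block size.

#include <mpi.h>
#include <cstdio>
#include <filesystem>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const char* path = "F:\\input.flac";   // assumed file on the ImDisk RAMFS
    const long long BLOCK = 4LL << 20;     // 4 MB blocks, as described above

    // Rank 0 measures the file; everyone learns the size (64-bit, cf. slide 24).
    long long remaining = 0;
    if (rank == 0) remaining = (long long)std::filesystem::file_size(path);
    MPI_Bcast(&remaining, 1, MPI_LONG_LONG, 0, MPI_COMM_WORLD);

    std::FILE* f = std::fopen(path, rank == 0 ? "rb" : "wb");
    std::vector<char> buf((size_t)BLOCK);
    while (remaining > 0) {                // stream the file block by block
        int n = (int)(remaining < BLOCK ? remaining : BLOCK);
        if (rank == 0) std::fread(buf.data(), 1, n, f);        // root reads a block
        MPI_Bcast(buf.data(), n, MPI_BYTE, 0, MPI_COMM_WORLD); // broadcast over InfiniBand
        if (rank != 0) std::fwrite(buf.data(), 1, n, f);       // others write to their RAMFS
        remaining -= n;
    }
    std::fclose(f);
    MPI_Finalize();
    return 0;
}

Launched with one MPI process per node (step 2 above); once it completes, every node sees the input at the same RAMFS path.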
28. Results
Computations scale well, especially for bigger files
[Plots: computation efficiency (real speedup / ideal speedup) vs. number of cores (log) for different input sizes; computation time scaling, time (s, log) vs. number of cores (log).]
29. Results
Input file transfers make global scaling worse
[Plots: efficiency (real speedup / ideal speedup) vs. number of cores (log), for compute only (+) and including transfers (-); time decomposition for an hour of input audio, raw compute vs. transfers, time (s, log) vs. number of cores (log).]
30. Results
Consistent storage throughput (220 MB/s); latency may be high
[Plots: Azure storage download performance, download time (min) vs. file size (GB); broadcast time scaling, broadcast time (s, log) vs. number of machines, constant at ~700 MB/s.]
32. Our feedback on the Big Compute technology
Strong points
• HPC standards conformance: C++, OpenMP, MPI
– Ported in 10 work days
• Solid performance
– Compute: CPU, RAM
– Network: InfiniBand between nodes
• Reactive support
– Community, Microsoft
• Intuitive user interfaces
– manage.windowsazure.com
– HPC Cluster Manager
• Everything is scriptable & programmable
• The cloud is more flexible than a cluster
• Unified management of cloud and on-premise nodes
Weak points
• Node administration
– Azure storage latency is sometimes high
– Azure storage QoS is limited: users must implement striping across multiple accounts
– HDDs are slow (for HPC), even on A9
• Data transfers
– Node ↔ manager transfers must go through Azure storage: less convenient than conventional remote file systems
• Provisioning time must be taken into account (~7 min)
33. Azure Big Compute for research and business
• Predictable, pay-what-you-use cost model
• Modern design, extensive documentation, efficient support
• Decreased need for administration – but still needed on the software side
For research
• Access to compute without any barrier: paperwork, finance, etc.
• A supercomputer for all, without investment
• Well suited to researchers in distributed computing
– Parametric experiments
– For squeezing out a few more results before the (extended) deadline for that conference
For business
• Elastic scaling: on-demand sizing
– Start your workload in minutes
• Interoperable with Windows clusters
– The cloud absorbs peaks
– Best of both worlds
• Datacenters in the EU: Ireland + Netherlands
34. Thank you for your attention
• Antoine Poliakov – apoliakov@aneo.fr
• Stéphane Vialle – stephane.vialle@supelec.fr
• ANEO – http://aneo.eu – http://blog.aneo.eu
• Meet us at TechDays!
– ANEO booth, Thursday 11:30 - 13:00
– "Au cœur du SI > Infrastructure moderne avec Azure"
All our thanks to Microsoft for lending us the nodes.
A question? Don't hesitate!