RWTH Aachen Compute Cluster is Intel Cluster Ready
science + computing ag
IT services and software for demanding computer networks
Tübingen | München | Berlin | Düsseldorf
Jan Wender
Jan Wender | RWTH Aachen Compute Cluster | 2013-06-19
science+computing at a glance
Founded: 1989
Sites: Tübingen, München, Berlin, Düsseldorf
Employees: 270
Shareholder: Bull S.A. (100%)
Revenue (2010/11): 27 million euros
Partners: Daikin Industries (Japan), NICE srl (Italy), Exa Corporation (USA), Platform Computing (Canada)
Customers of s+c
Map of locations:
Headquarters: Tübingen
Branch offices: Düsseldorf, München, Berlin
Service sites: Frankfurt, Ingolstadt
Other locations: Bremen, Hamburg, Beelen, Duisburg, Aachen, Alzenau, Stuttgart, Mannheim, Wolfsburg, Köln
Deployment Process
Factory Pre-Check
Set up the head node the same way as on site
Install all nodes rack by rack
Run the cluster check on each rack before shipment
On-Site Checks
Reuse the factory checks
Check larger parts of the cluster
Divide and Conquer
64 nodes: 0.58 hours
128 nodes: 0.78 hours
256 nodes: 1.57 hours
320 nodes: 1.92 hours
1700 nodes: ? Way too long!
http://software.intel.com/en-us/articles/intel-cluster-checker-18-execution-time/
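The measured run times above grow roughly linearly between 256 and 320 nodes. As a rough sketch, assuming that this linear trend simply continues (an assumption, not a measurement from the referenced article), the last two data points can be extrapolated to larger clusters. The function name estimate_hours is made up for illustration.

```shell
#!/bin/bash
# Linear extrapolation from the last two measured points on the slide
# (256 nodes: 1.57 h, 320 nodes: 1.92 h). The linear model is an assumption.
estimate_hours() {
    local nodes=$1
    awk -v n="$nodes" 'BEGIN {
        slope = (1.92 - 1.57) / (320 - 256)   # hours per additional node
        printf "%.1f\n", 1.92 + (n - 320) * slope
    }'
}

estimate_hours 1700
```

Under that assumption the estimate for 1700 nodes comes out at roughly 9.5 hours for a single run, which illustrates why checking the whole cluster in one go is impractical.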
Divide and Conquer
http://software.intel.com/en-us/articles/partner-newsletter-Q2-2012-intel-cluster-ready-articles-2/#scaling
Certify sub-clusters
Certify some nodes from all sub-clusters
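The two steps above could be prepared with a pair of small helpers. This is only a sketch: it assumes a plain-text node list with one host name per line and GNU coreutils (for split -d); the function names, the sub-cluster size, and the file names are hypothetical and not taken from the slides or from Intel Cluster Checker.

```shell
#!/bin/bash
# split_nodelist FILE SIZE PREFIX
# Write consecutive chunks of SIZE nodes to PREFIX00, PREFIX01, ...
# (one node list per sub-cluster).
split_nodelist() {
    split -l "$2" -d "$1" "$3"
}

# sample_nodes COUNT FILE...
# Print the first COUNT nodes of every sub-cluster file, giving a
# cross-section node list that spans all sub-clusters.
sample_nodes() {
    local count=$1
    shift
    local f
    for f in "$@"; do
        head -n "$count" "$f"
    done
}

# Example (hypothetical file names):
#   split_nodelist nodelist 256 sub-
#   sample_nodes 8 sub-* > cross-section.nodelist
```

Each sub-cluster file can then be checked on its own, and the cross-section list covers the second step, where some nodes from all sub-clusters are checked together.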
Life is a Batch
http://software.intel.com/en-us/articles/partner-newsletter-Q2-2012-intel-cluster-ready-articles-3/#automating
#!/bin/bash
#SBATCH -J ClusterChecker
#SBATCH -F /etc/intel/clck/nodelist
# Max time in minutes; adjust for larger node counts
#SBATCH -t 10
#SBATCH -p clck
#SBATCH --error=auto-clck.err --output=auto-clck.out
/usr/local/clck.sh
Use the workload management system for user-friendly checking and automation
Do
Check the cluster early
Check the cluster automatically
Use the batch system to coordinate with users' jobs
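One way to automate the check through the batch system is a small wrapper that a scheduler such as cron could call. This is a sketch under assumptions: it presumes Slurm (as in the batch script shown earlier) with the job name ClusterChecker, and the function name submit_check_if_idle and the script path in the example are hypothetical.

```shell
#!/bin/bash
# submit_check_if_idle SCRIPT
# Submit SCRIPT with sbatch unless a job named ClusterChecker is already
# pending or running, so repeated invocations do not pile up in the queue.
submit_check_if_idle() {
    local script=$1
    # -h: no header, -n: filter by job name, -o %i: print job IDs only
    if [ -n "$(squeue -h -n ClusterChecker -o %i)" ]; then
        echo "ClusterChecker job already queued, skipping"
        return 0
    fi
    sbatch "$script"
}

# Example (hypothetical path):
#   submit_check_if_idle /usr/local/auto-clck.sbatch
```

Because the check runs as an ordinary job, the scheduler decides when the nodes are free, and users' jobs are not interrupted.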