2. What is Blue Gene
A massively parallel supercomputer using
tens of thousands of embedded PowerPC
processors, supporting a large memory space,
with standard compilers and a
message-passing environment
3. Why the name “Blue Gene”?
“Blue”: The corporate color of IBM
“Gene”: The intended use of the Blue Gene
clusters – Computational biology, specifically,
protein folding
4. History
Dec ’99: IBM Research announced a US $100M effort
to build a petaflop-scale supercomputer.
Two goals of the Blue Gene project:
– Massively parallel machine architecture and software
– Biomolecular simulation: advance the state of the art by orders of magnitude
November 2001: partnership with Lawrence
Livermore National Laboratory (LLNL),
which resulted in …
6. Blue Gene Projects
Four Blue Gene projects:
– BlueGene/L
– BlueGene/C
– BlueGene/P
– BlueGene/Q
7. Blue Gene/L
The first computer in the Blue Gene
series
Sept. 29, 2004: IBM announced that a Blue Gene/L
prototype had overtaken NEC's Earth Simulator
as the world's fastest computer
Final configuration was launched in
October 2005
8. Blue Gene/L - Unsurpassed Performance
Designed to deliver the most performance
per kilowatt of power consumed
Theoretical peak performance of 360
TFLOPS
Final configuration (Oct. ’05) scored over
280 TFLOPS sustained on the Linpack
benchmark.
Nov 14, ’06, at Supercomputing 2006, Blue
Gene/L won the prize in all
HPC Challenge award classes.
9. Blue Gene/L Architecture
Can be scaled up to 65,536 compute or I/O
nodes, with 131,072 processors
Each node is a single ASIC with associated
DRAM memory chips
Each ASIC has two 700 MHz IBM PowerPC
processors
PowerPC processors
– Low-frequency, low-power embedded processors,
superior in performance per watt to today's high-frequency,
high-power microprocessors by a factor of 2 or more
10. Blue Gene/L Architecture contd…
– Double-pipeline, double-precision floating-point unit (DFPU)
– A cache sub-system with built-in DRAM controller
Node CPUs are not cache coherent with one another
FPUs and CPUs are designed for low power
consumption
– Using transistors with low leakage current
– Local clock gating
– Putting the FPU or CPU/FPU pair to sleep
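The peak figure quoted earlier can be roughly reproduced from the node count, clock rate and FPU design given in these slides. The sketch below is a back-of-the-envelope check, not an official calculation. It assumes each processor's double-pipeline FPU retires one fused multiply-add per pipeline per cycle (4 flops per cycle); that number is an assumption of this example and is not stated in the slides. The function names are made up for illustration.

```python
# Back-of-the-envelope peak performance for a full Blue Gene/L system.
NODES = 65_536           # from the slides: maximum number of nodes
CPUS_PER_NODE = 2        # from the slides: two PowerPC processors per ASIC
CLOCK_HZ = 700_000_000   # from the slides: 700 MHz
FLOPS_PER_CYCLE = 4      # ASSUMPTION: 2 FPU pipelines x 1 fused multiply-add (2 flops)


def peak_tflops(nodes=NODES, cpus_per_node=CPUS_PER_NODE,
                clock_hz=CLOCK_HZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Return the theoretical peak in TFLOPS (1 TFLOPS = 1e12 flop/s)."""
    flops = nodes * cpus_per_node * clock_hz * flops_per_cycle
    return flops / 1e12


def linpack_efficiency(sustained_tflops, peak):
    """Fraction of the theoretical peak reached by a sustained benchmark result."""
    return sustained_tflops / peak


if __name__ == "__main__":
    peak = peak_tflops()
    print(f"peak: {peak:.1f} TFLOPS")
    print(f"Linpack efficiency at 280 TFLOPS: {linpack_efficiency(280, peak):.0%}")
```

Under these assumptions the peak comes out near 367 TFLOPS, in line with the roughly 360 TFLOPS quoted on the performance slide, and the 280 TFLOPS Linpack result corresponds to about three quarters of peak.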
12. Blue Gene/L Architecture contd…
One rack holds 1,024 nodes (2,048 processors)
Nodes optimized for low power consumption
ASIC based on system-on-a-chip technology
– Using large numbers of low-power system-on-a-chip nodes
allows it to outperform commodity clusters while saving
power
– Aggressive packaging of processors, memory and
interconnect
– Power efficient & space efficient
– Allows for latencies and bandwidths that are significantly
better than those of nodes typically used in ASC-scale
supercomputers
13. Blue Gene/L Networks
Each node is attached to 3 main parallel
communication networks
– 3D torus network - peer-to-peer between compute
nodes
– Collective network - collective & global
communication
– Ethernet network - I/O and management (such as
access to any node for configuration, booting and
diagnostics)
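To make the 3D torus idea concrete: every node has six nearest neighbors (one in each direction along x, y and z), and the edges wrap around, so no node sits on a boundary. The sketch below is illustrative only. The 64 x 32 x 32 shape is an assumption chosen to give 65,536 nodes, and the function names are hypothetical; it does not model the real routing hardware.

```python
# ASSUMPTION: a 64 x 32 x 32 torus (65,536 nodes); other shapes work the same way.
DIMS = (64, 32, 32)


def torus_neighbors(coord, dims=DIMS):
    """Return the six nearest neighbors of coord, wrapping around at the edges."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # modulo gives the wraparound
            neighbors.append(tuple(n))
    return neighbors


def torus_hops(a, b, dims=DIMS):
    """Minimum hop count between a and b, going the shorter way around each ring."""
    hops = 0
    for pa, pb, size in zip(a, b, dims):
        d = abs(pa - pb)
        hops += min(d, size - d)
    return hops


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0)))
    print(torus_hops((0, 0, 0), (63, 0, 0)))  # one hop thanks to wraparound
```

Because of the wraparound links, the worst-case distance along a ring of size n is n/2 hops rather than n-1, which is the main attraction of a torus over a plain mesh.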
14. Blue Gene/L System Software
System software supports efficient execution of
parallel applications
Compiler support for DFPU (C, C++, Fortran)
Compute nodes use a minimal operating system
called “BlueGene/L compute node kernel”
– A lightweight, single-user operating system
– Supports execution of a single dual-threaded application
compute process
– Kernel provides a single, static virtual address space to
the one running compute process
– Because of its single-process nature, no context switching
is required
15. Blue Gene/L System Software contd…
To allow multiple programs to run concurrently
– Blue Gene/L system can be partitioned into electronically
isolated sets of nodes
– The number of nodes in a partition must be a positive
integer power of 2
– To run a program, reserve a partition
– No other program can use the partition until the current
program is done
– With so many nodes, component failures are inevitable. The
system is able to electrically isolate faulty hardware to allow
the machine to continue to run
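The power-of-2 rule for partition sizes can be checked with a bit trick. The sketch below is a hypothetical helper, not part of any Blue Gene tool. It reads "positive integer power of 2" as 2^k with k >= 1, so the smallest valid partition in this example is 2 nodes, and it ignores any further constraints the real system may place on partition shapes.

```python
def is_valid_partition_size(n):
    """True if n is 2**k for some integer k >= 1 (ASSUMED reading of the rule)."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
    return n >= 2 and (n & (n - 1)) == 0


def smallest_partition_for(nodes_needed):
    """Smallest valid partition size that holds at least nodes_needed nodes."""
    if nodes_needed < 1:
        raise ValueError("nodes_needed must be at least 1")
    if nodes_needed <= 2:
        return 2
    # Round up to the next power of two.
    return 1 << (nodes_needed - 1).bit_length()


if __name__ == "__main__":
    for request in (1, 3, 512, 1000, 1025):
        print(request, "->", smallest_partition_for(request))
```

A job that needs 1,000 nodes would therefore reserve a 1,024-node partition, which matches the size of one rack.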
16. Blue Gene/L System Software contd…
Parallel Programming model
– Message Passing – supported through an
implementation of MPI
– Only a subset of POSIX calls is supported
– Green threads are also used to simulate local
concurrency
17. Blue Gene/C
Sister-project to BlueGene/L
Renamed to Cyclops64
Massively parallel, supercomputer-on-a-chip
cellular architecture
Cellular architecture gives the programmer
the ability to run large numbers of concurrent
threads within a single processor.
18. Blue Gene/P
Architecturally similar to BlueGene/L
Expected to operate around one petaflop
Expected around 2008
19. Blue Gene/Q
Last known supercomputer in the Blue Gene
series
Expected to reach 3-10 petaflops