1b.1
Types of Parallel Computers
Two principal approaches:
• Shared memory multiprocessor
• Distributed memory multicomputer
ITCS 4/5145 Parallel Programming, UNC-Charlotte, B. Wilkinson, 2010. Aug 26, 2010
1b.2
Shared Memory
Multiprocessor
1b.3
Conventional Computer
Consists of a processor executing a program stored in a
(main) memory:
Each main memory location is located by its address.
Addresses start at 0 and extend to 2^b - 1 when there are
b bits (binary digits) in the address.
[Figure: a processor connected to main memory; instructions go to the processor, data goes to or from the processor.]
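As a worked illustration of the address range stated above, the following minimal sketch computes the lowest and highest address for a given number of address bits. The function name `address_range` is hypothetical and chosen only for this example.

```python
def address_range(bits):
    """Return (lowest, highest) address for an address of `bits` binary digits."""
    if bits < 1:
        raise ValueError("an address needs at least one bit")
    # With b bits there are 2**b distinct bit patterns, numbered 0 .. 2**b - 1.
    return 0, 2**bits - 1


for b in (16, 32):
    low, high = address_range(b)
    print(f"{b}-bit address: {low} to {high} ({high + 1} locations)")
```

For example, a 16-bit address covers locations 0 to 65535.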
1b.4
Shared Memory Multiprocessor System
Natural way to extend single processor model - have multiple
processors connected to multiple memory modules, such that
each processor can access any memory module:
[Figure: processors connected through processor-memory interconnections to memory modules that form one address space.]
1b.5
Simplistic view of a small shared memory
multiprocessor
Examples:
• Dual Pentiums
• Quad Pentiums
[Figure: processors and shared memory connected by a bus.]
1b.6
Real computer systems have cache memory between the main
memory and processors: Level 1 (L1) cache and Level 2 (L2) cache.
Example Quad Shared Memory Multiprocessor
[Figure: four processors, each with its own L1 cache, L2 cache, and bus interface, connected by a processor/memory bus to a memory controller and shared memory.]
1b.7
“Recent” innovation
• Dual-core and multi-core processors
• Two or more independent processors in one
package
• Actually an old idea but not put into wide practice
until recently.
• Since L1 cache is usually inside package and L2
cache outside package, dual-/multi-core processors
usually share L2 cache.
1b.8
Single quad core shared memory
multiprocessor
[Figure: one chip holding four processors, each with its own L1 cache, and an L2 cache; the chip connects through a memory controller to shared memory.]
1b.9
Examples
• Intel:
– Core Duo processors -- Two processors in one package
sharing a common L2 Cache. 2005-2006
– Intel Core 2 family dual cores, with quad core from Nov
2006 onwards
– Core i7 processors replacing Core 2 family - Quad core
Nov 2008
– Intel Teraflops Research Chip (Polaris), a 3.16 GHz, 80-
core processor prototype.
• Xbox 360 game console -- triple core PowerPC
microprocessor.
• PlayStation 3 Cell processor -- 9 core design.
References and more information -- Wikipedia
1b.10
Multiple quad-core multiprocessors
(example coit-grid05.uncc.edu)
[Figure: eight processors, each with its own L1 cache, with L2 cache and a possible L3 cache, connected through a memory controller to shared memory.]
1b.11
Programming Shared Memory
Multiprocessors
Several possible ways
1. Thread libraries - programmer decomposes program into
individual parallel sequences (threads), each being able
to access shared variables declared outside the threads.
Example: Pthreads
2. Higher level library functions and preprocessor compiler
directives to declare shared variables and specify
parallelism. Uses threads.
Example: OpenMP - industry standard. Consists of
library functions, compiler directives, and environment
variables - needs an OpenMP compiler
1b.12
3. Use a modified sequential programming language -- added
syntax to declare shared variables and specify parallelism.
Example: UPC (Unified Parallel C) - needs a UPC
compiler.
4. Use a specially designed parallel programming language --
with syntax to express parallelism. Compiler automatically
creates executable code for each processor (not common
now).
5. Use a regular sequential programming language such as C
and ask a parallelizing compiler to convert it into parallel
executable code. Also not common now.
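The thread-library approach (item 1 above) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard `threading` module as a stand-in for Pthreads, and the names `parallel_sum` and `worker` are hypothetical. In CPython the threads do not run the arithmetic truly in parallel, so the sketch shows the program structure (a shared variable declared outside the threads, protected by a lock) rather than a speedup.

```python
import threading


def parallel_sum(values, num_threads=4):
    """Sum `values` using threads that all update one shared variable."""
    total = 0                      # shared variable, declared outside the threads
    lock = threading.Lock()        # protects updates to the shared variable

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)       # local to this thread
        with lock:                 # one thread at a time updates the shared total
            total += partial

    # Decompose the work: thread i takes elements i, i + num_threads, ...
    chunks = [values[i::num_threads] for i in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                   # wait for every thread before reading total
    return total


print(parallel_sum(list(range(1, 101))))  # prints 5050
```

Without the lock, two threads could read and write `total` at the same time and lose an update, which is the kind of error that makes shared-variable programs hard to get right.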
1b.13
Message-Passing Multicomputer
Complete computers connected through an
interconnection network:
[Figure: computers, each with a processor and local memory, exchanging messages through an interconnection network.]
1b.14
Interconnection Networks
Many explored in the 1970s and 1980s
• Limited and exhaustive interconnections
• 2- and 3-dimensional meshes
• Hypercube
• Using Switches:
– Crossbar
– Trees
– Multistage interconnection networks
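The slides only name these topologies. As a hedged illustration, the sketch below uses the standard textbook definitions, which are an assumption here and not taken from the slides: in a hypercube, two nodes are linked when their binary addresses differ in exactly one bit, and in a 2-dimensional mesh without wraparound a node is linked to its up, down, left, and right neighbors. The function names are hypothetical.

```python
def hypercube_neighbors(node, dimensions):
    """Neighbors of `node` in a hypercube with 2**dimensions nodes."""
    if not 0 <= node < 2**dimensions:
        raise ValueError("node address out of range")
    # Flipping one address bit at a time gives one neighbor per dimension.
    return [node ^ (1 << bit) for bit in range(dimensions)]


def mesh_neighbors(row, col, rows, cols):
    """Neighbors of (row, col) in a rows x cols mesh without wraparound links."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


print(hypercube_neighbors(5, 3))   # node 101 -> [4, 7, 1]
print(mesh_neighbors(0, 0, 3, 3))  # corner node -> [(1, 0), (0, 1)]
```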
1b.15
Networked Computers as a
Computing Platform
• A network of computers became a very attractive
alternative to expensive supercomputers and
parallel computer systems for high-performance
computing in the early 1990s.
• Several early projects. Notable:
– Berkeley NOW (network of workstations)
project.
– NASA Beowulf project.
1b.16
Key advantages:
• Very high performance workstations and PCs
readily available at low cost.
• The latest processors can easily be
incorporated into the system as they become
available.
• Existing software can be used or modified.
1b.17
Beowulf Clusters*
• A group of interconnected “commodity”
computers achieving high performance with
low cost.
• Typically using commodity interconnects -
high speed Ethernet, and Linux OS.
* The name Beowulf comes from the NASA Goddard
Space Flight Center cluster project.
1b.18
Cluster Interconnects
• Originally fast Ethernet on low cost clusters
• Gigabit Ethernet - easy upgrade path
More Specialized/Higher Performance
• Myrinet - 2.4 Gbits/sec - disadvantage: single vendor
• cLan
• SCI (Scalable Coherent Interface)
• QNet
• InfiniBand - may be important as InfiniBand
interfaces may be integrated on next generation PCs
1b.19
Dedicated cluster with a master node
and compute nodes
[Figure: a dedicated cluster. The user reaches the master node from an external network; the master node connects through an Ethernet interface and a switch on a local network to the compute nodes (computers).]
1b.20
Software Tools for Clusters
• Based upon message passing programming model
• User-level libraries provided for explicitly specifying
messages to be sent between executing processes on
each computer.
• Use with regular programming languages (C, C++, ...).
• Can be quite difficult to program correctly as we shall
see.
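The message-passing model with a master node and compute nodes can be sketched as below. This is a simulation under stated assumptions, not an MPI program: each "compute node" is a Python thread that touches only the data it receives, and `queue.Queue` objects stand in for the interconnection network. On a real cluster the nodes would be separate processes on separate computers using a message-passing library. The names `master` and `compute_node` are hypothetical.

```python
import queue
import threading


def compute_node(rank, inbox, outbox):
    """Worker: receive a chunk, compute on it locally, send the result back."""
    while True:
        message = inbox.get()            # blocking receive
        if message is None:              # sentinel: the master says stop
            break
        outbox.put((rank, sum(x * x for x in message)))  # send result to master


def master(data, num_workers=3):
    """Master: send one chunk to each worker and combine the replies."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    workers = [
        threading.Thread(target=compute_node, args=(rank, inboxes[rank], results))
        for rank in range(num_workers)
    ]
    for w in workers:
        w.start()
    for rank in range(num_workers):
        inboxes[rank].put(data[rank::num_workers])  # send a copy of one chunk
        inboxes[rank].put(None)                     # then tell the worker to stop
    # Receive exactly one reply per worker; replies may arrive in any order.
    total = sum(results.get()[1] for _ in range(num_workers))
    for w in workers:
        w.join()
    return total


print(master([1, 2, 3, 4, 5, 6]))  # sum of squares, prints 91
```

Every exchange is an explicit send or receive, so a missing or mismatched message leaves a node waiting forever; this is one reason such programs can be hard to get right.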
Next step
• Learn the message passing
programming model, some MPI
routines, write a message-passing
program and test on the cluster.
1b.21
