Parallel Algorithms

Sorting and more
Keep hardware in mind
• When considering ‘parallel’ algorithms,
  – We need an understanding of the hardware they
    will run on

  – With sequential algorithms, we do this implicitly
Creative use of processing power
• Lots of data = need for speed
• ~20 years of parallel processing
  – Studying how to use multiple processors together
  – Really large and complex computations
  – Parallel processing was an active sub-field of CS


• Since 2005: the era of multicore is here
  – All computers will have >1 processing unit
Traditional Computing Machine
• Von Neumann model:
  – The stored program computer


• What is this?
  – Abstractly, what does it look like?
New twist: multiple control units
• It’s difficult to make the CPU any faster
  – To increase potential speed, add more CPUs
  – These CPUs are called cores


• Abstractly, what might this look like in these
  new machines?
Shared memory model
• Multiple processors can access memory
  locations

• May not scale over time
  – As we increase the number of cores
Other ‘parallel’ configurations
• Clusters of computers
  – A network connects them
Other ‘parallel’ configurations
• Massive data centers
Clusters and data centers
• Distributed memory model
Algorithms
• We will use the term processor for the processing
  unit that executes instructions

• When considering how to design algorithms for
  these architectures
  – Useful to start with a base theoretical model
  – Revise when implementing on different hardware with
    software packages
     • Parallel computing course

  – Also consider:
     • Memory location access by ‘competing’/’cooperating’
       processors
     • Theoretical arrangement of the processors
PRAM model
• Parallel Random Access Machine
• Theoretical

• Abstractly, what does it look like?
• How do processors access memory in this
  PRAM model?
PRAM model
• Why is using the PRAM model useful when
  studying algorithms?
PRAM model
• Processors working in parallel
  – Each trying to access memory values
  – Memory value: what do we mean by this?


• When designing algorithms, we need to
  consider what type of memory access that
  algorithm requires
     • How might our theoretical computer work when many
       reads and writes are happening at the same time?
Designing algorithms
• With many algorithms, we’re moving data around
  – Sorting, for example. What others?

• Concurrent reads by multiple processors
  – Memory not changed, so no ‘conflicts’

• Exclusive writes (EW)
  – Design pseudocode so that any processor is
    exclusively writing a data value into a memory
    location
Designing Algorithms
• Arranging the processors
  – Helpful for design of algorithm
     • We can envision how it works
     • We can envision the data access pattern needed
        – EREW, CREW (CRCW)
  – Not how processors are necessarily arranged in
    practice
     • Although some machines have been


  – What are some possible arrangements?
  – Why might these arrangements prove useful for
    design?
Arrangements
Sorting in Parallel

Emphasis: merge sort
Sequential merge sort
• Recursive
   – Can envision a recursion tree

      function mergesort(m)
          var list left, right
          if length(m) ≤ 1
              return m
          else
              middle = length(m) / 2

              for each x in m up to middle
                  add x to left
              for each x in m after middle
                  add x to right

              left = mergesort(left)
              right = mergesort(right)

              result = merge(left, right)
              return result
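The pseudocode above can be made runnable; this is a minimal sketch (Python chosen for illustration, not from the slides):

```python
def merge(left, right):
    # Merge two already-sorted lists into one sorted list.
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])   # at most one of these two
    result.extend(right[j:])  # still has elements left
    return result

def mergesort(m):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(m) <= 1:
        return m
    middle = len(m) // 2
    left = mergesort(m[:middle])
    right = mergesort(m[middle:])
    return merge(left, right)

print(mergesort([5, 2, 4, 7, 1, 3, 2, 6]))  # prints [1, 2, 2, 3, 4, 5, 6, 7]
```

The two recursive calls are independent, which is exactly what the parallel version on the next slides exploits.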
Parallel merge sort
• Shared data: 2 lists in memory
• Sort pairs once in parallel
• The processes merge concurrently

How might we write the pseudocode?
Parallel merge sort
• Shared data: 2 lists in memory
• Sort pairs once in parallel
• The processes merge concurrently

How might we write the pseudocode?
Numbering of processors starts with 0

   s = 2
   while s <= N
       do in parallel with N/s processors: processor i
           merges values from i*s to (i*s) + s - 1
       s = s * 2
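The rounds of this bottom-up pseudocode can be simulated serially; each round's merges are independent and could each run on its own processor. A sketch (Python for illustration; like the slide, it assumes N is a power of two):

```python
import heapq

def parallel_mergesort(a):
    # Bottom-up merge sort. The inner for-loop stands in for the slide's
    # "do in parallel": the N/s merges in a round touch disjoint segments,
    # so conceptual processors could run them simultaneously (EREW access).
    n = len(a)      # assumed to be a power of two
    s = 2           # segment size doubles each round; s = 2 sorts the pairs
    while s <= n:
        for i in range(n // s):            # one conceptual processor per segment
            lo, mid, hi = i * s, i * s + s // 2, i * s + s
            # merge the two sorted halves of segment [lo, hi)
            a[lo:hi] = list(heapq.merge(a[lo:mid], a[mid:hi]))
        s *= 2
    return a

print(parallel_mergesort([7, 3, 6, 1, 8, 2, 5, 4]))  # prints [1, 2, 3, 4, 5, 6, 7, 8]
```

Note that `s = s * 2` must execute inside the while loop, as above; placed outside, the loop would never terminate.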
Parallel Merge Sort
• Work through pseudocode with larger N

• Processor Arrangement: binary tree
• Memory access: EREW

• What was the more practical implementation?
Let’s try others

Different from sorting
Activity: Sum N integers
• Suppose we have an array of N integers in
  memory
• We wish to sum them
  – Variant: create a running sum in a new array

• Devise a parallel algorithm for this
  – Assume PRAM to start
  – What processor arrangement did you use?
  – What memory access is required?
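One possible answer to the activity, a tree reduction with a binary-tree processor arrangement and EREW access, can be sketched as follows (Python for illustration; the inner loop simulates one parallel round):

```python
def parallel_sum(values):
    # Tree reduction: in each round, conceptual processor i adds the element
    # one "stride" away into its own slot. All additions within a round touch
    # disjoint locations, so they could run simultaneously (EREW access).
    a = list(values)
    n = len(a)
    stride = 1
    while stride < n:
        # "do in parallel" over i = 0, 2*stride, 4*stride, ...
        for i in range(0, n - stride, 2 * stride):
            a[i] += a[i + stride]
        stride *= 2
    # After ceil(log2 n) rounds the total sits at the root, a[0].
    return a[0] if a else 0

print(parallel_sum([3, 1, 4, 1, 5, 9, 2, 6]))  # prints 31
```

With one processor per pair, the sum of N integers takes about log2 N parallel rounds instead of N - 1 sequential additions.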
Next Activity
• Now suppose you need an algorithm for
  multiplying a matrix by a vector


            [Diagram: Matrix A  x  Vector X  =  Result Vector]

• Devise a parallel algorithm for this
   – Assume PRAM to start
      • Think about what each process will compute; there are options
   – What processor arrangement did you use?
   – What memory access is required?
Matrix-Vector Multiplication
•   The matrix is assumed to be M x N. In other words:
     – The matrix has M rows.
     – The matrix has N columns.
     – For example, a 3 x 2 matrix has 3 rows and 2 columns.

•   In matrix-vector multiplication, if the matrix is M x N, then the
    vector must have dimension N.
     – In other words, the vector will have N entries.
     – If the matrix is 3 x 2, then the vector must be 2 dimensional.
     – This is usually stated as saying the matrix and vector must be
       conformable.
• Then, if the matrix and vector are conformable, the product of the
  matrix and the vector is a resultant vector that has dimension M.
     (So, the result could be a different size than the
     original vector!)
     For example, if the matrix is 3 x 2 and the vector is 2
     dimensional, the result of the multiplication is a vector
     of 3 dimensions.
Matrix-Vector Multiplication
• Ways to do a parallel algorithm:
  – One row of matrix per processor
  – One element of matrix per processor
      • There is additional overhead involved. Why?

• What if number of rows M is larger than
  number of processors?

• Emerging theme: how to partition the data
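The row-per-processor partitioning above can be sketched as follows (Python for illustration; the loop over rows marks the work each conceptual processor would do, and the 3 x 2 example matrix is chosen here to match the dimensions discussed on the previous slide):

```python
def matvec_rowwise(A, x):
    # One conceptual processor per row: each computes one dot product
    # independently. All processors read x at the same time (concurrent
    # reads, CREW), but each writes only its own y[i] (exclusive writes).
    M = len(A)
    y = [0] * M
    for i in range(M):   # "do in parallel" over rows
        y[i] = sum(A[i][j] * x[j] for j in range(len(x)))
    return y

A = [[1, 2],
     [3, 4],
     [5, 6]]             # 3 x 2 matrix (M = 3, N = 2)
x = [10, 1]              # 2-dimensional vector
print(matvec_rowwise(A, x))   # prints [12, 34, 56]: a 3-dimensional result
```

If M exceeds the number of physical processors, each processor would take a block of several rows; that is the data-partitioning theme noted above.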
Expand on previous example
• Matrix – Matrix multiplication


            [Diagram: Matrix A  x  Matrix B  =  ?]
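Following the same pattern, matrix-matrix multiplication can assign one conceptual processor per output entry, since each entry C[i][j] is an independent dot product. A serial sketch of that partitioning (Python for illustration; the example matrices are assumptions):

```python
def matmul(A, B):
    # Each C[i][j] is the dot product of row i of A and column j of B.
    # Reads of A and B are concurrent, writes to C are exclusive, so one
    # conceptual processor per entry (or per row of C) works on a PRAM.
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]
    for i in range(M):        # "do in parallel" over output entries (i, j)
        for j in range(N):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(K))
    return C

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # prints [[19, 22], [43, 50]]
```

With M x N entries but fewer processors, the same partitioning question returns: how should the entries (or rows) of C be divided among the processors?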

Editor's notes

  1. CPU = control unit + ALU. The CPU executes instructions. Main memory + cache memory.
  2. It helps us reason about the complexity of an algorithm: understand the best that it may perform.
  3. Memory values = a single word, or more simply an integer, float, or character. EREW, CREW, CRCW.