SlideShare une entreprise Scribd logo
1  sur  18
Télécharger pour lire hors ligne
Scalability comparison: Traditional fork-join-based
parallelism vs. Goroutines
Porting the Barcelona OpenMP Tasks Suite to Go
Artjom Simon
https://github.com/artjomsimon/go-bots
Know Your Gophers
2015-05-12
Traditional approach in C
Cilk:
cilk_spawn task();
[...]
cilk_sync;
OpenMP:
#pragma omp parallel
{
#pragma omp task
[...]
#pragma omp taskwait
[...]
}
Go: Parallel For Loop Pattern1
queue := make(chan int)
done := make(chan bool)
NP := runtime.GOMAXPROCS(0)
go func() {
for i := 0; i < n; i++ { queue <- i }
close(queue)
}()
for i := 0; i < NP; i++ {
go func() {
for i := range queue { work(i) }
done<-true
}()
}
for i := 0; i < NP; i++ { <-done }
1
Benchmarking Usability and Performance of Multicore Languages, PDF:
http://arxiv.org/pdf/1302.2837v2
Barcelona OpenMP Tasks Suite2
2
https://github.com/alcides/bots
...used in academic publications3
3
http://www.sarc-ip.org/files/null/Workshop/1234128788173_
_TSchedStrat-iwomp08.pdf
Micro benchmarks
1 8 16 32 48
1
8
16
32
48
OMP_NUM_THREADS
Speeduprel.toseq.
spc (opteron)
n=1000µs
n=100µs
n=10µs
Figure: Speedup spc (icc), 10 000 Tasks
Task pools: Variations
• notaskpool
Start Goroutines as needed, no limitation, uses WaitGroup for
synchronization
• simple-queue
Buffered channel of func()s holds task queue. n goroutines
receive the func()s and execute them
• goroutines-dispatcher
Dispatcher function, executing tasks in Goroutine only if a
global counter of running goroutines is < n
• const-goroutines
n goroutines remove tasks from a double-linked list
Micro benchmarks
1 8 16 32 48
1
8
16
32
48
OMP_NUM_THREADS
Speeduprel.zusequentiell
spc (opteron)
gcc
icc
clang
go-notaskpool
go-simple-queue
go-const-goroutines
go-goroutine-dispatch
Figure: Speedup spc, n=100µs, 10 000 Tasks
BOTS: nqueens
• N-Queens problem with n=12
• Recursive backtracking search
• No cut-off when creating tasks
Ergebnisse: BOTS (nqueens)
1 8 16 32 48
0
5
10
CPU cores
Speedup nqueens (opteron)
gcc
icc
clang
go-const-goroutines
go-dispatch
go-notaskpool
gccgo-const-goroutines
gccgo-dispatch
gccgo-notaskpool
Figure: Speedup for nqueens -n 12, parallel
BOTS: sparselu
• LU factorization of a sparse block matrix
• 50x50-Matrix, 100x100 sub block matrices
Results: BOTS (sparselu)
1 8 16 32 48
0
10
20
30
CPU cores
Speedup sparselu (opteron)
gcc
icc
clang
go-const-goroutines
go-dispatch
go-notaskpool
go-simplequeue
gccgo-const-goroutines
gccgo-dispatch
gccgo-notaskpool
Figure: Speedup sparselu -n 50 -m 100, parallel
Problem: Recursion (dependencies!)
Memory
opteron
0
0.5
1
1.5
·105
RSS[Kbytes] spc-par 10000 1000, 4 Threads
gcc
icc
clang
go-notaskpool
go-const-goroutines
gccgo-notaskpool
gccgo-const-goroutines
Figure: Memory comparison (Resident Set Size), spc parallel
Side effect: Possible heap corruption bug in Go 1.4?
Questions?
Thank you!
Image credits
Icon N-Queens problem: Colin M.L. Burnett, Wikimedia Commons,
(GFDL & BSD & GPL)
http://commons.wikimedia.org/wiki/File:Chess_d45.svg
(2015-03-09)

Contenu connexe

Tendances

jimmy hacking (at) Microsoft
jimmy hacking (at) Microsoftjimmy hacking (at) Microsoft
jimmy hacking (at) MicrosoftJimmy Schementi
 
Introduction to nand2 tetris
Introduction to nand2 tetrisIntroduction to nand2 tetris
Introduction to nand2 tetrisYodalee
 
Event Loop in Javascript
Event Loop in JavascriptEvent Loop in Javascript
Event Loop in JavascriptDiptiGandhi4
 
function* - ES6, generators, and all that (JSRomandie meetup, February 2014)
function* - ES6, generators, and all that (JSRomandie meetup, February 2014)function* - ES6, generators, and all that (JSRomandie meetup, February 2014)
function* - ES6, generators, and all that (JSRomandie meetup, February 2014)Igalia
 
Rubinius @ RubyAndRails2010
Rubinius @ RubyAndRails2010Rubinius @ RubyAndRails2010
Rubinius @ RubyAndRails2010Dirkjan Bussink
 
なぜ検索しなかったのか
なぜ検索しなかったのかなぜ検索しなかったのか
なぜ検索しなかったのかN Masahiro
 
Internship - Final Presentation (26-08-2015)
Internship - Final Presentation (26-08-2015)Internship - Final Presentation (26-08-2015)
Internship - Final Presentation (26-08-2015)Sean Krail
 
Troubleshooting .net core on linux
Troubleshooting .net core on linuxTroubleshooting .net core on linux
Troubleshooting .net core on linuxPavel Klimiankou
 
Incremental and parallel computation of structural graph summaries for evolvi...
Incremental and parallel computation of structural graph summaries for evolvi...Incremental and parallel computation of structural graph summaries for evolvi...
Incremental and parallel computation of structural graph summaries for evolvi...Till Blume
 
ClojureScript - A functional Lisp for the browser
ClojureScript - A functional Lisp for the browserClojureScript - A functional Lisp for the browser
ClojureScript - A functional Lisp for the browserFalko Riemenschneider
 
Introduction to RevKit
Introduction to RevKitIntroduction to RevKit
Introduction to RevKitMathias Soeken
 
Dummy log generation using poisson sampling
 Dummy log generation using poisson sampling Dummy log generation using poisson sampling
Dummy log generation using poisson samplingKwanghee Choi
 
Compiler presention
Compiler presentionCompiler presention
Compiler presentionFaria Priya
 
Gameboy emulator in rust and web assembly
Gameboy emulator in rust and web assemblyGameboy emulator in rust and web assembly
Gameboy emulator in rust and web assemblyYodalee
 
Why is a[1] fast than a.get(1)
Why is a[1]  fast than a.get(1)Why is a[1]  fast than a.get(1)
Why is a[1] fast than a.get(1)kao kuo-tung
 
[COSCUP 2018] uTensor C++ Code Generator
[COSCUP 2018] uTensor C++ Code Generator[COSCUP 2018] uTensor C++ Code Generator
[COSCUP 2018] uTensor C++ Code GeneratorYin-Chen Liao
 

Tendances (20)

jimmy hacking (at) Microsoft
jimmy hacking (at) Microsoftjimmy hacking (at) Microsoft
jimmy hacking (at) Microsoft
 
Introduction to nand2 tetris
Introduction to nand2 tetrisIntroduction to nand2 tetris
Introduction to nand2 tetris
 
Event Loop in Javascript
Event Loop in JavascriptEvent Loop in Javascript
Event Loop in Javascript
 
function* - ES6, generators, and all that (JSRomandie meetup, February 2014)
function* - ES6, generators, and all that (JSRomandie meetup, February 2014)function* - ES6, generators, and all that (JSRomandie meetup, February 2014)
function* - ES6, generators, and all that (JSRomandie meetup, February 2014)
 
Rubinius @ RubyAndRails2010
Rubinius @ RubyAndRails2010Rubinius @ RubyAndRails2010
Rubinius @ RubyAndRails2010
 
Multi qubit entanglement
Multi qubit entanglementMulti qubit entanglement
Multi qubit entanglement
 
なぜ検索しなかったのか
なぜ検索しなかったのかなぜ検索しなかったのか
なぜ検索しなかったのか
 
Internship - Final Presentation (26-08-2015)
Internship - Final Presentation (26-08-2015)Internship - Final Presentation (26-08-2015)
Internship - Final Presentation (26-08-2015)
 
Troubleshooting .net core on linux
Troubleshooting .net core on linuxTroubleshooting .net core on linux
Troubleshooting .net core on linux
 
Incremental and parallel computation of structural graph summaries for evolvi...
Incremental and parallel computation of structural graph summaries for evolvi...Incremental and parallel computation of structural graph summaries for evolvi...
Incremental and parallel computation of structural graph summaries for evolvi...
 
ClojureScript - A functional Lisp for the browser
ClojureScript - A functional Lisp for the browserClojureScript - A functional Lisp for the browser
ClojureScript - A functional Lisp for the browser
 
MFC Prog
MFC ProgMFC Prog
MFC Prog
 
Introduction to RevKit
Introduction to RevKitIntroduction to RevKit
Introduction to RevKit
 
Dummy log generation using poisson sampling
 Dummy log generation using poisson sampling Dummy log generation using poisson sampling
Dummy log generation using poisson sampling
 
Compiler presention
Compiler presentionCompiler presention
Compiler presention
 
Gameboy emulator in rust and web assembly
Gameboy emulator in rust and web assemblyGameboy emulator in rust and web assembly
Gameboy emulator in rust and web assembly
 
Why is a[1] fast than a.get(1)
Why is a[1]  fast than a.get(1)Why is a[1]  fast than a.get(1)
Why is a[1] fast than a.get(1)
 
[COSCUP 2018] uTensor C++ Code Generator
[COSCUP 2018] uTensor C++ Code Generator[COSCUP 2018] uTensor C++ Code Generator
[COSCUP 2018] uTensor C++ Code Generator
 
Toy Model Overview
Toy Model OverviewToy Model Overview
Toy Model Overview
 
Yacf
YacfYacf
Yacf
 

Similaire à Scalability comparison: Traditional fork-join-based parallelism vs. Goroutines: Porting the Barcelona OpenMP Tasks Suite to Go

Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance PythonIan Ozsvald
 
OmpSs – improving the scalability of OpenMP
OmpSs – improving the scalability of OpenMPOmpSs – improving the scalability of OpenMP
OmpSs – improving the scalability of OpenMPIntel IT Center
 
Global Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the SealGlobal Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the SealTzung-Bi Shih
 
Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkDB Tsai
 
Alpine Spark Implementation - Technical
Alpine Spark Implementation - TechnicalAlpine Spark Implementation - Technical
Alpine Spark Implementation - Technicalalpinedatalabs
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondPatrick Diehl
 
Доклад Антона Поварова "Go in Badoo" с Golang Meetup
Доклад Антона Поварова "Go in Badoo" с Golang MeetupДоклад Антона Поварова "Go in Badoo" с Golang Meetup
Доклад Антона Поварова "Go in Badoo" с Golang MeetupBadoo Development
 
Job Queue in Golang
Job Queue in GolangJob Queue in Golang
Job Queue in GolangBo-Yi Wu
 
Concurrency in go
Concurrency in goConcurrency in go
Concurrency in goborderj
 
The Ring programming language version 1.5.3 book - Part 89 of 184
The Ring programming language version 1.5.3 book - Part 89 of 184The Ring programming language version 1.5.3 book - Part 89 of 184
The Ring programming language version 1.5.3 book - Part 89 of 184Mahmoud Samir Fayed
 
Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015Christian Peel
 
Java 8 - functional features
Java 8 - functional featuresJava 8 - functional features
Java 8 - functional featuresRafal Rybacki
 
Postgres Vision 2018: Making Postgres Even Faster
Postgres Vision 2018: Making Postgres Even FasterPostgres Vision 2018: Making Postgres Even Faster
Postgres Vision 2018: Making Postgres Even FasterEDB
 
Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;
Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;
Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;Tzung-Bi Shih
 
Ruby Functional Programming
Ruby Functional ProgrammingRuby Functional Programming
Ruby Functional ProgrammingGeison Goes
 
Multiprocessing with python
Multiprocessing with pythonMultiprocessing with python
Multiprocessing with pythonPatrick Vergain
 

Similaire à Scalability comparison: Traditional fork-join-based parallelism vs. Goroutines: Porting the Barcelona OpenMP Tasks Suite to Go (20)

Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
 
OmpSs – improving the scalability of OpenMP
OmpSs – improving the scalability of OpenMPOmpSs – improving the scalability of OpenMP
OmpSs – improving the scalability of OpenMP
 
Global Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the SealGlobal Interpreter Lock: Episode I - Break the Seal
Global Interpreter Lock: Episode I - Break the Seal
 
Multinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache SparkMultinomial Logistic Regression with Apache Spark
Multinomial Logistic Regression with Apache Spark
 
Alpine Spark Implementation - Technical
Alpine Spark Implementation - TechnicalAlpine Spark Implementation - Technical
Alpine Spark Implementation - Technical
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff Hammond
 
Доклад Антона Поварова "Go in Badoo" с Golang Meetup
Доклад Антона Поварова "Go in Badoo" с Golang MeetupДоклад Антона Поварова "Go in Badoo" с Golang Meetup
Доклад Антона Поварова "Go in Badoo" с Golang Meetup
 
CS4961-L9.ppt
CS4961-L9.pptCS4961-L9.ppt
CS4961-L9.ppt
 
Job Queue in Golang
Job Queue in GolangJob Queue in Golang
Job Queue in Golang
 
Functional go
Functional goFunctional go
Functional go
 
Concurrency in go
Concurrency in goConcurrency in go
Concurrency in go
 
Matlab ppt
Matlab pptMatlab ppt
Matlab ppt
 
The Ring programming language version 1.5.3 book - Part 89 of 184
The Ring programming language version 1.5.3 book - Part 89 of 184The Ring programming language version 1.5.3 book - Part 89 of 184
The Ring programming language version 1.5.3 book - Part 89 of 184
 
Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015Ehsan parallel accelerator-dec2015
Ehsan parallel accelerator-dec2015
 
Java 8 - functional features
Java 8 - functional featuresJava 8 - functional features
Java 8 - functional features
 
Postgres Vision 2018: Making Postgres Even Faster
Postgres Vision 2018: Making Postgres Even FasterPostgres Vision 2018: Making Postgres Even Faster
Postgres Vision 2018: Making Postgres Even Faster
 
Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;
Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;
Global Interpreter Lock: Episode III - cat &lt; /dev/zero > GIL;
 
Ruby Functional Programming
Ruby Functional ProgrammingRuby Functional Programming
Ruby Functional Programming
 
Multiprocessing with python
Multiprocessing with pythonMultiprocessing with python
Multiprocessing with python
 
Profiling in Python
Profiling in PythonProfiling in Python
Profiling in Python
 

Dernier

Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGSIVASHANKAR N
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 

Dernier (20)

Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 

Scalability comparison: Traditional fork-join-based parallelism vs. Goroutines: Porting the Barcelona OpenMP Tasks Suite to Go

  • 1. Scalability comparison: Traditional fork-join-based parallelism vs. Goroutines Porting the Barcelona OpenMP Tasks Suite to Go Artjom Simon https://github.com/artjomsimon/go-bots Know Your Gophers 2015-05-12
  • 2. Traditional approach in C Cilk: cilk_spawn task(); [...] cilk_sync; OpenMP: #pragma omp parallel { #pragma omp task [...] #pragma omp taskwait [...] }
  • 3. Go: Parallel For Loop Pattern1 queue := make(chan int) done := make(chan bool) NP := runtime.GOMAXPROCS(0) go func() { for i := 0; i < n; i++ { queue <- i } close(queue) }() for i := 0; i < NP; i++ { go func() { for i := range queue { work(i) } done<-true }() } for i := 0; i < NP; i++ { <-done } 1 Benchmarking Usability and Performance of Multicore Languages, PDF: http://arxiv.org/pdf/1302.2837v2
  • 4. Barcelona OpenMP Tasks Suite2 2 https://github.com/alcides/bots
  • 5. ...used in academic publications3 3 http://www.sarc-ip.org/files/null/Workshop/1234128788173_ _TSchedStrat-iwomp08.pdf
  • 6. Micro benchmarks 1 8 16 32 48 1 8 16 32 48 OMP_NUM_THREADS Speeduprel.toseq. spc (opteron) n=1000µs n=100µs n=10µs Figure: Speedup spc (icc), 10 000 Tasks
  • 7. Task pools: Variations • notaskpool Start Goroutines as needed, no limitation, uses WaitGroup for synchronization • simple-queue Buffered channel of func()s holds task queue. n goroutines receive the func()s and execute them • goroutines-dispatcher Dispatcher function, executing tasks in Goroutine only if a global counter of running goroutines is < n • const-goroutines n goroutines remove tasks from a double-linked list
  • 8. Micro benchmarks 1 8 16 32 48 1 8 16 32 48 OMP_NUM_THREADS Speeduprel.zusequentiell spc (opteron) gcc icc clang go-notaskpool go-simple-queue go-const-goroutines go-goroutine-dispatch Figure: Speedup spc, n=100µs, 10 000 Tasks
  • 9. BOTS: nqueens • N-Queens problem with n=12 • Recursive backtracking search • No cut-off when creating tasks
  • 10. Ergebnisse: BOTS (nqueens) 1 8 16 32 48 0 5 10 CPU cores Speedup nqueens (opteron) gcc icc clang go-const-goroutines go-dispatch go-notaskpool gccgo-const-goroutines gccgo-dispatch gccgo-notaskpool Figure: Speedup for nqueens -n 12, parallel
  • 11. BOTS: sparselu • LU factorization of a sparse block matrix • 50x50-Matrix, 100x100 sub block matrices
  • 12. Results: BOTS (sparselu) 1 8 16 32 48 0 10 20 30 CPU cores Speedup sparselu (opteron) gcc icc clang go-const-goroutines go-dispatch go-notaskpool go-simplequeue gccgo-const-goroutines gccgo-dispatch gccgo-notaskpool Figure: Speedup sparselu -n 50 -m 100, parallel
  • 14. Memory opteron 0 0.5 1 1.5 ·105 RSS[Kbytes] spc-par 10000 1000, 4 Threads gcc icc clang go-notaskpool go-const-goroutines gccgo-notaskpool gccgo-const-goroutines Figure: Memory comparison (Resident Set Size), spc parallel
  • 15. Side effect: Possible heap corruption bug in Go 1.4?
  • 18. Image credits Icon N-Queens problem: Colin M.L. Burnett, Wikimedia Commons, (GFDL & BSD & GPL) http://commons.wikimedia.org/wiki/File:Chess_d45.svg (2015-03-09)