Scalability comparison: Traditional fork-join-based parallelism vs. Goroutines: Porting the Barcelona OpenMP Tasks Suite to Go
Artjom Simon
https://github.com/artjomsimon/go-bots
Know Your Gophers
2015-05-12
2. Traditional approach in C
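Cilk:
    cilk_spawn task();   /* task() may run in parallel with the code that follows */
    [...]
    cilk_sync;           /* wait for all tasks spawned by this function */
OpenMP:
    #pragma omp parallel         /* a team of threads executes this block */
    {
        #pragma omp task         /* the next statement becomes a task */
        [...]
        #pragma omp taskwait     /* wait for the child tasks of the current task */
        [...]
    }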
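For comparison, the same spawn/sync structure can be written in Go with goroutines and a sync.WaitGroup. The following is only a minimal sketch, not code from the go-bots repository: the recursive Fibonacci function is chosen here purely for illustration, and the name fib is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// fib computes the n-th Fibonacci number in fork-join style.
// The goroutine plays the role of cilk_spawn / "#pragma omp task";
// wg.Wait() plays the role of cilk_sync / "#pragma omp taskwait".
// Illustrative only: it spawns one goroutine per call with n >= 2.
func fib(n int) int {
	if n < 2 {
		return n
	}
	var x, y int
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // "spawn": compute fib(n-1) concurrently
		defer wg.Done()
		x = fib(n - 1)
	}()
	y = fib(n - 2) // the parent continues with the other branch
	wg.Wait()      // "sync": x is safe to read after this point
	return x + y
}

func main() {
	fmt.Println(fib(20)) // prints 6765
}
```

Unlike Cilk or OpenMP, nothing here limits how many goroutines exist at once; the Go scheduler multiplexes them onto the available threads.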
3. Go: Parallel For Loop Pattern [1]
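queue := make(chan int)      // unbuffered channel of loop indices
done := make(chan bool)      // one completion signal per worker
NP := runtime.GOMAXPROCS(0)  // an argument of 0 only queries the current setting
// Producer: send all n indices, then close the channel.
go func() {
    for i := 0; i < n; i++ { queue <- i }
    close(queue)
}()
// Start NP workers; each drains the queue until it is closed.
for i := 0; i < NP; i++ {
    go func() {
        for i := range queue { work(i) }
        done <- true
    }()
}
// Wait until every worker has signalled completion.
for i := 0; i < NP; i++ { <-done }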
[1] Benchmarking Usability and Performance of Multicore Languages, PDF: http://arxiv.org/pdf/1302.2837v2
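The pattern on this slide leaves n and work undefined. One way to make it reusable is to wrap it in a function that takes the loop body as a parameter. This is a sketch under assumptions: the helper name parallelFor and the squares example are hypothetical and do not come from the cited paper or the go-bots repository.

```go
package main

import (
	"fmt"
	"runtime"
)

// parallelFor (hypothetical helper) calls body(i) for every i in [0, n),
// distributing the indices over GOMAXPROCS worker goroutines.
func parallelFor(n int, body func(i int)) {
	queue := make(chan int)
	done := make(chan bool)
	np := runtime.GOMAXPROCS(0)

	// Producer: feed all indices, then close the channel.
	go func() {
		for i := 0; i < n; i++ {
			queue <- i
		}
		close(queue)
	}()

	// Workers: pull indices until the channel is closed.
	for w := 0; w < np; w++ {
		go func() {
			for i := range queue {
				body(i)
			}
			done <- true
		}()
	}

	// Join: one receive per worker.
	for w := 0; w < np; w++ {
		<-done
	}
}

func main() {
	squares := make([]int, 8)
	// Each iteration writes to its own slice element, so no lock is needed.
	parallelFor(len(squares), func(i int) { squares[i] = i * i })
	fmt.Println(squares) // prints [0 1 4 9 16 25 36 49]
}
```

Because the channel is unbuffered, every index costs one channel handoff; for very cheap loop bodies, sending chunks of indices instead would likely reduce that overhead.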
7. Task pools: Variations
• notaskpool
Starts goroutines as needed, without any limit; uses sync.WaitGroup for
synchronization
• simple-queue
A buffered channel of func() values serves as the task queue; n goroutines
receive the functions and execute them
• goroutines-dispatcher
A dispatcher function runs a task in a new goroutine only if a
global counter of running goroutines is below n
• const-goroutines
n goroutines remove tasks from a doubly linked list