14. Examples
• Thread APIs: concurrency
• Actor APIs: concurrency
• Native thread, process: parallelism
• If the underlying system supports it
• SIMD, GPU, vector operations: parallelism
18. You Need Both
• Work that can split into concurrent jobs
• Platform that runs those jobs in parallel
• In an ideal world, scales with job count
• In our world, each job adds overhead
19. Process-level
Concurrency
• Separate processes running concurrently
• As parallel as OS/CPU can make them
• Low risk due to isolated memory space
• High memory requirements
• High communication overhead
20. Thread-level
Concurrency
• Threads in-process running concurrently
• As parallel as OS/CPU can make them
• Higher risk due to shared memory space
• Lower memory requirements
• Low communication overhead
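The memory trade-off between the two models can be seen directly. A minimal sketch (assuming a Unix platform, since `fork` is not available on Windows): a thread mutates shared state in place, while a forked process mutates its own copy and must communicate the result explicitly.

```ruby
counter = { n: 0 }

# Threads share the parent's heap: this mutation is visible afterward.
Thread.new { counter[:n] += 1 }.join

# A forked process gets an isolated copy: its mutation stays in the
# child, so results must be sent back explicitly (here, over a pipe).
reader, writer = IO.pipe
fork do
  reader.close
  counter[:n] += 100
  writer.puts counter[:n]
  writer.close
end
writer.close
from_child = reader.read.to_i
Process.wait

counter[:n]  # still 1 in the parent; the child saw 101
```

The pipe is the "high communication overhead" from the slide made concrete: every cross-process value has to be serialized and copied.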
21. Popular Platforms
Platform | Concurrency | Parallelism | GC | Notes
MRI 1.8.7 | ✔ | ✘ | Single thread, stop-the-world | Large C core would need much work
MRI 1.9+ | ✔ | ✘ | Single thread, stop-the-world | Few changes since 1.9.3
JRuby (JVM) | ✔ | ✔ | Many concurrent and parallel options | JVM is the “best” platform for conc
Rubinius | ✔ | ✔ | Single thread, stop-the-world, partial concurrent old gen | Promising, but a long road ahead
Topaz | ✘ | ✘ | Single thread, stop-the-world | Incomplete impl
Node.js (V8) | ✘ | ✘ | Single thread, stop-the-world | No threads in JS
CPython | ✔ | ✘ | Reference-counting | Reference counting kills parallelism
PyPy | ✔ | ✘ | Single thread, stop-the-world | Exploring STM to enable concurrency
26. Timeslicing
[Diagram: Threads 1–4 time-sliced onto a single native thread, with “Time’s up” marking each preemption point.]
“Green” or “virtual” or “userspace” threads share a single native thread. The OS then schedules that native thread on available CPUs.
29. GVL: Global VM Lock
[Diagram: Threads 1–4, each on its own native thread and CPU, with “Lock xfer” marking each hand-off of the global lock.]
In 1.9+, each thread gets its own native thread, but a global lock prevents parallel execution of Ruby code. Time slices are finer grained and variable, but threads still can’t run in parallel.
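The GVL's effect is easy to observe with CPU-bound work. A hedged sketch (`fib` is just a hypothetical workload; actual timings vary by machine, so none are asserted here): on MRI the threaded version takes roughly as long as the serial one, while on JRuby it can approach a 4x speedup on four cores.

```ruby
require 'benchmark'

# Naive CPU-bound workload, chosen only because it burns CPU.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

serial = Benchmark.realtime { 4.times { fib(20) } }

threaded = Benchmark.realtime do
  4.times.map { Thread.new { fib(20) } }.each(&:join)
end

# On MRI 1.9+, serial and threaded come out roughly equal: the GVL lets
# only one thread execute Ruby code at a time.
```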
34. Why Do We See
Parallelism?
• Hotspot JVM has many background threads
• GC with concurrent and parallel options
• JIT threads
• Signal handling
• Monitoring and management
38. Rules of Concurrency
1. Don’t do it, if you don’t have to.
2. If you must do it, don’t share data.
3. If you must share data, make it immutable.
4. If it must be mutable, coordinate all access.
39. #1: Don’t
• Many problems won’t benefit
• Explicitly sequential things, e.g
• Bad code can get worse
• Multiply perf, GC, alloc overhead by N
• Fixes may not be easy (esp. in Ruby)
• The risks can get tricky to address
44. I’m Not Perfect
• Wrote a naive algorithm
• Measured it taking N seconds
• Wrote the concurrent version
• Measured it taking roughly N seconds
• Returned to original to optimize
47. Before Conc Work
• Fix excessive allocation (and GC)
• Fix algorithmic complexity
• Test on the runtime you want to target
• If serial perf is still poor after optimization,
the task, runtime, or system may not be
appropriate for a concurrent version.
49. #2: Don’t Share Data
• Process-level concurrency
• …have to sync up eventually, though
• Threads with their own data objects
• Rails request objects, e.g.
• APIs with a “master” object, usually
• Weakest form of concurrency
50. #3: Immutable Data
• In other words…
• Data can be shared
• Threads can pass it around safely
• Cross-thread view of data can’t mutate
• Threads can’t see concurrent mutations as
they happen, avoiding data races
51. Object#freeze
• Simplest mechanism for immutability
• For read-only: make changes, freeze
• Read-mostly: dup, change, freeze, replace
• Write-mostly: same, but O(n) complexity
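The read-mostly pattern above can be sketched like this (note that `freeze` is shallow, so nested objects need freezing too):

```ruby
config = { host: "example.com", port: 80 }.freeze

# Read-mostly update: dup the frozen hash, change the copy, freeze it,
# then publish the new reference. Readers holding the old reference
# never observe a half-edited hash.
updated = config.dup
updated[:port] = 8080
updated.freeze

config[:port]   # => 80   (old snapshot unchanged)
updated[:port]  # => 8080
updated.frozen? # => true
```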
52. Immutable Data
Structure
• Designed to avoid visible mutation but still
have good performance characteristics
• Copy-on-write is poor-man’s IDS
• Better: persistent data structures like Ctrie
http://en.wikipedia.org/wiki/Ctrie
53. Persistent?
• Collection you have a reference to is
guaranteed never to change
• Modifications return a new reference
• …and only duplicate affected part of trie
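These semantics can be sketched in plain Ruby with a toy path-copying update (this is not a real trie, just an illustration; `assoc` is a hypothetical helper named after the Clojure operation): only the hashes on the path to the change are copied, and untouched branches are shared by reference.

```ruby
# Return a new frozen hash with one key replaced; the original is untouched.
def assoc(hash, key, value)
  hash.merge(key => value).freeze
end

settings = { theme: "dark" }.freeze
v1 = { user: "ada", settings: settings }.freeze
v2 = assoc(v1, :user, "grace")

v1[:user]                       # => "ada"   (your reference never changes)
v2[:user]                       # => "grace" (modification returned a new reference)
v2[:settings].equal?(settings)  # => true    (unaffected branch is shared, not copied)
```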
54. Hamster
• Pure-Ruby persistent data structures
• Set, List, Stack, Queue, Vector, Hash
• Modeled on Clojure’s persistent collections
• https://github.com/hamstergem/hamster
56. Coming Soon
• Reimplementation by Smit Shah
• Mostly “native” impl of Ctrie
• Considerably better perf than Hamster
• https://github.com/Who828/persistent_data_structures
57. Other Techniques
• Known-immutable data like Symbol, Fixnum
• Mutate for a while, then freeze
• Hand-off: if you pass mutable data, assume
you can’t mutate it anymore
• Sometimes enforced by runtime, e.g.
“thread-owned objects”
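The hand-off technique can be sketched with a `Queue` (the message contents are made up for illustration). Nothing here is enforced by MRI; the safety comes entirely from the convention that the producer stops touching the object once it is passed.

```ruby
queue = Queue.new

producer = Thread.new do
  msg = { id: 1, body: "hello" }
  queue << msg
  # Hand-off convention: having passed msg, this thread treats it as
  # off-limits and never mutates it again.
end

consumer = Thread.new { queue.pop }

producer.join
received = consumer.value
received[:body]  # => "hello"
```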
58. #4: Synchronize Mutation
• Trickiest to get right; usually best perf
• Fully-immutable generates lots of garbage
• Locks, atomics, and specialized collections
59. Locks
• Avoid concurrent operations
• Read + write, in general
• Many varieties: reentrant, read/write
• Many implementations
60. Mutex
• Simplest form of lock
• Acquire, do work, release
• Not reentrant
semaphore = Mutex.new
...
a = Thread.new {
  semaphore.synchronize {
    # access shared resource
  }
}
61. ConditionVariable
• Release mutex temporarily
• Signal others waiting on the mutex
• …and be signaled
• Similar to wait/notify/notifyAll in Java
62. mutex = Mutex.new
resource = ConditionVariable.new

a = Thread.new {
  mutex.synchronize {
    # Thread 'a' now needs the resource
    resource.wait(mutex)
    # 'a' can now have the resource
  }
}

b = Thread.new {
  mutex.synchronize {
    # Thread 'b' has finished using the resource
    resource.signal
  }
}
66. Atomics
• Without locking…
• …replace a value only if unchanged
• …increment, decrement safely
• Thread-safe code can use atomics instead
of locks, usually with better performance
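A compare-and-set can be sketched in plain Ruby to show the contract (`AtomicRef` is a hypothetical name; this version uses a Mutex internally only to illustrate the semantics — real atomics, such as the `atomic` gem or `java.util.concurrent` classes on JRuby, do this with lock-free CPU instructions):

```ruby
class AtomicRef
  def initialize(value)
    @value = value
    @lock  = Mutex.new
  end

  def get
    @lock.synchronize { @value }
  end

  # Replace the value only if it hasn't changed since we read it.
  def compare_and_set(expected, new_value)
    @lock.synchronize do
      return false unless @value.equal?(expected)
      @value = new_value
      true
    end
  end
end

counter = AtomicRef.new(0)

# CAS-style increment: read, attempt the swap, retry on contention.
10.times.map {
  Thread.new do
    loop do
      current = counter.get
      break if counter.compare_and_set(current, current + 1)
    end
  end
}.each(&:join)

counter.get  # => 10, with no increment lost
```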
72. threads = thread_count.times.map do |i|
  Thread.new do
    while true
      words = queue.pop
      if words.nil? # terminating condition
        queue.shutdown
        break
      end
      words.each do |word|
        # analyze the word
      end
    end
  end
end
73. Putting It All Together
• These are a lot of tools to sort out
• Others have sorted them out for you
74. Celluloid
• Actor model implementation
• OO/Ruby sensibilities
• Normal classes, normal method calls
• Async support
• Growing ecosystem
• Celluloid-IO and DCell (distributed actors)
• https://github.com/celluloid/celluloid
75. class Sheen
  include Celluloid

  def initialize(name)
    @name = name
  end

  def set_status(status)
    @status = status
  end

  def report
    "#{@name} is #{@status}"
  end
end
77. Sidekiq
• Simple, efficient background processing
• Think Resque or DelayedJob but better
• Normal-looking Ruby class is the job
• Simple call to start it running in background
• http://mperham.github.io/sidekiq/