Concurrency Grab Bag
1. Concurrency Grab Bag: More Gotchas, Tips, and Patterns for Practical Concurrency. Sangjin Lee & Debashis Saha, eBay Inc.
2. Agenda: Introduction. Patterns & anti-patterns. Warm-up: “double-checked locking” on collections. Many readers, few writers. Many writers, few readers. Bonus: configuring a ThreadPoolExecutor. Closing...
3. Introduction: The main goal is two-fold: correctness first, and performance/scalability next. Problems tend to repeat themselves: anti-patterns work as visual “crutches” to spot bad smells.
5. “Double-checked locking” on collection: Initialize a collection lazily
    class Unsafe {
        private Map<String,Object> map = null;

        public void useMap() {
            if (map == null) {
                initMap();
            }
            // read the map; get(), iterate, ...
        }

        private synchronized void initMap() {
            if (map == null) {
                map = new HashMap<String,Object>();
                // populate the map with initial data
            }
        }
    }
6. “Double-checked locking” on collection: It’s worse than the real double-checked locking pattern: the map reference is assigned before the map is populated, so a reader that skips the lock can see a half-populated HashMap. Why would one do this? To delay the expensive operation of populating the data, and to avoid a penalty on reads: once the map is set up, it’s read-only. But is laziness really necessary?
7. “Double-checked locking” on collection: “Eager” fix
    class Safe {
        private final Map<String,Object> map;

        public Safe() {
            map = new HashMap<String,Object>();
            // populate the map with initial data
        }

        public void useMap() {
            // read the map; get(), iterate, ...
        }
    }
8. “Double-checked locking” on collection: Fix using volatile if the data is optional & large
    class Safe {
        private volatile Map<String,Object> map = null;

        public void useMap() {
            if (map == null) {
                initMap();
            }
            // read the map; get(), iterate, ...
        }

        private synchronized void initMap() {
            if (map == null) {
                Map<String,Object> temp = new HashMap<String,Object>();
                // populate temp with initial data
                map = temp; // make it available after it’s ready
            }
        }
    }
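The volatile fix above can be exercised end to end. What follows is only a sketch under stated assumptions: the names LazyLookup, LazyLookupDemo and populations, and the "answer" entry, are hypothetical and not from the deck. It reads the volatile field once into a local, wraps the finished map as unmodifiable to underline that it is read-only after publication, and counts how many times the population step runs when several threads race on the first access.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class LazyLookup {
    // Demo-only counter: how many times the expensive population ran.
    final AtomicInteger populations = new AtomicInteger();

    private volatile Map<String, Object> map = null;

    Object lookup(String key) {
        Map<String, Object> m = map;          // one volatile read on the fast path
        if (m == null) {
            m = initMap();
        }
        return m.get(key);
    }

    private synchronized Map<String, Object> initMap() {
        if (map == null) {
            Map<String, Object> temp = new HashMap<String, Object>();
            temp.put("answer", 42);           // hypothetical initial data
            populations.incrementAndGet();
            map = Collections.unmodifiableMap(temp); // publish only once it is ready
        }
        return map;
    }
}

class LazyLookupDemo {
    // Reads one LazyLookup from several threads; returns how often it was populated.
    static int run(int threadCount) {
        final LazyLookup lookup = new LazyLookup();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> lookup.lookup("answer"));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return lookup.populations.get();
    }

    public static void main(String[] args) {
        System.out.println("populations = " + run(8));
    }
}
```

Because the second null check happens under the lock, the population step should run exactly once however many threads arrive together.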
10. Many readers, few writers: Use cases: data that changes only on demand (e.g. configuration), ... Implementation choices: synchronized data structure; concurrent collections (e.g. ConcurrentHashMap); ReadWriteLock; “copy-on-write”.
11. Many readers, few writers: Example: using synchronization
    class Synchronized {
        private final List<String> list = new ArrayList<String>();

        // the entire iteration must be synchronized
        public synchronized void iterateOnList() {
            for (String s: list) {
                // do something with s
            }
        }

        public synchronized void add(String value) {
            list.add(value);
        }
    }
12. Many readers, few writers: Example: using ReadWriteLock
    class UsingReadWriteLock {
        private final List<String> list = new ArrayList<String>();
        private final ReadWriteLock lock = new ReentrantReadWriteLock();

        public void iterateOnList() {
            lock.readLock().lock();
            try {
                for (String s: list) {
                    // do something with s
                }
            } finally {
                lock.readLock().unlock();
            }
        }
        // continued...
13. Many readers, few writers: Example: using ReadWriteLock
        // continued
        public void add(String value) {
            lock.writeLock().lock();
            try {
                list.add(value);
            } finally {
                lock.writeLock().unlock();
            }
        }
    }
14. Many readers, few writers: Copy-on-write. If writes are truly few and far between, and you want reads to be as fast as possible, copy-on-write is an option. You copy and replace the entire data on every write. You eliminate synchronization on reads, and shift the burden to writes. Writes usually become much more expensive. Example: java.util.concurrent.CopyOnWriteArrayList.
15. Many readers, few writers: Example: using copy-on-write
    class CopyOnWrite {
        private volatile List<String> list = new ArrayList<String>();

        public void iterateOnList() { // no locking needed
            for (String s: list) {
                // do something with s
            }
        }

        public synchronized void add(String value) { // need mutual exclusion
            List<String> copy = new ArrayList<String>(list); // create a copy
            copy.add(value);
            list = copy;
        }
    }
16. Many readers, few writers: What’s wrong with this?
    class BadCopyOnWrite {
        private volatile List<String> list = new ArrayList<String>();

        public void iterateOnList() { // no locking needed
            for (int i = 0; i < list.size(); i++) {
                String s = list.get(i);
                // do something with s
            }
        }

        public synchronized void add(String value) { // need mutual exclusion
            List<String> copy = new ArrayList<String>(list); // create a copy
            copy.add(value);
            list = copy;
        }
    }
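The transcript does not include the answer to this question, so the following is one likely reading rather than the authors' stated answer. Every `list.size()` and `list.get(i)` in the loop is a separate read of the volatile field, so a single pass can mix two different lists. If a writer could ever shrink the list, `get(i)` could even throw IndexOutOfBoundsException. A minimal sketch of a fix, assuming the hypothetical names SnapshotCopyOnWrite, totalLength() and clear(), reads the field once into a local variable:

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotCopyOnWrite {
    private volatile List<String> list = new ArrayList<String>();

    // Reads the volatile field once, so size() and get(i) see the same list.
    public int totalLength() {
        List<String> snapshot = list;
        int total = 0;
        for (int i = 0; i < snapshot.size(); i++) {
            total += snapshot.get(i).length();
        }
        return total;
    }

    public synchronized void add(String value) {
        List<String> copy = new ArrayList<String>(list); // copy, then replace
        copy.add(value);
        list = copy;
    }

    // Hypothetical shrinking write: with index-based reads of the field itself,
    // this is the kind of update that could push get(i) out of range.
    public synchronized void clear() {
        list = new ArrayList<String>();
    }

    public static void main(String[] args) {
        SnapshotCopyOnWrite data = new SnapshotCopyOnWrite();
        data.add("ab");
        data.add("cde");
        System.out.println(data.totalLength());
    }
}
```

The enhanced for loop in the earlier CopyOnWrite slide avoids the problem for the same reason: it obtains one iterator from one read of the field.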
17. Many readers, few writers: Of course you can simply use CopyOnWriteArrayList!
    class CopyOnWrite2 {
        private final List<String> list = new CopyOnWriteArrayList<String>();

        public void iterateOnList() { // no locking needed
            for (String s: list) {
                // do something with s
            }
        }

        public void add(String value) {
            list.add(value);
        }
    }
19. Many readers, few writers: For Maps, copy-on-write is less useful, as ConcurrentHashMap is usually good enough. ReadWriteLock is an option, but it is less concurrent and performs worse than ConcurrentHashMap. Copy-on-write has the best read performance.
20. Many readers, few writers: Copy-on-write: caveats. The write performance. The staleness behavior should be acceptable (it usually is). The direct reference to the underlying data that is copied should not escape the object (risks: stale data, memory leaks).
21. Many readers, few writers: What should we use? If the (read) concurrency is low, synchronization is often good enough. Choose concurrent collections (ConcurrentHashMap, etc.) if applicable. Use copy-on-write if concurrent collections are not applicable and write performance is not a concern.
22. Many readers, few writers: How about copy-on-write on MULTIPLE variables?
23. Many readers, few writers: Multi-variable example: using synchronization
    class Synchronized {
        private Map<String,String> current = new HashMap<String,String>();
        private Map<String,String> previous = null;

        public synchronized void shift() {
            previous = current;
            current = new HashMap<String,String>();
        }

        public synchronized void putValue(String key, String value) {
            current.put(key, value);
        }

        // returns String (the slide declared void but returned a value)
        public synchronized String getValue(String key) {
            return current.get(key);
        }
    }
24. Many readers, few writers: Copy-on-write on multiple variables. Use a container class with those variables. Do a volatile copy-and-replace with the container object.
25. Many readers, few writers: Multi-variable example: use a container class
    class ShiftingWindow {
        final Map<String,String> current;
        final Map<String,String> previous;

        public ShiftingWindow(Map<String,String> c, Map<String,String> p) {
            current = c;
            previous = p;
        }
    }
26. Many readers, few writers: Multi-variable example: use a container class
    class CopyOnWrite {
        private volatile ShiftingWindow window =
            new ShiftingWindow(new ConcurrentHashMap<String,String>(), null);

        public synchronized void shift() { // copy on write
            ShiftingWindow newWindow =
                new ShiftingWindow(new ConcurrentHashMap<String,String>(), window.current);
            window = newWindow;
        }

        public void putValue(String key, String value) { // no locking
            window.current.put(key, value);
        }

        // returns String (the slide declared void but returned a value)
        public String getValue(String key) { // no locking
            return window.current.get(key);
        }
    }
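The point of the container class is that one volatile read yields a consistent pair of maps. The sketch below is illustrative only: WindowPair, WindowedStore and getRecent() are hypothetical names, not from the deck. It shows a reader that needs both maps: it copies the volatile reference into a local first, so that current and previous always belong to the same window.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class WindowPair {
    final Map<String, String> current;
    final Map<String, String> previous; // null until the first shift

    WindowPair(Map<String, String> current, Map<String, String> previous) {
        this.current = current;
        this.previous = previous;
    }
}

class WindowedStore {
    private volatile WindowPair window =
        new WindowPair(new ConcurrentHashMap<String, String>(), null);

    public synchronized void shift() { // copy on write: replace the whole pair
        window = new WindowPair(new ConcurrentHashMap<String, String>(), window.current);
    }

    public void putValue(String key, String value) { // no locking
        window.current.put(key, value);
    }

    // Looks in current, then previous, using a single read of the volatile field.
    public String getRecent(String key) {
        WindowPair w = window;
        String value = w.current.get(key);
        if (value == null && w.previous != null) {
            value = w.previous.get(key);
        }
        return value;
    }

    public static void main(String[] args) {
        WindowedStore store = new WindowedStore();
        store.putValue("a", "1");
        store.shift();
        System.out.println(store.getRecent("a")); // still found, via previous
    }
}
```

Reading `window.current` and then `window.previous` directly would be two volatile reads, and a shift between them could pair maps from different windows. Note also that a put racing with a shift may land in the map that has just become previous; whether that is acceptable depends on the use case.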
28. Many writers, few readers: Use cases: logging, counters, statistics, ... Produce secondary data (e.g. URL counts) from primary operations (serving URLs). Many writers: all servlet threads will update the data frequently. Few readers: the data will be read on demand (reporting) or periodically. Impact on the primary operations must be minimized.
29. Many writers, few readers: Implementation choices: synchronized data structure; ConcurrentHashMap (for a map or set); asynchronous (background) processor.
30. Many writers, few readers: Synchronized data structure: not recommended. It can induce a hotly contended lock under a high level of concurrency, and turn into a scalability hot spot. ConcurrentHashMap: normally the best solution. It scales well under a high level of concurrency. Asynchronous (background) processor: a useful pattern if ConcurrentHashMap is not an option or write operations are serial in nature.
31. Many writers, few readers: Synchronized data structure
    class SynchronizedCounter {
        private final Map<String,Integer> map = new HashMap<String,Integer>();

        public synchronized void addCount(String page) {
            Integer value = map.get(page);
            value = (value == null) ? 1 : value+1;
            map.put(page, value);
        }

        public synchronized int getCount(String page) {
            Integer value = map.get(page);
            return (value == null) ? 0 : value;
        }
    }
32. Many writers, few readers: ConcurrentHashMap
    class ConcurrentHashMapCounter {
        private final ConcurrentMap<String,AtomicInteger> map =
            new ConcurrentHashMap<String,AtomicInteger>();

        public void addCount(String page) {
            AtomicInteger value = map.get(page);
            if (value == null) {
                value = new AtomicInteger(0);
                AtomicInteger old = map.putIfAbsent(page, value);
                if (old != null) {
                    value = old; // another thread won the race; use its counter
                }
            }
            value.incrementAndGet();
        }
        // continued...
33. Many writers, few readers: ConcurrentHashMap
        // continued
        public int getCount(String page) {
            AtomicInteger value = map.get(page);
            return (value == null) ? 0 : value.get();
        }
    }
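On Java 8 or later, the get/putIfAbsent dance in the counter above can be written more compactly. This is a swapped-in variant rather than the deck's own code: it assumes Java 8+, uses `computeIfAbsent` and `LongAdder` from the standard library, and the names LongAdderCounter, LongAdderCounterDemo and hammer() are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

class LongAdderCounter {
    private final ConcurrentMap<String, LongAdder> map =
        new ConcurrentHashMap<String, LongAdder>();

    public void addCount(String page) {
        // Creates the adder at most once per key, then increments it.
        map.computeIfAbsent(page, key -> new LongAdder()).increment();
    }

    public long getCount(String page) {
        LongAdder value = map.get(page);
        return (value == null) ? 0L : value.sum();
    }
}

class LongAdderCounterDemo {
    // Increments one key from several threads and returns the final count.
    static long hammer(int threadCount, final int hitsPerThread) {
        final LongAdderCounter counter = new LongAdderCounter();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < hitsPerThread; n++) {
                    counter.addCount("/home");
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.getCount("/home");
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 1000));
    }
}
```

LongAdder is designed for counters that are updated often and read rarely, which matches the many-writers, few-readers shape; its `sum()` is not an atomic snapshot while updates are in flight, which is usually acceptable for statistics.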
34. Many writers, few readers: Asynchronous (background) processor. A single background processor thread owns the data. Primary threads produce tasks for the background processor. Writes and reads are actually done on the background processor thread.
35. Many writers, few readers: Asynchronous (background) processor: benefits. Latency on the primary threads is minimized. Contention is greatly reduced: it can yield much better throughput than synchronization. Trivially thread safe: it exploits safety via thread confinement. Example: logging to disk/console.
36. Many writers, few readers: Asynchronous (background) processor: caveats. The data structure should not escape the background thread. The actual tasks should be thread-agnostic. It performs poorly against a more concurrent solution. Code becomes a bit more complicated. You need to manage saturation: tasks may be produced faster than they can be handled by the processor.
37. Many writers, few readers: Asynchronous (background) processor
    class BackgroundCounter {
        // background thread
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        // map is exclusively used by the executor thread
        private final Map<String,Integer> map = new HashMap<String,Integer>();

        public void addCount(String page) {
            executor.execute(new AddTask(page));
        }

        public int getCount(String page) {
            Future<Integer> future = executor.submit(new GetTask(page));
            return future.get(); // exception handling omitted
        }
        // continued...
38. Many writers, few readers: Asynchronous (background) processor
        // continued
        private class AddTask implements Runnable {
            private final String page;

            AddTask(String page) {
                this.page = page;
            }

            public void run() {
                Integer value = map.get(page);
                value = (value == null) ? 1 : value+1;
                map.put(page, value);
            }
        }
        // continued...
39. Many writers, few readers: Asynchronous (background) processor
        // continued
        private class GetTask implements Callable<Integer> {
            private final String page;

            GetTask(String page) {
                this.page = page; // the slide assigned this.key, which does not exist
            }

            public Integer call() {
                Integer value = map.get(page);
                return (value == null) ? 0 : value;
            }
        }
    }
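The saturation caveat for the background processor is not addressed by `Executors.newSingleThreadExecutor()`, whose queue is unbounded. One possible way to manage it, sketched here under assumptions of my own (a bounded queue, a drop-and-count policy, and the hypothetical names BoundedBackgroundCounter, rejectedCount() and overfill()), is to build the single-thread executor by hand. A caller-runs policy is deliberately avoided, because running the task on the caller's thread would break thread confinement of the map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class BoundedBackgroundCounter {
    private final AtomicLong rejected = new AtomicLong();
    // Only ever touched on the single executor thread.
    private final Map<String, Integer> map = new HashMap<String, Integer>();
    // Package-private so the demo below can stall the worker.
    final ThreadPoolExecutor executor;

    BoundedBackgroundCounter(int queueCapacity) {
        executor = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(queueCapacity),
            (task, pool) -> {
                rejected.incrementAndGet();           // saturation: count and drop
                if (task instanceof Future) {
                    ((Future<?>) task).cancel(false); // a dropped read fails fast in get()
                }
            });
    }

    public void addCount(final String page) {
        executor.execute(() -> {
            Integer value = map.get(page);
            map.put(page, (value == null) ? 1 : value + 1);
        });
    }

    public int getCount(final String page) throws Exception {
        Callable<Integer> read = () -> {
            Integer value = map.get(page);
            return (value == null) ? 0 : value;
        };
        return executor.submit(read).get();
    }

    public long rejectedCount() { return rejected.get(); }

    public void shutdown() { executor.shutdown(); }
}

class BoundedBackgroundCounterDemo {
    // Stalls the worker, sends 3 writes at a queue of capacity 2,
    // then returns {rejected writes, final count}.
    static long[] overfill() {
        BoundedBackgroundCounter counter = new BoundedBackgroundCounter(2);
        final CountDownLatch gate = new CountDownLatch(1);
        try {
            counter.executor.execute(() -> {
                try {
                    gate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            for (int i = 0; i < 3; i++) {
                counter.addCount("/home"); // two are queued, the third is dropped
            }
            long rejected = counter.rejectedCount();
            gate.countDown();
            while (!counter.executor.getQueue().isEmpty()) {
                Thread.sleep(5);           // let the backlog drain before reading
            }
            return new long[] { rejected, counter.getCount("/home") };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            gate.countDown();
            counter.shutdown();
        }
    }
}
```

Dropping is only reasonable for lossy secondary data such as statistics; blocking the producer or sampling are alternatives with different trade-offs.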
41. Configuring a ThreadPoolExecutor: The right configuration that fits your use case and demand is extremely important. Badly configured ThreadPoolExecutors cause exceptions and performance issues. RejectedExecutionExceptions, anyone?
42. Configuring a ThreadPoolExecutor: Simple rules for ThreadPoolExecutor behavior. When a task is submitted: if the core size has not been reached, a new thread is always created; if the core size is reached, the task is queued; if the core size is reached and the queue becomes full, a new thread is created until the max size is reached; if the max size is reached and the queue is full, the rejected execution policy kicks in.
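The four rules can be observed directly with a deliberately tiny pool. The sketch below is illustrative: the class name PoolRulesDemo, the sizes (core 1, max 2, queue capacity 1) and the trace format are my own choices. It submits four tasks that all block on a latch and records the pool size and queue size after each submission.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolRulesDemo {
    // Returns "poolSize/queueSize" after each accepted task, or "rejected".
    static String trace() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        final CountDownLatch gate = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                gate.await(); // keep every worker busy until the demo is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        StringBuilder trace = new StringBuilder();
        try {
            for (int i = 0; i < 4; i++) {
                try {
                    pool.execute(blocker);
                    trace.append(pool.getPoolSize()).append('/')
                         .append(pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    trace.append("rejected"); // default policy (AbortPolicy) throws
                }
                trace.append(' ');
            }
        } finally {
            gate.countDown();
            pool.shutdown();
        }
        return trace.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(trace()); // prints "1/0 1/1 2/1 rejected"
    }
}
```

The first task starts the core thread, the second is queued, the third finds the queue full and starts a second thread, and the fourth is rejected because both the max size and the queue are exhausted.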
43. Configuring a ThreadPoolExecutor: Importance of core size. ThreadPoolExecutor changes behavior dramatically around the core size. Below core size, threads are always created even if there are idle threads. Above core size, the preferred behavior shifts to queuing. Core size should be big enough to accommodate the anticipated average task throughput demand.
44. Configuring a ThreadPoolExecutor: Thread pool size and queue size are competing parameters. Queuing increases latency but conserves resources. A queued task in general consumes fewer resources than an active task.
45. Closing...: Power of static analysis. Whenever we find an issue, we try to turn it into a static analysis rule. FindBugs already has many useful thread-safety rules. Intent is the most difficult part of thread-safety analysis: annotations help. Continued training helps as well.
48. TPE: Cancelling tasks. Cancelling tasks: more complicated than you think. Cancelling tasks is your job. Timing out from Future.get() does NOT cancel the task by itself. Some TPE methods cancel outstanding tasks for you: invokeAll() with timeout, invokeAny(). Cancelling tasks uses interruption: you should write your task to respond to cancellation promptly (i.e. “interruptible”).
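A minimal sketch of the timeout-then-cancel sequence described on this slide follows. The class name CancelOnTimeoutDemo, the 100 ms timeout and the use of Thread.sleep as a stand-in for long interruptible work are all assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CancelOnTimeoutDemo {
    // Returns true if the task observed the interruption sent by cancel(true).
    static boolean run() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch interrupted = new CountDownLatch(1);
        Future<?> future = executor.submit(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);        // stands in for long, interruptible work
            } catch (InterruptedException e) {
                interrupted.countDown();     // respond to cancellation promptly
            }
        });
        try {
            started.await();
            try {
                future.get(100, TimeUnit.MILLISECONDS);
                return false;                // not expected: the task sleeps far longer
            } catch (TimeoutException e) {
                // The timeout only stops the wait; the task is still running.
                future.cancel(true);         // cancelling is the caller's job
            }
            return interrupted.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException | ExecutionException e) {
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("task interrupted after cancel: " + run());
    }
}
```

Without the `future.cancel(true)` call the task would keep occupying the pool thread for the full sleep even though the caller has given up on the result.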
50. TPE & UncaughtExceptionHandler: Multi-threaded test with vanilla thread
    class TestWithThreads extends TestCase {
        @Test
        public void test() throws Exception { // join() can throw InterruptedException
            MyHandler h = new MyHandler();
            Thread th = new Thread(someRunnable);
            th.setUncaughtExceptionHandler(h);
            th.start();
            th.join();
            // check MyHandler for any exception on thread th
        }

        private static class MyHandler implements UncaughtExceptionHandler {
            public void uncaughtException(Thread t, Throwable e) {
                // store the exception
            }
        }
    }
51. TPE & UncaughtExceptionHandler: Multi-threaded test with TPE stops working: why?
    class BrokenTestWithExecutor extends TestCase {
        private ExecutorService executor = Executors.newSingleThreadExecutor();

        @Test
        public void test() throws Exception { // get() throws checked exceptions
            MyHandler h = new MyHandler();
            Thread.setDefaultUncaughtExceptionHandler(h);
            executor.submit(someRunnable).get();
            // check MyHandler for any exception on thread th
        }

        private static class MyHandler implements UncaughtExceptionHandler {
            public void uncaughtException(Thread t, Throwable e) {
                // store the exception
            }
        }
    }
52. TPE & UncaughtExceptionHandler: Remember what UncaughtExceptionHandlers are for! UncaughtExceptionHandlers are invoked only if the thread is being terminated due to an uncaught exception. Some (not all) TPE methods catch and handle all exceptions. ThreadPoolExecutor: execute() triggers UncaughtExceptionHandlers; submit() does not trigger them. ScheduledThreadPoolExecutor: does not trigger them.
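The difference between execute() and submit() can be checked with a small experiment. This is a sketch with hypothetical names (HandlerVisibilityDemo and the result string format are mine): the same failing Runnable is sent through each method on executors whose threads carry a counting UncaughtExceptionHandler.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HandlerVisibilityDemo {
    // Reports where the task's exception surfaced for execute() and for submit().
    static String run() {
        final CountDownLatch handled = new CountDownLatch(1);
        final AtomicInteger handlerCalls = new AtomicInteger();
        ThreadFactory factory = r -> {
            Thread t = new Thread(r);
            t.setUncaughtExceptionHandler((thread, error) -> {
                handlerCalls.incrementAndGet();
                handled.countDown();
            });
            return t;
        };
        Runnable failing = () -> { throw new IllegalStateException("boom"); };
        ExecutorService viaExecute = Executors.newSingleThreadExecutor(factory);
        ExecutorService viaSubmit = Executors.newSingleThreadExecutor(factory);
        try {
            // execute(): the exception escapes the worker thread and reaches the handler.
            viaExecute.execute(failing);
            String executeResult = handled.await(5, TimeUnit.SECONDS) ? "handler" : "lost";

            // submit(): the exception is captured in the Future instead.
            String submitResult;
            try {
                viaSubmit.submit(failing).get();
                submitResult = "nothing";
            } catch (ExecutionException e) {
                submitResult = (handlerCalls.get() == 1) ? "future" : "future+handler";
            }
            return "execute=" + executeResult + " submit=" + submitResult;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            viaExecute.shutdown();
            viaSubmit.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "execute=handler submit=future"
    }
}
```

With submit(), the task is wrapped in a Future that catches the exception, so the worker thread never terminates abnormally and the handler has nothing to report.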
53. TPE & UncaughtExceptionHandler: Simply don’t rely on UncaughtExceptionHandlers with TPE. Using Future and ExecutionException is the right way with TPE.
54. TPE & UncaughtExceptionHandler: Multi-threaded test with TPE: correct
    class CorrectTestWithExecutor extends TestCase {
        private ExecutorService executor = Executors.newSingleThreadExecutor();

        @Test
        public void test() {
            try {
                executor.submit(someRunnable).get();
            } catch (ExecutionException e) {
                // its cause is the original exception
                Throwable cause = e.getCause();
                // assert failure
            } catch (InterruptedException e2) { ... }
        }
    }