UM Amherst OS Lecture on GC
1. Operating Systems
CMPSCI 377
Garbage Collection
Emery Berger
University of Massachusetts Amherst
UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
2. Questions To Come
Terms:
Tracing
Copying
Conservative
Parallel GC
Concurrent GC
Runtime & space costs
Live objects
Reachable objects
References
3. Explicit Memory Management
malloc/new
allocates space for an object
free/delete
returns memory to system
Simple, but tricky to get right
Forget to free → memory leak
free too soon → “dangling pointer”
Double free, invalid free...
4. Dangling Pointers
Node* x = new Node("happy");
Node* ptr = x;
delete x; // But I'm not dead yet!
Node* y = new Node("sad");
cout << ptr->data << endl; // undefined behavior; may print "sad"
Insidious, hard-to-track-down bugs
5. Solution: Garbage Collection
Garbage collector periodically scans
objects on heap
No need to free
Automatic memory management
Reclaims non-reachable objects
Won’t reclaim objects until they’re dead
(actually somewhat later)
6. No More Dangling Pointers
Node* x = new Node("happy");
Node* ptr = x;
// x still live (reachable through ptr)
Node* y = new Node("sad");
cout << ptr->data << endl; // happy!
So why not use GC all the time?
7. Because This Guy Says Not To
“There just aren’t all that many worse ways to f*** up your cache behavior than by using lots of allocations and lazy GC to manage your memory.”
“GC sucks donkey brains through a straw from a performance standpoint.”
Linus Torvalds
8. Slightly More Technically…
“GC impairs performance”
Extra processing
collection, copying
Degrades cache performance (ibid)
Degrades page locality (ibid)
Increases memory needs
delayed reclamation
9. On the other hand…
No, “GC enhances performance!”
Faster allocation
pointer-bumping vs. freelists
Improves cache performance
no need for headers
Better locality
can reduce fragmentation, compact data
structures according to use
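The "pointer-bumping" claim can be illustrated with a minimal sketch. `BumpArena`, its 1024-byte buffer, and `alloc` are hypothetical names chosen for this example, not anything from the lecture. The point is only that allocation is an addition and a bounds check, with no freelist search and no per-object header.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bump allocator over a fixed buffer (illustration only).
struct BumpArena {
    alignas(std::max_align_t) unsigned char buf[1024];
    std::size_t top = 0;  // offset of the next free byte

    // Allocation is an add and a compare: no freelist search, no header.
    void* alloc(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        std::size_t size = (n + a - 1) / a * a;        // round up to keep alignment
        if (size > sizeof(buf) - top) return nullptr;  // full: a collector would run here
        void* p = buf + top;
        top += size;
        return p;
    }
};
```

Individual objects cannot be freed in this scheme; space comes back only when a collector evacuates the survivors and resets `top`.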
10. Outline
Classical GC algorithms
Quantifying GC performance
A hard problem
Oracular memory management
GC vs. malloc/free bakeoff
11. Classical Algorithms
Three classical algorithms
Mark-sweep
Reference counting
Semispace
Tweaks
Generational garbage collection
Out of scope
Parallel – perform GC in parallel
Concurrent – run GC at same time as app
Real-time – ensure bounded pause times
12. Mark-Sweep
Start with roots
Global variables, variables on stack
& in registers
Recursively visit every object through
pointers
Mark every object we find (set mark bit)
Everything not marked = garbage
Can then sweep heap for unmarked objects
and free them
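The mark and sweep steps above can be sketched as follows. This is a toy model under stated assumptions: `Obj`, `mark`, and `sweep` are hypothetical names, the heap is a plain vector of object pointers, and the roots are passed in explicitly instead of being discovered from globals, stack, and registers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical heap object: a mark bit plus outgoing pointers.
struct Obj {
    bool marked = false;
    std::vector<Obj*> refs;
};

// Mark phase: visit everything reachable from the roots.
// An explicit worklist replaces recursion so deep graphs cannot overflow the stack.
void mark(const std::vector<Obj*>& roots) {
    std::vector<Obj*> work(roots.begin(), roots.end());
    while (!work.empty()) {
        Obj* o = work.back();
        work.pop_back();
        if (o == nullptr || o->marked) continue;  // already visited
        o->marked = true;
        for (Obj* r : o->refs) work.push_back(r);
    }
}

// Sweep phase: free every unmarked object and clear mark bits on survivors.
// Returns the number of objects reclaimed.
std::size_t sweep(std::vector<Obj*>& heap) {
    std::vector<Obj*> live;
    std::size_t freed = 0;
    for (Obj* o : heap) {
        if (o->marked) {
            o->marked = false;  // reset for the next collection
            live.push_back(o);
        } else {
            delete o;
            ++freed;
        }
    }
    heap.swap(live);
    return freed;
}
```

Because marking starts from the roots, an unreachable cycle is never marked and is swept like any other garbage.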
13. Mark-Sweep Example
(diagram: roots global 1–3 pointing into a heap of objects 1–6, with a freelist)
Initially, all objects white (garbage)
Visit objects, following object graph
Can sweep immediately or lazily
18. Reference Counting
For every object, maintain reference count
= number of incoming pointers
a->ptr = x → refcount(x)++
a->ptr = y → refcount(x)--; refcount(y)++
Reference count = 0 → no more incoming pointers: garbage
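The counting rules above can be sketched as a pointer-store helper. `RcObj`, `assign`, `release`, and the `g_freed` counter are hypothetical names for this illustration, and each object is assumed to have a single outgoing pointer field to keep the example short.

```cpp
#include <cassert>

// Hypothetical reference-counted object with a single outgoing pointer field.
struct RcObj {
    int refcount = 0;
    RcObj* ptr = nullptr;
};

int g_freed = 0;  // counts reclaimed objects so the effect is observable

void release(RcObj* o);

// Pointer store with counting: increment the new target, then decrement the old one.
// Incrementing first keeps the case where the slot already holds the target safe.
void assign(RcObj*& slot, RcObj* target) {
    if (target) ++target->refcount;
    RcObj* old = slot;
    slot = target;
    release(old);
}

// Drop one reference; at zero the object is garbage, and its own
// outgoing pointer is released recursively.
void release(RcObj* o) {
    if (o == nullptr) return;
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        release(o->ptr);
        delete o;
        ++g_freed;
    }
}
```

Every pointer store pays for the count updates, which is the run-time cost this scheme trades for immediate reclamation.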
19. Reference Counting Example
(diagram: roots global 1–3; objects 2, 4, and 5, each labeled with its reference count)
New objects: ref count = 1
Delete pointer: refcount--
And recursively delete pointers
refcount == 0: put on freelist
24. Cycles & Reference Counting
(diagram: roots global 1–3; objects 2, 4, and 5, each labeled with its reference count)
Big problem: cycles
Cycles lead to unreclaimable garbage
Need to do periodic tracing collection (e.g., mark-sweep)
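The cycle problem can be reproduced with C++'s standard reference-counted pointer, `std::shared_ptr`. This is an assumption of the sketch, since the slides do not use it. `Node`, `cycle_survives`, and `chain_survives` are hypothetical names. A `std::weak_ptr` observes an object without adding to its count, so it can report whether the object was reclaimed.

```cpp
#include <cassert>
#include <memory>

struct Node {
    std::shared_ptr<Node> next;  // strong (counted) reference
};

// Builds a two-node cycle, drops both external handles, and reports
// whether the first node is still allocated afterwards.
bool cycle_survives() {
    std::weak_ptr<Node> probe;  // observes without contributing to the count
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;  // cycle: each node keeps the other's count above zero
        probe = a;
    }                 // handles dropped; counts fall from 2 to 1, never to 0
    bool leaked = !probe.expired();
    // Break the cycle by hand so the demonstration itself does not leak.
    if (auto a = probe.lock()) a->next.reset();
    return leaked;
}

// Same shape without the back edge: counts reach zero and both nodes are freed.
bool chain_survives() {
    std::weak_ptr<Node> probe;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        probe = a;
    }
    return !probe.expired();
}
```

Without the manual `reset`, nothing would ever reclaim the two nodes, which is why a reference-counting collector needs a periodic tracing pass.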
27. Semispace
Divide heap in two semispaces:
Allocate objects from from-space
When from-space fills,
Scan from roots through live objects
Copy them into to-space
When done, switch the spaces
Allocate from leftover part of to-space
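A minimal sketch of the semispace steps above, assuming fixed-size objects with two pointer fields. `Cell`, `SemispaceHeap`, and the explicit list of root slots are hypothetical simplifications, and the copy loop follows the well-known Cheney scan, which the slides do not name.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical fixed-size object: a payload, two pointer fields, and a
// forwarding pointer that is set once the object has been copied.
struct Cell {
    int value = 0;
    Cell* left = nullptr;
    Cell* right = nullptr;
    Cell* forward = nullptr;
};

class SemispaceHeap {
public:
    explicit SemispaceHeap(std::size_t cells) : from_(cells), to_(cells) {}

    // Pointer-bumping allocation in the current from-space; nullptr when full.
    Cell* alloc(int value) {
        if (top_ == from_.size()) return nullptr;
        Cell* c = &from_[top_++];
        *c = Cell{};
        c->value = value;
        return c;
    }

    // Copy everything reachable from the root slots into to-space, then flip.
    void collect(const std::vector<Cell**>& roots) {
        std::size_t next = 0;  // next unused slot in to-space
        for (Cell** r : roots) *r = copy(*r, next);
        // Cells between `scan` and `next` are copied but not yet traced.
        for (std::size_t scan = 0; scan < next; ++scan) {
            to_[scan].left = copy(to_[scan].left, next);
            to_[scan].right = copy(to_[scan].right, next);
        }
        std::swap(from_, to_);  // flip spaces
        top_ = next;            // keep allocating after the survivors
    }

    std::size_t used() const { return top_; }

private:
    // Copy one object unless already copied; either way return its new address.
    Cell* copy(Cell* c, std::size_t& next) {
        if (c == nullptr) return nullptr;
        if (c->forward == nullptr) {
            to_[next] = *c;
            c->forward = &to_[next++];  // leave a forwarding pointer behind
        }
        return c->forward;
    }

    std::vector<Cell> from_, to_;
    std::size_t top_ = 0;
};
```

The forwarding pointer ensures that an object reached along two paths, or through a cycle, is copied once and that every reference is redirected to the same copy.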
28. Semispace Example
(diagram: from-space above to-space)
Allocate in from-space
Pointer bumping
Copy live objects into to-space
Leaves forwarding pointer
Flip spaces; allocate from end
36. Generational GC
Optimization for copying collectors
Generational hypothesis:
“most objects die young”
Common-case optimization
Allocate into nursery
Small region
Collect frequently
Copy out survivors
Key idea: keep track of pointers from
mature space into nursery
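One common way to keep track of pointers from mature space into the nursery is a write barrier that records the source object in a remembered set. The sketch below assumes that approach. `GObj`, `RememberedSet`, and the `in_nursery` flag are hypothetical names, and real collectors often use card tables or other encodings instead.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical object: one pointer field plus a flag saying which space it lives in.
struct GObj {
    bool in_nursery = true;
    GObj* field = nullptr;
};

// Remembered set: mature objects that may hold pointers into the nursery.
// A nursery collection treats these as extra roots, so the mature space
// does not have to be scanned.
struct RememberedSet {
    std::unordered_set<GObj*> sources;

    // Write barrier: every pointer store goes through here.
    void write(GObj* src, GObj* target) {
        src->field = target;
        if (!src->in_nursery && target != nullptr && target->in_nursery) {
            sources.insert(src);  // mature -> nursery pointer: remember the source
        }
    }
};
```

Entries are never removed in this sketch, so the set may over-approximate; that costs some extra scanning but does not make a nursery collection miss a survivor.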
37. Generational GC Example
(diagram: mature space above nursery)
Copy out survivors (via roots & mature space pointers)
Reset allocation pointer & continue
44. Conservative GC
Non-copying collectors for C & C++
Must identify pointers
“Duck test”: if it looks like a pointer, it’s a pointer
Trace through “pointers”, marking everything
Can link with Boehm-Demers-Weiser
library (“libgc”)
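The "duck test" can be sketched as a range check on raw machine words. This is not how libgc is implemented. `HeapRange`, `looks_like_pointer`, and `mark_conservatively` are hypothetical names, the heap is assumed to be one contiguous range of equally sized blocks, and only the pointer-identification step is shown, not the full transitive trace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of the collected heap: one contiguous range
// of equally sized blocks.
struct HeapRange {
    std::uintptr_t base;
    std::size_t block_size;
    std::size_t block_count;
};

// "Duck test": a word is treated as a pointer if its value falls inside the heap.
// On success, stores the index of the block it points into.
bool looks_like_pointer(const HeapRange& h, std::uintptr_t word, std::size_t& block) {
    std::uintptr_t end = h.base + h.block_size * h.block_count;
    if (word < h.base || word >= end) return false;
    block = (word - h.base) / h.block_size;  // interior pointers count too
    return true;
}

// Scan a region of raw words (a stand-in for stack, registers, globals, or
// an object's body) and mark every block some word appears to point to.
void mark_conservatively(const HeapRange& h, const std::vector<std::uintptr_t>& words,
                         std::vector<bool>& marked) {
    for (std::uintptr_t w : words) {
        std::size_t block;
        if (looks_like_pointer(h, w, block)) marked[block] = true;
    }
}
```

An integer that happens to fall in the heap range keeps a dead block alive, and because such a word might not be a pointer at all, the collector cannot safely rewrite it, which is why conservative collectors do not move objects.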
45. GC vs. malloc/free
46. Comparing Memory Managers
Node *v = malloc(sizeof(Node));
v->data = malloc(sizeof(NodeData));
memcpy(v->data, old->data, sizeof(NodeData));
free(old->data);
v->next = old->next;
v->next->prev = v;
v->prev = old->prev;
v->prev->next = v;
free(old);
(diagram: the program runs on top of the BDW Collector)
Using GC in C/C++ is easy: slide in BDW and ignore calls to free.
48. What About Other Garbage Collectors?
Compares malloc to GC, but only
conservative, non-copying collectors
Can’t reduce fragmentation,
reorder objects, etc.
But: faster precise, copying collectors
Incompatible with C/C++
Standard for Java…
49. Comparing Memory Managers
Node node = new Node();
node.data = new NodeData();
useNode(node);
node = null;        // free(node)?
...
node = new Node();
...                 // free(node.data)?
node.data = new NodeData();
...
(diagram: the program runs on top of the Lea Allocator)
Adding malloc/free to Java: not so easy…
... need to insert frees, but where?
51. Oracular Memory Manager
(diagram: a Java program with C malloc/free executes above a simulator; below the simulator line, actions are performed at no cost; allocation consults the Oracle)
Consult oracle at each allocation
Oracle does not disrupt hardware state
Simulator invokes free()…
52. Object Lifetime & Oracle Placement
obj = new Object;
live → dead: can be freed; freed by lifetime-based oracle (free(obj))
reachable → unreachable: can be collected; freed by reachability-based oracle (free(obj))
In between: free(??)
Oracles bracket placement of frees
Lifetime-based: most aggressive
Reachability-based: most conservative
53. Liveness Oracle Generation
(diagram: a Java program with C malloc/free executes above a PowerPC Simulator; below the simulator line, actions are performed at no cost; the simulator traces allocation, memory accesses, and program roots; the trace file is post-processed into the Oracle)
Liveness: record allocs, memory accesses
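The post-processing step can be sketched as a single pass that records, for each object, the last trace event touching it. The trace format here is invented for illustration: `Event`, `Kind`, and `last_use` are hypothetical, and the slides do not describe the actual trace or oracle file layout.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical trace record: object `id` was allocated or accessed.
enum class Kind { Alloc, Access };
struct Event {
    Kind kind;
    int id;
};

// Post-process a trace into a liveness oracle: for each object, the index of
// the last event that touches it. A lifetime-based free can be placed right
// after that event.
std::map<int, std::size_t> last_use(const std::vector<Event>& trace) {
    std::map<int, std::size_t> oracle;
    for (std::size_t i = 0; i < trace.size(); ++i) {
        oracle[trace[i].id] = i;  // later events overwrite earlier ones
    }
    return oracle;
}
```

An object that is allocated but never accessed gets its allocation event as its last use, so it can be freed immediately.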
54. Reachability Oracle Generation
(diagram: a Java program with C malloc/free executes above a PowerPC Simulator; below the simulator line, actions are performed at no cost; the simulator traces allocations, pointer updates, and program roots; Merlin analysis turns the trace file into the Oracle)
Reachability: illegal instructions mark heap events (especially pointer assignments)
55. Oracular Memory Manager
(diagram: a Java program with C malloc/free executes above a PowerPC Simulator; below the simulator line, actions are performed at no cost; allocation consults the oracle)
Run & consult oracle before each allocation
When needed, modify instruction to call free
Extra costs (oracle access) hidden by simulator
56. Execution Time for pseudoJBB
(plot: time relative to Lea, 90%–150%, vs. heap size relative to collector minimum, 1.00–4.00; series: GenMS, GenCopy, GenRC, Lea w/ Reach, Lea w/ Life)
GC can be faster than malloc/free
57. Geo. Mean of Execution Time
(plot: execution time relative to Lea, 90%–130%, vs. heap size relative to collector minimum, 1.00–4.00; series: GenMS, GenCopy, GenRC, Lea w/ Reach, Lea w/ Life, MSExplicit w/ Reach)
Trades space for time
58. Footprint at Quickest Run
(bar chart; bar values: 7.69, 7.09, 5.66, 5.10, 4.84, 1.38, 1.61, 1.00, 0.63)
GC uses much more memory for speed
60. Javac Paging Performance
GC: poor paging performance
61. Summary of Results
Best collector equals Lea's performance…
Up to 10% faster on some benchmarks
... but uses more memory
Quickest runs require 5x or more memory
GenMS at least doubles mean footprint
62. When to Use Garbage Collection
Garbage collection fine if
system has more than 3x needed RAM
and no competition with other processes
or avoiding bugs / security more important
Not so good:
Limited RAM
Competition for physical memory
Depends on RAM for performance
In-memory database
Search engines, etc.
63. The End