Terracotta's OffHeap Explained

@tall_chris #Devoxx #TCoffheap
Chris Dennis
Terracotta (aka Software AG)

Who Am I?
• Trained as a Physicist, clearly not trained as a Computer
Scientist.
• 4 Years Doing Unnatural Things With Bytecode In Academia
• 3 Years Doing Unnatural Things With Bytecode For Money
• 4 Years Doing Unnatural Things With ByteBuffers
• 11 Years Doing Java Development
• Software Engineer working at Terracotta (Software AG)

[dungeon@Main1 ~]$ cat /proc/meminfo
MemTotal: 6354030896 kB
MemFree: 112170556 kB
[dungeon@Main1 ~]$ cat /proc/cpuinfo
processor : 119
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
stepping : 7
cpu MHz : 1200.000
cache size : 38400 KB
physical id : 3
I Get To Play With Big Toys

A Bit of History
2010 Started development as a caching ‘tier’ within Ehcache.
2011 Integrated as a caching tier in front of Oracle BDB in the
Terracotta Server.
2013 Legal complications push it into service as the primary
storage for the Terracotta Server.
2015 Open Sourced (https://github.com/Terracotta-OSS/offheap-store).

Problem Statement
• Map: collection of key-value pairs
• Cache ≈ a Map with bells on
• Caching is good:
https://xkcd.com/908/

Problem Statement
• “A lot of caching” leads to
• a lot of heap, which leads to
• a lot of work for the garbage collector, which leads to
• a lot of GC pausing/overhead.
• The situation is markedly better now than when the bulk of this
library was written. (Please don’t tell my employer I said that)

Map/Cache Best Practices
• Immutable Keys
• please do this!
• Immutable Values
• please do this!
• So with immutability everywhere, who cares about object
identity?
• If I don’t need object identity, do I need a heap?
• If I don’t need a heap, do I need a garbage collector?

Solution
• Replace heavy (large) map/cache usage with an ‘outside the
heap’ but ‘inside the process’ implementation.
• Benefits at two scales:
• At moderate scale, the GC offload reduces overheads.
• At large scale, we can still function: -Xmx6T
• Caveats
• Marshalling/unmarshalling costs time (and CPU)
• Trading away average latency to control the tail.

Replace What?
java.util
(Hash)Map
java.util.concurrent
Concurrent(Hash)Map
Java Heap
Garbage
Collector
Class Layout
Logic

Maps
java.util
(Hash)Map
Concurrent(Hash)Map
Java Heap
Garbage
Collector
Class Layout
Logic

JDK HashMap
0 1 2 3 4 5 6 7
put(k1, v)
put(k2, v)
put(k3, v)
k1, v k2, v
k3, v

OffHeap Map
0 1 2 3 4 5 6 7
put(k1, v)
put(k2, v)
put(k3, v)
k1, v k2, v k3, v
• Hash Map
• Open Addressing
• Linear Reprobe (1 slot)

class Node<K, V> {
final int hash;
final K key;
V value;
Node<K, V> next;
}
JDK HashMap
k1, v
primitive - easy to store
heap references
closed addressing specific

‘struct’ slot {
int status;
int hash;
long encoding;
}
OffHeap Map
k1, v
primitive - easy to store
encoded key/value pair
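The slot layout and the open-addressing scheme above can be sketched in a few lines of Java. This is a minimal illustration, not the library's code: the class name OffHeapSlotTable, the 16-byte slot size, and the status constants are assumptions made for the example, and key comparison is reduced to a hash match because the real comparison is delegated to the storage engine.

```java
import java.nio.ByteBuffer;

/** Sketch: an open-addressing table whose slots live in a direct ByteBuffer. */
final class OffHeapSlotTable {
    // Assumed layout per slot: int status, int hash, long encoding.
    private static final int SLOT_SIZE = 16;
    private static final int STATUS_FREE = 0;
    private static final int STATUS_USED = 1;

    private final ByteBuffer table;
    private final int mask;

    OffHeapSlotTable(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.mask = capacity - 1;
        // Direct buffers start zero-filled, so every slot starts as STATUS_FREE.
        this.table = ByteBuffer.allocateDirect(capacity * SLOT_SIZE);
    }

    /** Writes into the first free slot at or after the home slot; returns the slot index, or -1 if full. */
    int put(int hash, long encoding) {
        for (int i = 0; i <= mask; i++) {
            int slot = (hash + i) & mask;          // linear reprobe, one slot at a time
            int base = slot * SLOT_SIZE;
            if (table.getInt(base) == STATUS_FREE) {
                table.putInt(base, STATUS_USED);
                table.putInt(base + 4, hash);
                table.putLong(base + 8, encoding);
                return slot;
            }
        }
        return -1;
    }

    /** Returns the encoding of the first slot whose stored hash matches, or null if the probe hits a free slot. */
    Long find(int hash) {
        for (int i = 0; i <= mask; i++) {
            int base = ((hash + i) & mask) * SLOT_SIZE;
            if (table.getInt(base) == STATUS_FREE) {
                return null;
            }
            if (table.getInt(base + 4) == hash) {
                return table.getLong(base + 8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        OffHeapSlotTable t = new OffHeapSlotTable(8);
        System.out.println(t.put(1, 100L));
        System.out.println(t.put(2, 200L));
        System.out.println(t.put(10, 300L));   // home slot 2 is taken, so it reprobes
        System.out.println(t.find(10));
    }
}
```

With eight slots, hashes 1 and 2 land in their home slots and hash 10 (home slot 2) moves one slot along to slot 3, which is the picture the OffHeap Map slides draw.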

interface StorageEngine<K, V> {
Long writeMapping(K key, V value, int hash, int metadata);
void freeMapping(long encoding, int hash, boolean removal);

V readValue(long encoding);
boolean equalsValue(Object value, long encoding);

K readKey(long encoding, int hashCode);

boolean equalsKey(Object key, long encoding);
}
Storing Key & Values

Options with 64 bits available
• 64 bit combined pointer
• 32 bit key pointer & 32 bit value pointer
• int key directly + 32 bit pointer
• long key directly + 32 bit pointer
• …anything else you like
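As a small illustration of the "32 bit key pointer & 32 bit value pointer" option, the two pointers can be packed into one long with shifts and masks. The class and method names below are hypothetical and exist only for this sketch.

```java
/** Sketch: packing two 32-bit pointers into the 64-bit slot encoding. */
final class PairEncoding {
    /** Key pointer goes in the high 32 bits, value pointer in the low 32 bits. */
    static long pack(int keyPointer, int valuePointer) {
        // The mask stops sign extension of a negative valuePointer from clobbering the key half.
        return ((long) keyPointer << 32) | (valuePointer & 0xFFFFFFFFL);
    }

    static int keyPointer(long encoding) {
        return (int) (encoding >>> 32);
    }

    static int valuePointer(long encoding) {
        return (int) encoding;
    }

    public static void main(String[] args) {
        long encoding = pack(7, -1);
        System.out.println(keyPointer(encoding) + " " + valuePointer(encoding));
    }
}
```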

Pointer to What?
java.util
(Hash)Map
Concurrent(Hash)Map
Java Heap
Garbage
Collector
Class Layout
Logic

A Native ‘Heap’
byte addressable memory (logical address space)
0 max
page page page page
ByteBuffer
.slice()
ByteBuffer
.slice()
ByteBuffer
.slice()
ByteBuffer
.slice()
ByteBuffer.allocateDirect() (physical address space)
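One way to picture the diagram above: a single direct allocation is sliced into fixed-size pages, and a logical address is split into a page index and an offset within that page. The sketch below assumes power-of-two page sizes and values that never straddle a page boundary; PagedMemory and its methods are illustrative names, not the library's API.

```java
import java.nio.ByteBuffer;

/** Sketch: a logical byte-addressable space built from slices of one direct buffer. */
final class PagedMemory {
    private final int pageShift;
    private final int offsetMask;
    private final ByteBuffer[] pages;

    PagedMemory(int pageCount, int pageSize) {
        if (pageSize <= 0 || Integer.bitCount(pageSize) != 1) {
            throw new IllegalArgumentException("pageSize must be a power of two");
        }
        this.pageShift = Integer.numberOfTrailingZeros(pageSize);
        this.offsetMask = pageSize - 1;
        ByteBuffer backing = ByteBuffer.allocateDirect(pageCount * pageSize);
        this.pages = new ByteBuffer[pageCount];
        for (int i = 0; i < pageCount; i++) {
            backing.clear();                       // reset position and limit, contents untouched
            backing.position(i * pageSize);
            backing.limit((i + 1) * pageSize);
            pages[i] = backing.slice();            // a page-sized view sharing the same memory
        }
    }

    /** Index of the page that holds the given logical address. */
    int pageOf(long address) {
        return (int) (address >>> pageShift);
    }

    // Assumption: an int never straddles two pages.
    void writeInt(long address, int value) {
        pages[pageOf(address)].putInt((int) (address & offsetMask), value);
    }

    int readInt(long address) {
        return pages[pageOf(address)].getInt((int) (address & offsetMask));
    }

    public static void main(String[] args) {
        PagedMemory memory = new PagedMemory(4, 1024);
        memory.writeInt(1032, 42);
        System.out.println(memory.pageOf(1032) + " " + memory.readInt(1032));
    }
}
```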

Managing The ‘Heap’
java.util
(Hash)Map
Concurrent(Hash)Map
Java Heap
Garbage
Collector
Class Layout
Logic

A Native Heap Allocator
• malloc/free performed using a Java port of dlmalloc
• http://g.oswego.edu/dl/html/malloc.html
• Works well for our use cases as we do not generally control
or even know the malloc size distribution.

Marshaling
java.util
(Hash)Map
Concurrent(Hash)Map
Java Heap
Garbage
Collector
Class Layout
Logic

“Java Serialization Sucks”
• Serialization is self describing.
• It supports
• object identity
• cycles
• complex versioning
• Pretty heavyweight, especially for short streams…
• …but it’s the default serialization mechanism available in
Ehcache 2.x

“Java Serialization Sucks”
• serialize(new Integer(42))
• results in these 81 bytes:
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 AC ED 00 05 73 72 00 11 6A 61 76 61 2E 6C 61 6E
1 67 2E 49 6E 74 65 67 65 72 12 E2 A0 A4 F7 81 87
2 38 02 00 01 49 00 05 76 61 6C 75 65 78 72 00 10
3 6A 61 76 61 2E 6C 61 6E 67 2E 4E 75 6D 62 65 72
4 86 AC 95 1D 0B 94 E0 8B 02 00 00 78 70 00 00 00
5 2A
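The byte count on this slide is easy to reproduce with the standard java.io classes. The helper below is only a sketch; the class name SerializedSize is made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

/** Sketch: measuring how many bytes default Java serialization produces. */
final class SerializedSize {
    static byte[] serialize(Object object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // The stream carries both class descriptors (Integer and Number) plus the 4-byte value.
        System.out.println(serialize(Integer.valueOf(42)).length);
    }
}
```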

OffHeap’s Serialization Sucks Less?
• serialize(new Integer(42))
• results in 22 bytes
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 AC ED 00 05 73 72 00 00 00 00 78 72 00 00 00 01
1 78 70 00 00 00 2A
2
3
4
5

With some structure
STREAM_MAGIC STREAM_VERSION
TC_OBJECT
TC_CLASSDESC utf(17, java.lang.Integer)
serialVersionUID[12E2A0A4F7818738] SC_SERIALIZABLE
fields=[I:utf(5, value)]
TC_END_BLOCKDATA
TC_CLASSDESC utf(16, java.lang.Number)
serialVersionUID[86AC951D0B94E08B] SC_SERIALIZABLE
fields=[]
TC_END_BLOCKDATA
TC_NULL
0000002A

With some structure
STREAM_MAGIC STREAM_VERSION
TC_OBJECT
TC_CLASSDESC descriptor(0)
TC_END_BLOCKDATA
TC_CLASSDESC descriptor(1)
TC_END_BLOCKDATA
TC_NULL
0000002A

Where did the 59 bytes go?
• How many types are in my map?
• All keys the same type: really common
• All values the same type: fairly common
• Stick those common ObjectStreamClass instances in a look-aside structure
• Map<Integer, ObjectStreamClass> for reading streams
• Map<SerializableDataKey, Integer> for writing streams

class ObjectOutputStream {
protected void writeClassDescriptor(ObjectStreamClass desc);
}
class ObjectInputStream {
protected ObjectStreamClass readClassDescriptor();
}
Serialization is pretty malleable
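The two hooks above are enough to build a look-aside scheme. The following is a simplified sketch of the idea, not the library's implementation: it keys the write-side table by class name rather than by a SerializableDataKey, keeps both tables in memory, and the class name CompactSerialization is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: replacing class descriptors in the stream with small integer indexes. */
final class CompactSerialization {
    // Look-aside tables; a real store would have to keep these alive as long as the data.
    private final Map<String, Integer> indexByClassName = new HashMap<>();
    private final List<ObjectStreamClass> descriptors = new ArrayList<>();

    byte[] serialize(Object object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes) {
            @Override
            protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
                Integer index = indexByClassName.get(desc.getName());
                if (index == null) {
                    index = descriptors.size();
                    descriptors.add(desc);
                    indexByClassName.put(desc.getName(), index);
                }
                writeInt(index);   // 4 bytes instead of the full self-describing descriptor
            }
        }) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data)) {
            @Override
            protected ObjectStreamClass readClassDescriptor() throws IOException {
                return descriptors.get(readInt());
            }
        }) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CompactSerialization compact = new CompactSerialization();
        byte[] bytes = compact.serialize(Integer.valueOf(42));
        System.out.println(bytes.length + " bytes, value " + compact.deserialize(bytes));
    }
}
```

Because the descriptors live outside the stream, a stream written this way is only readable by something holding the same table. That is the trade being made for the smaller encoding.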

Portability
• But if serialization still sucks…
interface Portability<T> {
ByteBuffer encode(T object);
T decode(ByteBuffer buffer);
boolean equals(Object object, ByteBuffer buffer);
}
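For a type as simple as Integer, a hand-written portability needs only four bytes. The sketch below redeclares the interface from the slide locally so that it compiles on its own; IntegerPortability is a hypothetical implementation written for this example.

```java
import java.nio.ByteBuffer;

// Local copy of the interface shown on the slide, so the sketch is self-contained.
interface Portability<T> {
    ByteBuffer encode(T object);
    T decode(ByteBuffer buffer);
    boolean equals(Object object, ByteBuffer buffer);
}

/** Sketch: a fixed-width, 4-byte encoding for Integer. */
final class IntegerPortability implements Portability<Integer> {
    @Override
    public ByteBuffer encode(Integer object) {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        buffer.putInt(0, object.intValue());   // absolute put leaves position at 0, ready to read
        return buffer;
    }

    @Override
    public Integer decode(ByteBuffer buffer) {
        return buffer.getInt(buffer.position());
    }

    @Override
    public boolean equals(Object object, ByteBuffer buffer) {
        // Compares against the encoded form without allocating a decoded Integer.
        return object instanceof Integer
                && ((Integer) object).intValue() == buffer.getInt(buffer.position());
    }

    public static void main(String[] args) {
        IntegerPortability portability = new IntegerPortability();
        ByteBuffer encoded = portability.encode(42);
        System.out.println(encoded.remaining() + " bytes, value " + portability.decode(encoded));
    }
}
```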

Concurrency
java.util
(Hash)Map
Concurrent(Hash)Map
Java Heap
Garbage
Collector
Class Layout
Logic

j.u.c.ConcurrentMap
• What does a concurrent map provide?
• happens-before relationship: “actions in a thread prior to placing an
object into a ConcurrentMap as a key or value happen-before actions
subsequent to the access or removal of that object from the
ConcurrentMap in another thread”
• atomic operations: “…except that the action is performed atomically.”
• What do we want?
• concurrent access (readers and writers)

Happens Before Relationships
• volatile write/read
• but not on offheap memory locations
• synchronized
• needs a heap object
• other j.u.c classes (Lock, Atomic…)
• needs a heap object
• There is no way within the JDK to enforce a happens before
relationship between writes/reads of an offheap location…

No Unsafe please, we’re a library
• Our testing has never shown our offheap implementation to
be a bottleneck in our usages.
• Unnecessary complexity costs $$$
• support
• maintenance
• bugs

Simple solution:
ReadWriteLock
OffHeapMap
offheap memory area
dlmalloc serializer
ConcurrentOffHeapMap

A ‘Concurrent’ Map
✅ happens-before relationship: “actions in a thread prior to
placing an object into a ConcurrentMap as a key or value
happen-before actions subsequent to the access or removal of
that object from the ConcurrentMap in another thread”
✅ atomic operations: “…except that the action is performed
atomically.”
⚠ concurrent access (readers and writers)
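The single-lock approach can be sketched as a thin wrapper. In this illustration a plain HashMap stands in for the single-threaded off-heap map, and LockedMap is a made-up name; the point is only that the heap-resident lock supplies the happens-before edges and the atomicity that the off-heap memory cannot.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch: making a single-threaded map safe to share by guarding it with one ReadWriteLock. */
final class LockedMap<K, V> {
    private final Map<K, V> delegate = new HashMap<>();   // stand-in for the off-heap map
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    V get(K key) {
        lock.readLock().lock();        // many readers may hold this at once
        try {
            return delegate.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    V put(K key, V value) {
        lock.writeLock().lock();       // writers are exclusive
        try {
            return delegate.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    V putIfAbsent(K key, V value) {
        lock.writeLock().lock();       // check and write happen under one lock, so the pair is atomic
        try {
            V existing = delegate.get(key);
            if (existing == null) {
                delegate.put(key, value);
            }
            return existing;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        LockedMap<String, Integer> map = new LockedMap<>();
        map.put("k1", 1);
        System.out.println(map.putIfAbsent("k1", 2) + " " + map.get("k1"));
    }
}
```

Readers proceed in parallel, but every writer blocks the whole map, which is why the concurrent-access line earns a warning rather than a tick.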

Moar Write Concurrency!

ReadWriteLock
OffHeapMap
offheap memory area
dlmalloc serializer
ReadWriteLock
OffHeapMap
offheap memory area
dlmalloc serializer
StripingLogic
put(k1, v)
put(k2, v)
put(k3, v)
k1, v k3, v
k2, v
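Striping can be sketched as an array of independently locked segments with a hash-based router in front. As before, HashMap stands in for the off-heap map inside each segment, and the names StripedMap and Segment as well as the hash-spreading step are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch: spreading keys over several independently locked segments. */
final class StripedMap<K, V> {
    /** One stripe: its own map (stand-in for an off-heap map) and its own lock. */
    private static final class Segment<K, V> {
        final Map<K, V> map = new HashMap<>();
        final ReadWriteLock lock = new ReentrantReadWriteLock();
    }

    private final List<Segment<K, V>> segments = new ArrayList<>();

    StripedMap(int segmentCount) {
        if (segmentCount <= 0 || Integer.bitCount(segmentCount) != 1) {
            throw new IllegalArgumentException("segmentCount must be a power of two");
        }
        for (int i = 0; i < segmentCount; i++) {
            segments.add(new Segment<>());
        }
    }

    /** The striping logic: pick a segment from the key's (spread) hash. */
    int segmentIndex(Object key) {
        int h = key.hashCode();
        h ^= h >>> 16;
        return h & (segments.size() - 1);
    }

    V put(K key, V value) {
        Segment<K, V> segment = segments.get(segmentIndex(key));
        segment.lock.writeLock().lock();    // only this stripe is blocked
        try {
            return segment.map.put(key, value);
        } finally {
            segment.lock.writeLock().unlock();
        }
    }

    V get(K key) {
        Segment<K, V> segment = segments.get(segmentIndex(key));
        segment.lock.readLock().lock();
        try {
            return segment.map.get(key);
        } finally {
            segment.lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        StripedMap<String, String> map = new StripedMap<>(2);
        map.put("k1", "v");
        map.put("k2", "v");
        map.put("k3", "v");
        System.out.println(map.get("k1") + " " + map.get("k2") + " " + map.get("k3"));
    }
}
```

Writers to different segments no longer contend, so write concurrency scales with the segment count, while each segment keeps the simple single-lock guarantees.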

Conclusions
1. Simple engineering is simpler to support and maintain.
2. Going off-heap doesn’t require Unsafe
• (unless ultimate performance is your primary concern)

Additional Topics
• Caching
• Weakly-consistent Iterators
• Cross Segment Eviction
• Page Stealing Algorithms
• Native Heap Compaction
• Map Rehashing (Growing &
Shrinking)
• Off-Memory (SSDs)
• Persistence/Durability
• Entry Level Pinning
• Probably Other Stuff I
Forgot About…

Questions?
(BTW We’re Hiring)
https://github.com/Terracotta-OSS/offheap-store/
