Caching 101: Caching on the JVM (and beyond)

@alexsnaps @ljacomet #Devoxx #cache101
Alex Snaps - Principal Software Engineer at Terracotta
Louis Jacomet - Lead Software Engineer at Terracotta

Agenda
•Theory
•Caching patterns
•Scaling

Introductory Poll
•Who knows nothing about caching?
•Who already uses caching in production?
•Who had caching related problems in production?
•Who is interested in advanced caching patterns?

(Caching) theory
http://blog.spylight.com/sheldons-big-bang-theory-style/

There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton

CPU Caches
https://en.wikipedia.org/wiki/Non-uniform_memory_access

Cache coherence
• P writes A to X, then P reads X => returns A when no other write happened in between (must hold on a single processor too)
• P2 writes A to X, then P1 reads X => returns A if “enough time” has elapsed
• P1 writes A to X, then P2 writes B to X => no one can ever see B then A at X (all processors see the writes to X in the same order)

Why?
http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html

Numbers explained
Operation                          Scaled time   Feels like
L1 cache reference                 1 s           2 heartbeats
Branch mispredict                  3 s           Yawn
L2 cache reference                 4 s           (Longer) Yawn
Mutex lock / unlock                17 s          Coffee preparation
Main memory reference              100 s         Brushing your teeth
Send 2k over commodity network     4.2 s         A song
Compress 1kB with zippy            33 m          Sitcom episode
Read 1MB from memory               2 h           Bad commute
SSD random read                    4 h           Half day at work
Read 1MB from SSD                  2.6 d         Long week-end
Round trip in datacenter           10 d          Two weeks delivery
Packet roundtrip CA to Be          4.8 y         PhD thesis

http://yourstory.com/2015/05/performance-matrix/

Amdahl’s law
•Parallelisation
•P: parallel proportion
•N: processor count
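
In its usual form, with P and N as defined on this slide, Amdahl's law bounds the achievable speedup as follows. The symbol S is introduced here for illustration only:

```latex
S(N) = \frac{1}{(1 - P) + \frac{P}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - P}
```

For example, with P = 0.95 the speedup can never exceed 20, no matter how many processors are added.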

Amdahl’s law (cont’d)
•Sequential
•p: factor by which a part was sped up
•f: fraction of time not improved
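
The sequential form is usually written as speedup = 1 / (f + (1 - f) / p). A minimal sketch of that arithmetic, useful for estimating what a cache can buy before building it; the class and method names are made up for illustration:

```java
/** Sketch: Amdahl's law for a single sped-up part of a sequential program. */
final class Amdahl {
    private Amdahl() {}

    /**
     * @param f fraction of the original run time that is NOT improved (0..1)
     * @param p factor by which the remaining part was sped up (> 0)
     * @return overall speedup of the whole program
     */
    static double speedup(double f, double p) {
        return 1.0 / (f + (1.0 - f) / p);
    }

    public static void main(String[] args) {
        // 80% of a request is a lookup that a cache makes 10x faster: about 3.57x overall
        System.out.println(speedup(0.2, 10));
        // Even an (almost) infinitely fast cache cannot beat 1 / f = 5x
        System.out.println(speedup(0.2, 1_000_000));
    }
}
```

The untouched fraction f dominates quickly, which is one reason to measure where time is actually spent before adding a cache.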

Further reading
• Gustafson’s law states that computations involving arbitrarily large data sets can be efficiently parallelized.
• As computing powers grow, larger problems can be solved in a given time
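
Gustafson's law is usually written as a scaled speedup. A sketch using P for the parallel proportion (here measured on the N-processor run) and N for the processor count; S is a name introduced for illustration:

```latex
S(N) = (1 - P) + P \cdot N = N - (1 - P)(N - 1)
```

With P = 0.95 and N = 100 this gives S = 95.05: the problem grows with the machine, so the sequential part no longer caps the speedup at 1 / (1 - P).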

Where and when to use a cache?
https://www.flickr.com/photos/wwarby/
It depends …
Measure, do not guess!

What is a cache in an application?
• Data structure holding a temporary copy of some data
• Trades higher memory usage for reduced latency
• Targets:
• Data which is reused
• Data which is expensive to compute or retrieve

Desired cache features
• Capacity control / eviction
• Data freshness / expiry
• Data consistency / invalidation
• Fault tolerance
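
A minimal sketch of the first three features (capacity control, expiry, invalidation) using only the JDK. It is illustrative and not thread-safe; `BoundedCache`, `Timed` and the injected clock are hypothetical names, not part of any library:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Illustrative, non-thread-safe cache: LRU eviction plus time-to-live expiry. */
final class BoundedCache<K, V> {
    /** A value together with the time at which it stops being fresh. */
    private static final class Timed<T> {
        final T value;
        final long expiresAtMillis;
        Timed(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final LongSupplier clock; // injected so tests can control time
    private final Map<K, Timed<V>> entries;

    BoundedCache(int maxEntries, long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        // accessOrder = true: iteration order runs from least to most recently used
        this.entries = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                return size() > maxEntries; // capacity control / eviction
            }
        };
    }

    void put(K key, V value) {
        entries.put(key, new Timed<>(value, clock.getAsLong() + ttlMillis));
    }

    V get(K key) {
        Timed<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) { // data freshness / expiry
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    /** Data consistency / invalidation: drop the entry when the source changes. */
    void invalidate(K key) {
        entries.remove(key);
    }

    int size() {
        return entries.size();
    }
}
```

Expiry here is lazy: an expired entry only goes away when it is looked up or evicted, which keeps the sketch short but means memory is not reclaimed eagerly.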

JSR 107
• Java Community Process driven standard
• Specifies API and semantics for temporary, in-memory caching of Java objects, including object creation, shared access, spooling, invalidation, and consistency across JVMs

javax.cache API
• CacheManagers managing … Caches !
• Expiry
• Integration
• Cache Entry Listeners
• Entry Processors
• Caching Annotations
• Some Management

CacheManager
• Caching.getCachingProvider() static method
• CachingProvider.getCacheManager()
• CachingProvider
• CacheManager repository
• Identified by URI & ClassLoader

CacheManager
•CacheManager
• Manages Cache lifecycle:
• Create & Destroy
• Repository of (named) Cache instances
• Manages Cache JMX (stats & management)
• … is .unwrap()-able

Cache aside - miss
[Diagram: Application, Cache, Database]

Cache aside - hit
[Diagram: Application, Cache, Database]

Demo

Cache aside conclusions
• The most common approach out there: Spring, Play, Grails, …
• Most often based on annotations
• Tricky to get the concurrency and / or atomicity right
• Especially when rolling your own
• Does not prevent multiple real invocations for the same key while the cache is warming
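
A minimal cache-aside sketch on top of `ConcurrentHashMap`, contrasting the naive check-then-act lookup with a per-key atomic one. `CacheAside` and the `database` function are hypothetical stand-ins for the application code and the system of record:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Illustrative cache-aside helper; the application owns both cache and database access. */
final class CacheAside<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> database; // hypothetical: the expensive real invocation

    CacheAside(Function<K, V> database) {
        this.database = database;
    }

    /** Naive version: threads that miss at the same time all hit the database. */
    V getNaive(K key) {
        V value = cache.get(key);
        if (value == null) {              // miss
            value = database.apply(key);  // real invocation
            if (value != null) {
                cache.put(key, value);    // populate for the next caller
            }
        }
        return value;
    }

    /** Atomic per key: the mapping function is applied at most once while the key is absent. */
    V getAtomic(K key) {
        return cache.computeIfAbsent(key, database);
    }

    /** After writing to the database, drop the now stale cache entry. */
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

`computeIfAbsent` blocks other updates to the same bin while the function runs, so the loader should be reasonably quick and must not touch the same map.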

Hibernate
boat1 = session.get( Boat.class, id );
boat2 = session.get( Boat.class, id );
boat1 == boat2 ?

Hibernate 2nd level cache
• Stores “dehydrated” entity states
• hibernate.cache.use_structured_entries
• 4 strategies
• ReadOnly
• ReadWrite
• NonStrictReadWrite
• Transactional

Cache through - miss
[Diagram: Application, Cache, Database]

Cache through - hit
[Diagram: Application, Cache, Database]

Cache through - write
[Diagram: Application, Cache, Database]

Cache through conclusions
• Requires different abstraction / modelling
• Viewing the system of record through the cache may not be easy
• Provides better guarantees and consistency as invalidation is no longer required
• Still falls apart if data is modified by other applications / processes
• Cost of writing is paid by thread putting in the cache
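
A minimal read-through / write-through sketch: the application only talks to the cache, which loads and writes the system of record itself. `ThroughCache`, `loader` and `writer` are hypothetical names; real products expose this through their own loader / writer interfaces:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.BiConsumer;
import java.util.function.Function;

/** Illustrative cache-through wrapper around a system of record. */
final class ThroughCache<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // hypothetical: reads one value from the database
    private final BiConsumer<K, V> writer; // hypothetical: writes one value to the database

    ThroughCache(Function<K, V> loader, BiConsumer<K, V> writer) {
        this.loader = loader;
        this.writer = writer;
    }

    /** Read-through: a miss is loaded by the cache itself, once per absent key. */
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    /**
     * Write-through: the database write and the cache update happen under the
     * same per-key atomic operation, on the calling thread (which pays the cost).
     * The value must not be null.
     */
    void put(K key, V value) {
        cache.compute(key, (k, previous) -> {
            writer.accept(k, value); // if this throws, the cache entry is left unchanged
            return value;
        });
    }
}
```

Because every write passes through `put`, no separate invalidation step is needed, as long as nothing else modifies the database behind the cache's back.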

Write behind
[Diagram: Application, Cache, queue, Database]

Write behind - eviction
[Diagram: Application, Cache, queue, Database; entry (K1,V1)]

Write behind - failure
[Diagram: Application, Cache, queue, Database]

Write behind - failure
[Diagram: queue, Database]

Write behind conclusions
• Scales out your writes
• Batching and coalescing
• Persistent queue or not
• Idempotent operations are important in distributed and failure conditions
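
A minimal sketch of write-behind with batching and coalescing, assuming an in-memory (non-persistent) queue. `WriteBehindCache` and `batchWriter` are hypothetical names; in a real system `flush()` would be driven by a background thread or scheduler:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Illustrative write-behind cache: writes are queued and flushed to the database in batches. */
final class WriteBehindCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    // The "queue": keyed so that repeated writes to one key coalesce into the latest value.
    private final Map<K, V> pending = new LinkedHashMap<>();
    private final Consumer<Map<K, V>> batchWriter; // hypothetical: writes a batch to the database

    WriteBehindCache(Consumer<Map<K, V>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    synchronized void put(K key, V value) {
        cache.put(key, value);   // visible to readers immediately
        pending.put(key, value); // database write is deferred
    }

    synchronized V get(K key) {
        return cache.get(key);
    }

    /** Writes everything queued so far as one batch; re-queues the batch if the write fails. */
    void flush() {
        Map<K, V> batch;
        synchronized (this) {
            if (pending.isEmpty()) {
                return;
            }
            batch = new LinkedHashMap<>(pending);
            pending.clear();
        }
        try {
            batchWriter.accept(batch); // outside the lock, so puts are not blocked by the database
        } catch (RuntimeException e) {
            synchronized (this) {
                // Retry later, without overwriting values written in the meantime.
                batch.forEach(pending::putIfAbsent);
            }
            throw e;
        }
    }
}
```

A failed batch may have been partially applied before it is retried, which is why the writer's operations need to be idempotent. With this in-memory queue, pending writes are lost if the process dies.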

Refresh ahead
[Diagram: Application, Cache, Database]

Scheduled refresh
[Diagram: Application, Cache, Database]
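
One possible reading of refresh ahead, as a sketch: an entry older than a threshold is reloaded in the background while callers keep getting the current value. All names here (`RefreshAheadCache`, `refreshAfterMillis`, the injected clock and executor) are assumptions made for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executor;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Illustrative refresh-ahead cache: stale-ish entries are reloaded off the caller's path. */
final class RefreshAheadCache<K, V> {
    private static final class Timed<T> {
        final T value;
        final long loadedAtMillis;
        Timed(T value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final ConcurrentMap<K, Timed<V>> cache = new ConcurrentHashMap<>();
    private final ConcurrentMap<K, Boolean> refreshing = new ConcurrentHashMap<>();
    private final Function<K, V> loader;  // hypothetical: reads from the database
    private final long refreshAfterMillis;
    private final LongSupplier clock;     // injected so tests can control time
    private final Executor executor;      // where background reloads run

    RefreshAheadCache(Function<K, V> loader, long refreshAfterMillis,
                      LongSupplier clock, Executor executor) {
        this.loader = loader;
        this.refreshAfterMillis = refreshAfterMillis;
        this.clock = clock;
        this.executor = executor;
    }

    V get(K key) {
        Timed<V> entry = cache.computeIfAbsent(key, this::load); // first access loads inline
        long age = clock.getAsLong() - entry.loadedAtMillis;
        // Start at most one reload per key at a time.
        if (age >= refreshAfterMillis && refreshing.putIfAbsent(key, Boolean.TRUE) == null) {
            executor.execute(() -> {
                try {
                    cache.put(key, load(key));
                } finally {
                    refreshing.remove(key);
                }
            });
        }
        return entry.value; // the value read before any reload was triggered
    }

    private Timed<V> load(K key) {
        return new Timed<>(loader.apply(key), clock.getAsLong());
    }
}
```

A scheduled refresh could reuse the same reload step, driven by a `ScheduledExecutorService` that walks the known keys at a fixed rate instead of waiting for a read to trigger it.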

Scaling your cache

Moving off heap
• JVMs have issues with large heaps due to garbage collection
• Off-heap is a nice alternative for data which has a well-known lifecycle
• Perfect match for cache data
• Performance hit due to binary representation compared to objects on heap
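
A minimal sketch of what "binary representation" means here, using a JDK direct `ByteBuffer` as the off-heap store. `OffHeapSlot` is a hypothetical single-value holder, nowhere near a real off-heap cache, and only shows the serialize / deserialize cost on each access:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Illustrative holder for one String kept outside the Java heap. */
final class OffHeapSlot {
    // Direct buffers are allocated outside the garbage-collected heap.
    private final ByteBuffer buffer;

    OffHeapSlot(int capacityBytes) {
        this.buffer = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Serializes the value into the buffer as a length prefix followed by UTF-8 bytes. */
    void store(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        buffer.clear();
        buffer.putInt(bytes.length).put(bytes); // throws BufferOverflowException if too large
    }

    /** Deserializes a fresh on-heap copy; this is the cost paid on every read. */
    String load() {
        ByteBuffer view = buffer.duplicate(); // independent position, same memory
        view.position(0);
        byte[] bytes = new byte[view.getInt()];
        view.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    boolean isOffHeap() {
        return buffer.isDirect();
    }
}
```

The garbage collector only sees the small `ByteBuffer` object, not the payload, which is the point of moving large caches off heap.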

Want to know more?
Talk on Thursday at 5.50pm room 10
by Chris Dennis (Terracotta)

Clustering
• Cache shared between multiple machines
• Different topologies
• Peer to peer
• Client server - with or without client cache
• Consistency becomes a much harder problem

Terracotta clustering

Strong consistency
[Diagram: Cache, three Terracotta clients, one Terracotta server]

Eventual consistency
[Diagram: Cache, three Terracotta clients, one Terracotta server]

Probabilistically Bounded Staleness
• How “long” until a write is “readable” by all?
• How many “old versions” are still around?
• Depending on
• Network delay
• Node processing time
• … delayed replication (e.g. batching)
• LinkedIn’s data stores returned consistent data 99.9 percent of the time within 13.6 ms, and on SSDs (solid-state drives) within 1.63 ms.

Conclusion
• You know it all now!
• Don’t be afraid of caching
• Consider it early on
• Topologies!
• Thanks! Questions?
