SF Java presentation: the JVM goes to Big Data.
“Slowly yet surely the JVM is going to Big Data! In this fun filled presentation we see what pieces of Java & JVM triumph or unravel in the battle for performance at high scale!”
Concurrency is the currency of scale on multi-core, and the new generation of collections and non-blocking hashmaps is well worth a deep dive. We take a quick look at next-gen serialization techniques as well as implementation pitfalls around UUID. The Achilles' heel of the JVM remains Garbage Collection: a deep dive into the internals of the memory model, common GC algorithms and their tuning knobs is always a big draw. EC2 & the cloud present us with virtualized & uncharted territory for scaling the JVM.
We will leave some room for Q&A or fill it up with any asynchronous I/O that might queue up during the talk. A round of applause will be due to the various tools that are essential for Java performance debugging.
4. tools of trade
• What the JVM is doing:
– dtrace, hprof, introscope, jconsole, visualvm, yourkit, gchisto, zvision
• Invasive JVM observation tools:
– bci, jvmti, jvmdi/pi agents, logging
• What the OS is doing:
– dtrace, oprofile, vtune, perf
• What the network/disk is doing:
– ganglia, iostat, lsof, nagios, netstat, tcpdump
6. synchronized
under the hood
– Fast path for no-contention thin lock
– Bias threads to lock or bulk revoke bias
– Store-free biasing
11. Nonblocking collections:
Amdahl's > Moore's!
State, Actions – key/value pairs!
get, put, delete, _resize
ByteArray to hold Data
Concurrent writes: using CAS
No locks, no volatile
Much faster than locking under heavy load
Directly reach main data array in 1 step
Resize as needed
Copy Array to a larger Array on demand. Post updates
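The CAS-based write path above can be sketched roughly as follows. This is a deliberately simplified illustration, not the implementation the slide describes: the class name `CasMapSketch` is hypothetical, the table is fixed-size (no `_resize`, no delete), and it uses `AtomicReferenceArray`, which has volatile semantics, whereas the slide's design claims to avoid volatile. Keys and values live side by side in one array, and a key slot is claimed with a single compare-and-set.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical, simplified sketch: fixed capacity, no resize, no delete.
class CasMapSketch<K, V> {
    // Even index holds a key, the following odd index holds its value.
    private final AtomicReferenceArray<Object> slots;
    private final int capacity; // must be a power of two

    CasMapSketch(int capacityPow2) {
        if (capacityPow2 <= 0 || Integer.bitCount(capacityPow2) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.capacity = capacityPow2;
        this.slots = new AtomicReferenceArray<>(2 * capacityPow2);
    }

    private int start(Object key) {
        int h = key.hashCode();
        h ^= (h >>> 16);               // spread the high bits
        return h & (capacity - 1);
    }

    /** Stores value under key (both non-null); returns the previous value or null. */
    @SuppressWarnings("unchecked")
    V put(K key, V value) {
        int idx = start(key);
        for (int probes = 0; probes < capacity; probes++) {
            int k = 2 * idx;
            Object cur = slots.get(k);
            if (cur == null) {
                // Try to claim the empty key slot; on a lost race, re-read the winner.
                cur = slots.compareAndSet(k, null, key) ? key : slots.get(k);
            }
            if (cur.equals(key)) {
                // Keys are never removed, so this slot now belongs to the key for good.
                return (V) slots.getAndSet(k + 1, value); // atomic swap of the value
            }
            idx = (idx + 1) & (capacity - 1);             // linear probing
        }
        throw new IllegalStateException("table full; resizing is omitted in this sketch");
    }

    /** Returns the value stored under key, or null if absent. */
    @SuppressWarnings("unchecked")
    V get(K key) {
        int idx = start(key);
        for (int probes = 0; probes < capacity; probes++) {
            Object cur = slots.get(2 * idx);
            if (cur == null) return null;                 // hit an empty slot: not present
            if (cur.equals(key)) return (V) slots.get(2 * idx + 1);
            idx = (idx + 1) & (capacity - 1);
        }
        return null;
    }
}
```

A reader may briefly see a claimed key whose value is still null; the sketch treats that as "absent", which is one reason a production design needs an explicit state machine per slot.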
12. Death & Taxes: Java Overheads!
• Cost of an 8-char String?
– String object: 8b hdr + 12b fields + 4b ptr
– char[] array: 8b hdr + 4b len + 16b data + 4b pad
A: 56 bytes, or a 7x blowup
• Cost of a 100-entry TreeMap<Double,Double>?
– TreeMap: 48b
– TreeMap$Entry: 40b per entry
– Double key + Double value: 16b + 16b per entry
A: 7248 bytes, or a ~5x blowup
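The arithmetic behind both answers can be reproduced with a few lines of Java. This is only a sketch built from the per-object sizes quoted on the slide (a 32-bit-style layout); the class and method names are hypothetical, and rounding the char[] up to an 8-byte boundary is an assumption that happens to match the slide's 4b pad.

```java
// Hypothetical helper that replays the slide's size arithmetic.
class FootprintSketch {
    /** Bytes for a String of the given length, using the slide's per-part sizes. */
    static int stringBytes(int chars) {
        int stringObject = 8 + 12 + 4;             // hdr + fields + ptr
        int rawArray = 8 + 4 + 2 * chars;          // hdr + len + 2 bytes per char
        int paddedArray = (rawArray + 7) / 8 * 8;  // assumed 8-byte alignment
        return stringObject + paddedArray;
    }

    /** Bytes for a TreeMap<Double,Double> with the given number of entries. */
    static int treeMapBytes(int entries) {
        int perEntry = 40 + 16 + 16;               // TreeMap$Entry + Double key + Double value
        return 48 + entries * perEntry;            // 48b for the TreeMap itself
    }
}
```

`FootprintSketch.stringBytes(8)` gives 56 bytes against 8 bytes of single-byte character data, hence 7x. `FootprintSketch.treeMapBytes(100)` gives 7248 bytes against 100 × 16 = 1600 bytes of raw doubles, roughly 4.5x, which the slide rounds to ~5x.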
15. serializable
java.io.Serializable is S.L.O.W
True to platform
Use “transient”
ObjectStreamField[] (serialPersistentFields)
Avro
Google Protocol Buffers
Externalizable + byte[]
Roll your own
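As an illustration of the "Externalizable + byte[]" option, the sketch below writes a record's fields by hand instead of relying on default reflective serialization. The `Sample` type and its fields are hypothetical; only the `java.io` calls are real API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class ExternalizableSketch {
    /** Hypothetical record type, used only for illustration. */
    public static class Sample implements Externalizable {
        long id;
        byte[] payload = new byte[0];

        public Sample() {}   // Externalizable requires a public no-arg constructor

        Sample(long id, byte[] payload) {
            this.id = id;
            this.payload = payload;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeLong(id);             // fixed-width field
            out.writeInt(payload.length);  // length prefix for the byte[]
            out.write(payload);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            id = in.readLong();
            payload = new byte[in.readInt()];
            in.readFully(payload);         // read exactly payload.length bytes
        }
    }

    /** Serializes a Sample to a byte[] through the Externalizable hooks. */
    static byte[] toBytes(Sample s) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Rebuilds a Sample from bytes produced by toBytes. */
    static Sample fromBytes(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Sample) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The stream still carries a class descriptor, so the saving comes from skipping per-field reflection and metadata; whether that is enough depends on the workload and should be measured.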
17. avro
• Schema
– No per datum overheads
– Optional code gen
• Types are runtime
• Untagged data
• No manually assigned field IDs
Cons:
• Schema mismatches
• Runtime only checks
20. UUID
java.util.UUID is slow
– dominated by sha_transform costs
– Leach-Salz (128-bit)
Turns out that the default PRNG (via SecureRandom)
Uses /dev/urandom for seed initialization
-Djava.security.egd=file:/dev/urandom
– PRNG without file is at least 20%-40% better.
Use TimeUUIDs where possible – much faster
Alternatives: JUG – java.uuid.generator, com.eaio.uuid
~10x faster
http://github.com/cowtowncoder/javauuidgenerator
http://jug.safehaus.org/
http://johannburkard.de/blog/programming/java/JavaUUIDgeneratorscompared.htm
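To make the SecureRandom cost concrete, here is a minimal sketch that builds a version-4 UUID from a non-cryptographic PRNG instead. It is not what JUG or com.eaio.uuid do, and it is not a time-based UUID; the class and method names are hypothetical. The resulting IDs are predictable, so this only suits cases where UUIDs are not used as secrets.

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: random UUIDs without SecureRandom (NOT cryptographically strong).
class FastUuidSketch {
    static UUID fastRandomUuid() {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        long msb = rnd.nextLong();
        long lsb = rnd.nextLong();
        // Set the version nibble to 4 (randomly generated UUID).
        msb = (msb & 0xFFFFFFFFFFFF0FFFL) | 0x0000000000004000L;
        // Set the two variant bits to binary 10 (the Leach-Salz variant).
        lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L;
        return new UUID(msb, lsb);
    }
}
```

`FastUuidSketch.fastRandomUuid()` returns a value for which `version()` is 4 and `variant()` is 2, the same shape `UUID.randomUUID()` produces; any speed-up should be measured rather than assumed.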
44. Gone 0xff the heap?
Issues to consider:
No clear API to deallocate from this region
– See jbellis patch to JNA-179 for FreeableBuffer
Object cleanup relegated to finalization
Single finalizer thread, Bug ID: 4469299
Behind WeakReference processing in JDK 6u21
Workaround:
-XX:MaxDirectMemorySize=<size>
Manually trigger System.gc() to avoid “leak”
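The deallocation issue is easiest to see with a direct buffer, the standard-library way to place data off the heap. The sketch below uses only `java.nio.ByteBuffer`; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical helper showing off-heap allocation through a direct ByteBuffer.
class OffHeapSketch {
    /** Allocates the requested number of bytes outside the Java heap. */
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    /** Writes a long at offset 0 and reads it back from native memory. */
    static long roundTrip(ByteBuffer buf, long value) {
        buf.putLong(0, value);   // absolute put: does not move the buffer position
        return buf.getLong(0);
    }
}
```

Note that `ByteBuffer` has no public method to free the native memory: it is released only after the buffer object becomes unreachable and the collector gets around to it, which is why the slide lists capping direct memory and triggering System.gc() as workarounds.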
46. summary
• The JVM is still the most popular deployment platform for the new languages!
• JVM heartburn around scale!
– Serialization
– UUID
– Object overhead
– Garbage Collection
– Hypervisor