1. Chicago Java User Group Meetup: Lightning Talks
January 14, 2016
Some Garbage Talk…..
Finding a Suitable Garbage Collector for OpenTSDB
Presented by: Jayesh Thakrar
jthakrar@conversantmedia.com
2. What Does Conversant do? (www.conversantmedia.com)
• Uses programmatic advertising for personalized messaging on the
internet across browsers and devices (phones, tablets, etc.)
• Facilitates targeted, measurable audience campaigning for
customers with demonstrable effectiveness
• Links in-store (offline) and online activity of anonymized
individuals and evaluates messaging effectiveness
My Role
Sr. Software and Data Engineer - get to build, play, tinker, tweak and
manage big data toys (data systems and pipelines)
4. [Architecture diagram: Application Services, HAProxy Load Balancer, TSDB Daemon + HBase + Hadoop]
• OpenTSDB = Timeseries datastore
• No caching within TSDB daemons
• 12 OpenTSDB servers
each with TSDB + HBase + Hadoop
• 2.5 years data retention
• Automatic data purge via
HBase column family TTL setting
• Metrics from 1200+
application services
across US and Europe
• 550+ million metric data
points created daily
• 20-30 concurrent users
What, Why and How of OpenTSDB
5. Problem: Long GC pauses in OpenTSDB daemons,
causing user annoyance and frequent long stalls
Java Version : java version "1.7.0_40"
Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
Initial Tuning: Increasing heap from 6 GB to 12 GB in increments of 2 GB
significantly reduced long GC pauses
The improvement was "good enough", but the investigation continued to better understand
how the various GC types interact with OpenTSDB's characteristics.
How It All Began…..
6. All collectors below are "generational", i.e. heap memory has areas for young and old objects
Young generation area = Eden space (new objects since last GC) + Survivor 0 (from) + Survivor 1 (to)
Old generation area = Contains objects that have survived a number of GC cycles
Parallel GC: - Young generation: stop-the-world parallel threads
- Old generation: stop-the-world serial mark-sweep-compact of old gen
- Performs compaction
- ParallelOldGC (-XX:+UseParallelOldGC) for parallel old generation
Concurrent Mark-Sweep (CMS) GC: - Young generation: same as parallel GC
- Old generation: Mix of stop-the-world and concurrent steps
- No compaction and occasional stop-the-world full gc of heap
G1 (Garbage-First) GC: - Young generation: parallel, stop-the-world
- Old generation: Similar to young generation + snapshot-based marking
- Dynamic old and young area sizes, performs compaction
- Better young generation pointer/reference management
- Supposedly better "goal management" (GC pause or throughput)
Tested 3 Garbage Collector Types : Parallel, CMS, G1
7. Deciding Metrics: GC events - count, max/avg time, total time (clock or real time)
Tools Used: jmap -heap <pid>
jstat -gcutil <pid> OR jstat -gccause <pid>
jstat -gcutil -t <pid> <interval duration in ms> <no. of durations>
jconsole?
How to Set Each Collector: Parallel GC = no flag required (or -XX:+UseParallelOldGC)
CMS GC = -XX:+UseConcMarkSweepGC
G1 GC = -XX:+UseG1GC
java -verbose:gc <flag-to-set-specific-GC> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps -XX:+UnlockExperimentalVMOptions -Xloggc:/opt/logs/opentsdb/gc.log
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M
Approach: Run OpenTSDB daemon with each GC type
and examine jmap, jstat and gc log output
Garbage Collector Shootout
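The deciding metrics above (event count, max/avg time, total real time) can be pulled out of a gc log with a few lines of Python. This is a minimal sketch, not the tooling used in the talk: the sample log lines are made up in the style that `-XX:+PrintGCDetails -XX:+PrintGCDateStamps` produces on JDK 7, and `pause_stats` is a hypothetical helper name.

```python
import re

# Illustrative lines in the JDK 7 -XX:+PrintGCDetails style.
# These are invented samples, NOT output from the talk's servers.
SAMPLE_LOG = """\
2016-01-14T10:00:01.100-0600: 1.100: [GC [PSYoungGen: 3145728K->262144K(3670016K)] 5242880K->2359296K(12058624K), 0.0812340 secs] [Times: user=0.61 sys=0.02, real=0.08 secs]
2016-01-14T10:00:09.400-0600: 9.400: [GC [PSYoungGen: 3407872K->294912K(3670016K)] 5505024K->2654208K(12058624K), 0.1198760 secs] [Times: user=0.93 sys=0.03, real=0.12 secs]
2016-01-14T10:03:30.000-0600: 210.000: [Full GC [PSYoungGen: 1024K->0K(3670016K)] [ParOldGen: 8000000K->3500000K(8388608K)] 8001024K->3500000K(12058624K) [PSPermGen: 51200K->51200K(102400K)], 1.2987650 secs] [Times: user=9.80 sys=0.05, real=1.30 secs]
"""

# "real" is the wall-clock time of each logged event.
REAL_TIME = re.compile(r"real=(\d+\.\d+) secs")


def pause_stats(log_text):
    """Return count, max, avg and total of the real= times found in log_text."""
    pauses = [float(m.group(1)) for m in REAL_TIME.finditer(log_text)]
    if not pauses:
        return None
    total = sum(pauses)
    return {
        "count": len(pauses),
        "max": max(pauses),
        "avg": total / len(pauses),
        "total": total,
    }


if __name__ == "__main__":
    print(pause_stats(SAMPLE_LOG))
```

With CMS and G1, concurrent phases also write `[Times: ...]` entries, so their `real=` values are not all stop-the-world pauses; a stricter version would have to filter on the event type.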
8. $ jmap -heap 23181
Attaching to process ID 23181, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.0-b56
using thread-local object allocation.
Parallel GC with 10 thread(s)
…..
Parallel Collector: jmap Output
Heap Usage:
PS Young Generation
Eden Space:
capacity = 3668967424 (3499.0MB)
used = 1571211720 (1498.4242630004883MB)
free = 2097755704 (2000.5757369995117MB)
42.82435733067986% used
From Space:
capacity = 312999936 (298.5MB)
used = 89260032 (85.125MB)
free = 223739904 (213.375MB)
28.517587939698494% used
To Space:
capacity = 296222720 (282.5MB)
used = 0 (0.0MB)
free = 296222720 (282.5MB)
0.0% used
PS Old Generation
capacity = 8589934592 (8192.0MB)
used = 3749497432 (3575.79940032959MB)
free = 4840437160 (4616.20059967041MB)
43.64989502355456% used
9. $ jmap -heap 3061
Attaching to process ID 3061, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.0-b56
using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC
…….
CMS Collector: jmap Output
Heap Usage:
New Generation (Eden + 1 Survivor Space):
capacity = 785186816 (748.8125MB)
used = 269009024 (256.5469970703125MB)
free = 516177792 (492.2655029296875MB)
34.260512086845836% used
Eden Space:
capacity = 697958400 (665.625MB)
used = 181780608 (173.3594970703125MB)
free = 516177792 (492.2655029296875MB)
26.044619278169016% used
From Space:
capacity = 87228416 (83.1875MB)
used = 87228416 (83.1875MB)
free = 0 (0.0MB)
100.0% used
To Space:
capacity = 87228416 (83.1875MB)
used = 0 (0.0MB)
free = 87228416 (83.1875MB)
0.0% used
concurrent mark-sweep generation:
capacity = 12012486656 (11456.0MB)
used = 6924832160 (6604.034576416016MB)
free = 5087654496 (4851.965423583984MB)
57.64694986396662% used
10. $ jmap -heap 13183
Attaching to process ID 13183, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 24.0-b56
using thread-local object allocation.
Garbage-First (G1) GC with 10 thread(s)
…..
G1 Collector: jmap Output
Heap Usage:
G1 Heap:
regions = 3072
capacity = 12884901888 (12288.0MB)
used = 9139494440 (8716.101112365723MB)
free = 3745407448 (3571.8988876342773MB)
70.9318124378721% used
G1 Young Generation:
Eden Space:
regions = 1127
capacity = 7407140864 (7064.0MB)
used = 4726980608 (4508.0MB)
free = 2680160256 (2556.0MB)
63.81653454133635% used
Survivor Space:
regions = 6
capacity = 25165824 (24.0MB)
used = 25165824 (24.0MB)
free = 0 (0.0MB)
100.0% used
G1 Old Generation:
regions = 1049
capacity = 5452595200 (5200.0MB)
used = 4387348008 (4184.101112365723MB)
free = 1065247192 (1015.8988876342773MB)
80.46348293011005% used
11. Parallel Collector
$ jstat -gcutil -t 23181
Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT
40759.5 0.00 2.50 77.07 69.04 50.09 475 44.447 5 1.278 45.725
CMS Collector
$ jstat -gcutil -t 3061
Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT
41819.4 0.00 100.00 63.34 47.13 59.70 2771 133.708 13 4.700 138.407
G1 Collector
$ jstat -gcutil -t 13183
Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT
41762.6 0.00 100.00 15.22 80.44 72.74 396 40.286 0 0.000 40.286
jstat Output
Key Points:
• S0/S1 = Survivor Space 0/1
• E/O = Eden Space / Old Gen Space
• YGC = Young Garbage Collection
• FGC = Full Garbage Collection
• YGCT/FGCT = YGC/FGC Time
• GCT = Cumulative GC Time
• Compare:
• YGC and FGC count
(YGC & FGC)
• Total GC time
(YGCT, FGCT, GCT)
• Avg. YGC and FGC time
(YGCT/YGC and FGCT/FGC)
• Max GC pause time
need to examine gc log output
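The averages listed under "Compare" (YGCT/YGC and FGCT/FGC) can be worked out directly from the three jstat rows on this slide. A minimal sketch, with the rows copied from the output above; `summarize` is a hypothetical helper name.

```python
# Header and rows copied from the jstat -gcutil -t output on this slide.
HEADER = "Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT"
ROWS = {
    "Parallel": "40759.5 0.00 2.50 77.07 69.04 50.09 475 44.447 5 1.278 45.725",
    "CMS": "41819.4 0.00 100.00 63.34 47.13 59.70 2771 133.708 13 4.700 138.407",
    "G1": "41762.6 0.00 100.00 15.22 80.44 72.74 396 40.286 0 0.000 40.286",
}


def summarize(header, row):
    """Return (avg young GC seconds, avg full GC seconds) for one jstat row.

    An average is None when the corresponding event count is zero.
    """
    cols = dict(zip(header.split(), (float(v) for v in row.split())))
    avg_ygc = cols["YGCT"] / cols["YGC"] if cols["YGC"] else None
    avg_fgc = cols["FGCT"] / cols["FGC"] if cols["FGC"] else None
    return avg_ygc, avg_fgc


def fmt_ms(seconds):
    """Format seconds as milliseconds, or 'n/a' when there were no events."""
    return "n/a" if seconds is None else f"{seconds * 1000:.1f} ms"


if __name__ == "__main__":
    for name, row in ROWS.items():
        avg_ygc, avg_fgc = summarize(HEADER, row)
        print(f"{name:8s} avg YGC {fmt_ms(avg_ygc)}, avg FGC {fmt_ms(avg_fgc)}")
```

For these snapshots the average young collection comes to roughly 94 ms (Parallel), 48 ms (CMS) and 102 ms (G1). CMS has the shortest average young pause but by far the most collections and the largest total GC time, which is why the counts and totals need to be compared alongside the averages.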
12. Why is G1GC Better?
• TSDB has a lot of "object churn" due to traffic activity
(see HAProxy stats below)
• Most of the objects are short lived
• In all the collectors, young gen collections are more efficient than old gen collections,
so the more churn data that fits in eden, the faster the GC events
Conclusion
[HAProxy stats charts: Incoming Metrics Traffic, UI Traffic]
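The "fit more churn in eden" argument can be checked against the jmap snapshots shown earlier by computing how much of the heap each collector was giving to eden at that moment. A minimal sketch using the byte counts from those slides; for Parallel and CMS the heap total is assumed to be the sum of eden, both survivor spaces and the old generation, and `eden_share` is a hypothetical helper name.

```python
# Byte counts copied from the three jmap -heap snapshots in this deck.
# Heap totals for Parallel and CMS are assumed to be
# eden + from + to + old generation; G1 reports its total directly.
SNAPSHOTS = {
    "Parallel": {
        "eden": 3668967424,
        "heap": 3668967424 + 312999936 + 296222720 + 8589934592,
    },
    "CMS": {
        "eden": 697958400,
        "heap": 697958400 + 2 * 87228416 + 12012486656,
    },
    "G1": {
        "eden": 7407140864,
        "heap": 12884901888,
    },
}


def eden_share(snapshot):
    """Fraction of the heap that was eden capacity in this snapshot."""
    return snapshot["eden"] / snapshot["heap"]


if __name__ == "__main__":
    for name, snap in SNAPSHOTS.items():
        eden_mb = snap["eden"] / 2**20
        print(f"{name:8s} eden = {eden_mb:7.1f} MB ({eden_share(snap):.1%} of heap)")
```

At these snapshots eden is about 5% of the heap under CMS, 29% under Parallel and 57% under G1. G1 resizes its young generation dynamically, so this is a single sample and not a fixed ratio.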
15. • GC is unavoidable - it’s a fact of life
• Not memory constrained? First make "reasonable" increases to heap size
• Use identical values for max/min heap sizes to reduce memory resizing
• Memory constrained? Focus on sizing heap, new and old generations
• CPU constrained? Focus on reducing total GC time
• Latency sensitivity (gc pauses)? Focus on reducing max/avg GC pause times
• Understand GC causes and the time spent across the different GC activities
• Understand memory churn - "steady-state size" and "live size"
• Good tunables: -XX:MaxGCPauseMillis, -XX:InitiatingHeapOccupancyPercent
What About GC Tuning?
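As a rough illustration of the advice above (identical min/max heap, plus the two tunables), the options can be assembled programmatically. This is a sketch only: `jvm_gc_options` is a hypothetical helper, and the values 200 and 45 are placeholders, not settings recommended in the talk.

```python
def jvm_gc_options(heap_gb, collector_flag, pause_ms=None, ihop_pct=None):
    """Build a list of JVM options with identical -Xms/-Xmx values.

    pause_ms and ihop_pct are optional and only emitted when given.
    """
    opts = [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g", collector_flag]
    if pause_ms is not None:
        opts.append(f"-XX:MaxGCPauseMillis={pause_ms}")
    if ihop_pct is not None:
        opts.append(f"-XX:InitiatingHeapOccupancyPercent={ihop_pct}")
    return opts


if __name__ == "__main__":
    # 12 GB heap as in the talk; 200 and 45 are placeholder values.
    print(" ".join(jvm_gc_options(12, "-XX:+UseG1GC", pause_ms=200, ihop_pct=45)))
```

Any change to these values is best validated with the same jstat and gc log comparison used for the collector shootout.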