
Troubleshooting Java HotSpot VM

When working with Java applications running on the Java HotSpot VM, we sometimes encounter problems such as application hangs, memory leaks, unexpected application behavior, or crashes. Troubleshooting such problems can be hard. Knowing the right tools and utilities for nailing these problems down, and how to approach them, makes troubleshooting much easier and helps us develop stable, reliable, and efficient Java applications. This slide deck covers how to approach these JVM issues and which tools and utilities are useful for diagnosing and troubleshooting them.

Published in: Technology

  1. Troubleshooting Java HotSpot VM
     Poonam Parhar, Consulting Member of Technical Staff, JVM Sustaining Engineering, Oracle
     Sept 22, 2016
     Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
  2. Safe Harbor Statement
     The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  3. Troubleshooting
     • Systematic approach to solving problems
     • Troubleshooting is partly science and partly an art
     • Curiosity, focus, attention to detail, persistence, patience and experience play a great role
     • Good troubleshooting tools are your friends
     • Three simple steps:
       1. Understand the problem/error
       2. Collect the required diagnostic data
       3. Analyze the collected data
  4. Troubleshooting Java HotSpot Virtual Machine
  5. HotSpot JVM issues
     • OutOfMemory errors / Memory leaks
     • Latencies or Pauses
     • JVM Hangs / Stuck threads
     • High CPU usage
     • Application Crashes
     • Java Exceptions
  6. Agenda
     1. Look at the HotSpot JVM issues one by one
     2. How to approach them
     3. Troubleshooting and debugging tools
     4. Best practices for troubleshooting HotSpot JVM
     5. Q&A
  7. Memory Problems
  8. Memory Problems: understand
       Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
           at java.util.Arrays.copyOfRange(Unknown Source)
           at java.lang.String.<init>(Unknown Source)
           at java.io.BufferedReader.readLine(Unknown Source)
           at java.io.BufferedReader.readLine(Unknown Source)
           at org.girs.TopicParser.dump(TopicParser.java:23)
           at org.girs.TopicParser.main(TopicParser.java:59)

       #
       # A fatal error has been detected by the Java Runtime Environment:
       #
       # java.lang.OutOfMemoryError: requested 32756 bytes for ChunkPool::allocate. Out of swap space?
       #
       # Internal Error (allocation.cpp:166), pid=2290, tid=27
       # Error: ChunkPool::allocate
       #
  9. OutOfMemoryError: Java Heap Space
     • Ensure that the footprint is not larger than the maximum specified heap size
     • Look at GC logs to determine the stable footprint
         688995.775: [Full GC [PSYoungGen: 46400K->0K(471552K)] [ParOldGen: 1002121K->304673K(1036288K)] 1048521K->304673K(1507840K) [PSPermGen: 253230K->253230K(1048576K)], 0.3402350 secs] [Times: user=1.48 sys=0.00, real=0.34 secs]
     • Increase the heap size using the -Xmx JVM option
     • The error may indicate that there is a memory leak: watch whether the live set is increasing over time
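
To watch the live set over time, the heap occupancy after each Full GC can be pulled out of the log. The following is a minimal sketch that assumes the pre-JDK 9 Parallel collector log format shown on the slide; the class and method names are illustrative, and other collectors or unified logging (JDK 9+) would need a different pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FullGcFootprint {
    // Matches the whole-heap figure "beforeK->afterK(capacityK)" that follows a
    // closing bracket, i.e. the one NOT prefixed by a label such as "PSYoungGen: ".
    private static final Pattern HEAP_TOTAL =
        Pattern.compile("\\]\\s+(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns the heap occupancy after the collection in KB, or -1 if the line has none. */
    static long liveSetAfterFullGcKb(String logLine) {
        Matcher m = HEAP_TOTAL.matcher(logLine);
        return m.find() ? Long.parseLong(m.group(2)) : -1;
    }

    public static void main(String[] args) {
        String line = "688995.775: [Full GC [PSYoungGen: 46400K->0K(471552K)] "
            + "[ParOldGen: 1002121K->304673K(1036288K)] 1048521K->304673K(1507840K) "
            + "[PSPermGen: 253230K->253230K(1048576K)], 0.3402350 secs]";
        System.out.println(liveSetAfterFullGcKb(line) + " KB occupied after Full GC");
    }
}
```

If this value keeps climbing across successive Full GCs, the live set is growing, which is the pattern the slide associates with a possible leak.
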
  10. Java Memory Leak: collect diagnostic data
      • Java Flight Recorder, with Heap Statistics enabled
      • Heap dumps come to the rescue
        – jcmd <process id/main class> GC.heap_dump filename=Myheapdump
        – jmap -dump:format=b,file=snapshot.jmap pid
        – JConsole utility, using the HotSpotDiagnostic MBean
      • -XX:+HeapDumpOnOutOfMemoryError
      • Heap histogram
        – -XX:+PrintClassHistogram and Control+Break
        – jcmd <process id/main class> GC.class_histogram filename=Myheaphistogram
        – jmap -histo pid
        – jmap -histo <java> core_file
      • Monitor objects pending finalization
        – JConsole
        – jmap -finalizerinfo
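
The HotSpotDiagnostic MBean that JConsole exposes can also be called from code running inside the JVM. This sketch assumes a HotSpot-based JDK; the class name, method name, and temporary file location are illustrative choices, not part of the original slides.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

class HeapDumper {
    /**
     * Writes an HPROF heap dump of the current JVM into a fresh temporary directory.
     * With live=true only reachable objects are dumped.
     */
    static Path dumpToTempDir(boolean live) {
        try {
            Path dir = Files.createTempDirectory("heapdump");
            // The target must not exist yet; recent JDKs also expect the .hprof suffix.
            Path target = dir.resolve("snapshot.hprof");
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(target.toString(), live);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dump = dumpToTempDir(true);
        System.out.println("Wrote " + Files.size(dump) + " bytes to " + dump);
    }
}
```

The resulting file can be opened with the analysis tools listed for heap dumps, such as Eclipse MAT or Java VisualVM.
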
  11. Java Memory Leak: analyze data
      • Flight Recordings with Java Mission Control
      • Heap dumps can be analyzed using
        – jhat
        – Java VisualVM
        – Eclipse MAT
        – JOverflow JMC plugin
  14. OutOfMemoryError: PermGen Space (*removed in JDK8*)
      • Appropriately configure the PermGen size with -XX:PermSize=n and -XX:MaxPermSize=m
      • Make sure that classes are getting unloaded: -XX:+TraceClassUnloading -XX:+TraceClassLoading
      • -Xnoclassgc (disables class garbage collection, so check that it is not set)
      • CMS: enable class unloading with -XX:+CMSClassUnloadingEnabled
      • Monitor using Java Mission Control or JConsole
      • jmap and SA tool to examine PermGen
      • Heap dumps help here too
  15. jmap -permstat
        $ jmap -permstat 29620
        Attaching to process ID 29620, please wait...
        Debugger attached successfully.
        Client compiler detected.
        JVM version is 24.85-b06
        12674 intern Strings occupying 1082616 bytes.
        finding class loader instances .. done.
        computing per loader stat ..done.
        please wait.. computing liveness.........................................done.
        class_loader  classes  bytes    parent_loader  alive?  type
        <bootstrap>   1846     5321080  null           live    <internal>
        0xd0bf3828    0        0        null           live    sun/misc/Launcher$ExtClassLoader@0xd8c98c78
        0xd0d2f370    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0c99280    1        1440     null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0b71d90    0        0        0xd0b5b9c0     live    java/util/ResourceBundle$RBClassLoader@0xd8d042e8
        0xd0d2f4c0    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0b5bf98    1        920      0xd0b5bf38     dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0c99248    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0d2f488    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0b5bf38    6        11832    0xd0b5b9c0     dead    sun/reflect/misc/MethodUtil@0xd8e8e560
        0xd0d2f338    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0d2f418    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0d2f3a8    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0b5b9c0    317      1397448  0xd0bf3828     live    sun/misc/Launcher$AppClassLoader@0xd8cb83d8
        0xd0d2f300    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0d2f3e0    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0ec3968    1        1440     null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0e0a248    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0c99210    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0d2f450    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0d2f4f8    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        0xd0e0a280    1        904      null           dead    sun/reflect/DelegatingClassLoader@0xd8c22f50
        total = 22    2186     6746816  N/A            alive=4, dead=18  N/A
  16. SA Tool for PermGen Inspection
        java -classpath %JAVA_HOME%\lib\sa-jdi.jar sun.jvm.hotspot.tools.PermStat 1367
        Attaching to process ID 1367, please wait...
        Debugger attached successfully.
        Client compiler detected.
        JVM version is 24.0-b56
        10713 intern Strings occupying 802608 bytes.
        finding class loader instances .. done.
        computing per loader stat ..done.
        please wait.. computing liveness..............................................done.
        class_loader  classes  bytes    parent_loader  alive?  type
        <bootstrap>   342      1539808  null           live    <internal>
        0x23f7b398    3        28016    0x23f762e0     live    sun/misc/Launcher$AppClassLoader@0x38a0e9c0
        0x23f762e0    0        0        null           live    sun/misc/Launcher$ExtClassLoader@0x389eb420
        total = 3     345      1567824  N/A            alive=3, dead=0  N/A
  18. Metaspace (since JDK 8)
      • Java class metadata is stored in Metaspace
      1. OutOfMemoryError: Metaspace
         • Increase the metaspace size with MaxMetaspaceSize
      2. java.lang.OutOfMemoryError: Compressed class space
         • CompressedClassSpaceSize
         • Default size is 1G
      Diagnostic data and tools
      • jmap -clstats
      • jcmd GC.class_stats
      • -XX:+PrintGCDetails
      • NMT, JConsole, VisualVM
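
The same figures that JConsole and VisualVM display can be read in-process through the standard memory pool MXBeans. This is a small sketch that assumes a HotSpot JVM on JDK 8 or later, where the pools are named "Metaspace" and "Compressed Class Space"; the class and method names are illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

class MetaspaceUsage {
    /** Returns used bytes for each memory pool that holds class metadata. */
    static Map<String, Long> classMetadataPools() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.contains("Metaspace") || name.contains("Compressed Class Space")) {
                MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
                if (usage != null) {
                    used.put(name, usage.getUsed());
                }
            }
        }
        return used;
    }

    public static void main(String[] args) {
        classMetadataPools().forEach(
            (name, bytes) -> System.out.println(name + ": " + bytes / 1024 + " KB used"));
    }
}
```

Sampling these values periodically shows whether class metadata keeps growing, which is worth checking before raising MaxMetaspaceSize or CompressedClassSpaceSize.
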
  19. OutOfMemoryError: Native heap
      • 32-bit JVM: running out of address space
      • Strange when you get this error with a 64-bit JVM
      • Failure with CompressedOops
        – Unscaled Compressed Oops: Java heap placed below 4GB address space
        – Zero Based Compressed Oops: Java heap placed below 32GB address space
      • -XX:HeapBaseMinAddress=n
      • https://blogs.oracle.com/poonam/entry/running_on_a_64bit_platform
  20.
        #
        # There is insufficient memory for the Java Runtime Environment to continue.
        # Native memory allocation (malloc) failed to allocate 5408 bytes for CodeCache: no room for vtable chunks
        # Possible reasons:
        #   The system is out of physical RAM or swap space
        #   In 32 bit mode, the process size limit was hit
        # Possible solutions:
        #   Reduce memory load on the system
        #   Increase physical memory or swap space
        #   Check if swap backing store is full
        #   Use 64 bit Java on a 64 bit OS
        #   Decrease Java heap size (-Xmx/-Xms)
        #   Decrease number of Java threads
        #   Decrease Java thread stack sizes (-Xss)
        #   Set larger code cache with -XX:ReservedCodeCacheSize=
        # This output file may be truncated or incomplete.
        #
        # Out of Memory Error (vtableStubs.cpp:63), pid=23796, tid=246
        # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode solaris-sparc compressed oops)
        #
        Current thread (0x000000010dccb130): JavaThread "[ACTIVE] ExecuteThread: '190' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon [_thread_in_vm, id=246, stack(0xffffffff47e00000,0xffffffff47e80000)]
        Stack: [0xffffffff47e00000,0xffffffff47e80000], sp=0xffffffff47e74820, free space=466k
        Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
        V [libjvm.so+0xcc7b44] void VMError::report_and_die()+0x774
        V [libjvm.so+0x62fc58] void report_vm_out_of_memory(const char*,int,unsigned long,const char*)+0x88
        V [libjvm.so+0xcd72f8] void*VtableStub::operator new(unsigned long,int)+0xb8
        V [libjvm.so+0xcd8844] VtableStub*VtableStubs::create_itable_stub(int)+0x74
        V [libjvm.so+0xcd764c] unsigned char*VtableStubs::create_stub(bool,int,methodOopDesc*)+0x124
        V [libjvm.so+0x2fbc90] void CompiledIC::set_to_megamorphic(CallInfo*,Bytecodes::Code,Thread*)+0x50
        V [libjvm.so+0x2fe4d4] methodHandle SharedRuntime::handle_ic_miss_helper(JavaThread*,Thread*)+0x3f4
        V [libjvm.so+0xbb98e8] unsigned char*SharedRuntime::handle_wrong_method_ic_miss(JavaThread*)+0x30
        ………
  21.
        hsdb> universe
        Heap Parameters:
        ParallelScavengeHeap [
          PSYoungGen [
            eden = [0x0000000700000000,0x00000007007ae238,0x000000070c000000] ,
            from = [0x000000070e000000,0x000000070e000000,0x0000000710000000] ,
            to   = [0x000000070c000000,0x000000070c000000,0x000000070e000000] ]
          PSOldGen [ [0x0000000500400000,0x0000000500400000,0x0000000520200000] ]
          PSPermGen [ [0x00000004fb200000,0x00000004fb483380,0x00000004fc800000] ]

        bash-4.1$ pmap 6049
        0000000000400000       4K r-x--  /java/bin/amd64/java
        0000000000410000       4K rw---  /java/bin/amd64/java
        0000000000411000    2288K rw---  [ heap ]
        00000004FB200000   22528K rw---  [ anon ]   <-- Java Heap starts here
        0000000500400000  522240K rw---  [ anon ]
        0000000700000000  262144K rw---  [ anon ]
  22. Improved native memory error message
        #
        # A fatal error has been detected by the Java Runtime Environment:
        # Possible reasons:
        #   The system is out of physical RAM or swap space
        #   The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
        # Possible solutions:
        #   Reduce memory load on the system
        #   Increase physical memory or swap space
        #   Check if swap backing store is full
        #   Decrease Java heap size (-Xmx/-Xms)
        #   Decrease number of Java threads
        #   Decrease Java thread stack sizes (-Xss)
        #   Set larger code cache with -XX:ReservedCodeCacheSize=
        #   JVM is running with Unscaled Compressed Oops mode in which the Java heap is
        #     placed in the first 4GB address space. The Java Heap base address is the
        #     maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
        #     to set the Java Heap base and to place the Java Heap above 4GB virtual address
        # This output file may be truncated or incomplete.
      Error message improved: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/2f77f01444dd
  23. Native Heap Leak: tools
      • Native Memory Tracker (NMT)
        – -XX:NativeMemoryTracking=off|summary|detail
        – -XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics
        – jcmd <pid> VM.native_memory
      • Platform-specific debugging tools
        – dbx, libumem, valgrind, purify
  24.
        d:\tests>jcmd 90172 VM.native_memory
        90172:
        Native Memory Tracking:
        Total: reserved=3431296KB, committed=2132244KB
        - Java Heap (reserved=2017280KB, committed=2017280KB)
                    (mmap: reserved=2017280KB, committed=2017280KB)
        - Class (reserved=1062088KB, committed=10184KB)
                    (classes #411)
                    (malloc=5320KB #190)
                    (mmap: reserved=1056768KB, committed=4864KB)
        - Thread (reserved=15423KB, committed=15423KB)
                    (thread #16)
                    (stack: reserved=15360KB, committed=15360KB)
                    (malloc=45KB #81)
                    (arena=18KB #30)
        - Code (reserved=249658KB, committed=2594KB)
                    (malloc=58KB #348)
                    (mmap: reserved=249600KB, committed=2536KB)
        - GC (reserved=79628KB, committed=79544KB)
                    (malloc=5772KB #118)
                    (mmap: reserved=73856KB, committed=73772KB)
        - Compiler (reserved=138KB, committed=138KB)
                    (malloc=8KB #41)
                    (arena=131KB #3)
        - Internal (reserved=5380KB, committed=5380KB)
                    (malloc=5316KB #1357)
                    (mmap: reserved=64KB, committed=64KB)
        - Symbol (reserved=1367KB, committed=1367KB)
                    (malloc=911KB #112)
                    (arena=456KB #1)
        - Native Memory Tracking (reserved=118KB, committed=118KB)
                    (malloc=66KB #1040)
                    (tracking overhead=52KB)
        - Arena Chunk (reserved=217KB, committed=217KB)
                    (malloc=217KB)
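
When hunting a native leak it helps to compare the committed size per category between two NMT snapshots. The sketch below extracts those numbers from text in the summary format shown above; the class and method names are illustrative, and the pattern is an assumption based only on this sample output, so other JDK versions may print the summary differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NmtSummary {
    // Matches category headers of the form "- <name> (reserved=<n>KB, committed=<n>KB)".
    private static final Pattern CATEGORY = Pattern.compile(
        "-\\s+([A-Za-z][A-Za-z ]*?)\\s+\\(reserved=(\\d+)KB, committed=(\\d+)KB\\)");

    /** Maps each NMT category name to its committed size in KB, in order of appearance. */
    static Map<String, Long> committedKbByCategory(String nmtOutput) {
        Map<String, Long> result = new LinkedHashMap<>();
        Matcher m = CATEGORY.matcher(nmtOutput);
        while (m.find()) {
            result.put(m.group(1), Long.parseLong(m.group(3)));
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "Total: reserved=3431296KB, committed=2132244KB\n"
            + "- Java Heap (reserved=2017280KB, committed=2017280KB)\n"
            + "            (mmap: reserved=2017280KB, committed=2017280KB)\n"
            + "- Class (reserved=1062088KB, committed=10184KB)\n";
        System.out.println(committedKbByCategory(sample));
    }
}
```

Subtracting two such maps taken some time apart points at the category that is growing.
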
  25. Latencies
  26. Latencies: understand causes
      • Long GC pauses
      • Safepoint operations
        – Stop-the-world pauses to execute VM operations for its internal maintenance
        – e.g. Deoptimization, RevokeBias, JFRCheckpoint
      • Other JVM tasks
        – e.g. CodeCache cleaning
      • Debug operations, e.g. dumping stack traces
      • I/O
  27. Latencies: collect diagnostic data
      • Java Flight Recording: events taking a long time
      • Collect GC logs
      • Collect stack traces including native frames
      • -XX:+PrintGCApplicationStoppedTime
      • -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
  28. JFR - events with long durations
  29. Safepoints
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
          2.743: RedefineClasses       [  6 0 0 ] [ 0 0 0 0 4 ] 0
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
          2.815: JFRCheckpoint         [  7 0 0 ] [ 0 0 0 0 0 ] 0
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
          2.816: PrintThreads          [  7 0 0 ] [ 0 0 0 0 0 ] 0
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
          2.816: PrintJNI              [  7 0 0 ] [ 0 0 0 0 0 ] 0
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
          2.817: FindDeadlocks         [  7 0 0 ] [ 0 0 0 0 0 ] 0
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
          3.817: no vm operation       [ 11 0 1 ] [ 0 5 5 0 0 ] 0
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
          7.174: EnableBiasedLocking   [ 11 0 0 ] [ 0 0 0 0 0 ] 0
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
        102.319: PrintThreads          [ 11 0 2 ] [ 0 0 0 0 0 ] 0
                 vmop                  [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
        102.322: PrintJNI              [ 11 0 0 ] [ 0 0 0 0 0 ] 0
  30. GC Pauses: how to approach
      • Collect GC logs with -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps and -XX:+PrintGCApplicationStoppedTime
      • Use GCHisto, VisualGC, gclogviewer or other GC log analysis tools
      • Monitor the overall health of the system using OS tools such as vmstat, iostat, netstat and mpstat on Solaris and Linux, and tools such as Process Monitor and Task Manager on Windows
      • With the CMS collector, add the option -XX:PrintFLSStatistics=2
      • Inspect GC logs for any signs of fragmentation
      • Check whether the specified heap size is enough to contain the footprint of the application
  31. GC Pauses: insufficient memory spaces
        166687.013: [Full GC [PSYoungGen: 126501K->0K(922048K)] [PSOldGen: 2063794K->1598637K(2097152K)] 2190295K->1598637K(3019200K) [PSPermGen: 165840K->164249K(166016K)], 6.8204928 secs] [Times: user=6.80 sys=0.02, real=6.81 secs]
        166699.015: [Full GC [PSYoungGen: 125518K->0K(922048K)] [PSOldGen: 1763798K->1583621K(2097152K)] 1889316K->1583621K(3019200K) [PSPermGen: 165868K->164849K(166016K)], 4.8204928 secs] [Times: user=4.80 sys=0.02, real=4.81 secs]
  33. GC log
        (to-space exhausted), 2.8504662 secs]
           [Parallel Time: 2778.5 ms, GC Workers: 16]
              [GC Worker Start (ms): Min: 122158804.8, Avg: 122158805.1, Max: 122158805.3, Diff: 0.5]
              [Ext Root Scanning (ms): Min: 869.1, Avg: 896.0, Max: 952.5, Diff: 83.4, Sum: 14335.3]
              [Update RS (ms): Min: 18.4, Avg: 27.0, Max: 34.6, Diff: 16.2, Sum: 431.5]
                 [Processed Buffers: Min: 18, Avg: 33.0, Max: 48, Diff: 30, Sum: 528]
              [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.3]
              [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
              [Object Copy (ms): Min: 1805.5, Avg: 1854.5, Max: 1878.0, Diff: 72.5, Sum: 29671.2]
              [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 3.0]
              [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.3, Sum: 2.8]
              [GC Worker Total (ms): Min: 2777.4, Avg: 2777.8, Max: 2778.0, Diff: 0.6, Sum: 44444.1]
              [GC Worker End (ms): Min: 122161582.7, Avg: 122161582.8, Max: 122161583.0, Diff: 0.3]
           [Code Root Fixup: 8.4 ms]
           [Code Root Migration: 0.0 ms]
           [Clear CT: 0.5 ms]
           [Other: 63.0 ms]
              [Choose CSet: 0.0 ms]
              [Ref Proc: 1.1 ms]
              [Ref Enq: 0.0 ms]
              [Free CSet: 0.4 ms]
           [Eden: 64.0M(792.0M)->0.0B(920.0M) Survivors: 128.0M->0.0B Heap: 18.1G(18.1G)->18.1G(18.1G)]
         [Times: user=25.31 sys=1.01, real=2.85 secs]
  34. GC Pauses: explicit GCs
        164638.058: [Full GC (System) [PSYoungGen: 22789K->0K(992448K)] [PSOldGen: 1645508K->1666990K(2097152K)] 1668298K->1666990K(3089600K) [PSPermGen: 164914K->164914K(166720K)], 5.7499132 secs] [Times: user=5.69 sys=0.06, real=5.75 secs]
      • -Dsun.rmi.dgc.server.gcInterval=n -Dsun.rmi.dgc.client.gcInterval=n
      • kill -3 with -XX:+PrintClassHistogram
      • -XX:+DisableExplicitGC
  35. GC Pauses - High ‘sys’ time
        2013-07-17T03:58:06.601-0700: 51522.120: [GC: 2696384K->449344K(2696384K), 29.4779282 secs] 4557003K->2326821K(12133568K) ,29.4795222 secs] [Times: user=1.56 sys=26.35, real=29.48 secs]
      Corresponding 'vmstat' output at 03:58:
                        kthr   memory            page                      disk         faults             cpu
                        r b w  swap     free     re  mf   pi po fr de sr   s0 s1 s2 s3  in    sy    cs     us sy id
        20130717_035806 0 0 0  77611960 94847600 55  266  0  0  0  0  0    0  0  0  0   3041  2644  2431   44 8  48
        20130717_035815 0 0 0  76968296 94828816 79  324  0  18 18 0  0    0  0  1  0   3009  3642  2519   59 13 28
        20130717_035831 1 0 0  77316456 94816000 389 2848 0  7  7  0  0    0  0  2  0   40062 78231 61451  42 6  53
  36. GC Pauses - low ‘user’, low ‘sys’ time
      • GC threads stuck waiting for kernel I/O
      • File system flushes
      • FS flushing may be invoked due to log file rotation
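
The two pause patterns above can be spotted mechanically from the `[Times: ...]` part of each GC log entry. The following sketch assumes the log format shown on these slides; the class name, method names, and the `factor` threshold are illustrative assumptions rather than established rules.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcTimes {
    private static final Pattern TIMES = Pattern.compile(
        "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    final double user;
    final double sys;
    final double real;

    private GcTimes(double user, double sys, double real) {
        this.user = user;
        this.sys = sys;
        this.real = real;
    }

    /** Parses the first "[Times: ...]" group in the line, or returns null if there is none. */
    static GcTimes parse(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new GcTimes(Double.parseDouble(m.group(1)),
                           Double.parseDouble(m.group(2)),
                           Double.parseDouble(m.group(3)));
    }

    /** Kernel time exceeds user time: the "high sys" pattern. */
    boolean highSys() {
        return sys > user;
    }

    /** Wall-clock time far exceeds CPU time: the GC threads were mostly waiting. */
    boolean mostlyWaiting(double factor) {
        return real > factor * (user + sys);
    }

    public static void main(String[] args) {
        GcTimes t = parse("[Times: user=1.56 sys=26.35, real=29.48 secs]");
        System.out.println("highSys=" + t.highSys() + " mostlyWaiting=" + t.mostlyWaiting(2.0));
    }
}
```

A pause flagged by `highSys()` is a cue to look at OS-level data such as vmstat, while one flagged by `mostlyWaiting` suggests checking I/O activity around the time of the collection.
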
  37. Hung Processes or Stuck Threads
  38. Hung Processes or Stuck Threads
      • Check for a Java-level deadlock
        – Ctrl+\ or Ctrl+Break
      • Stack trace with jstack
        – jstack -m to get the native frames
        – jstack -l to get concurrent locks information
        – jstack -F
      • Check what the VMThread is doing
        – SafepointSynchronize::begin
        – Performing a VM operation
      • Network or I/O operations
      • Collect a core file to investigate threads stuck in native code (JNI or JVM)
      • Threads may appear to be stuck in a particular Java method while that method is actually being invoked repeatedly from a loop
      • A thread may appear to be stuck in a Java method but be doing something else at the native level, e.g. threads stuck looking for space in the CodeCache
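
The Java-level deadlock check that a thread dump performs is also available programmatically through the standard `ThreadMXBean`. A minimal sketch, with an illustrative class name, that could run in a watchdog thread or be invoked over JMX:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class DeadlockCheck {
    /**
     * Returns details of threads deadlocked on monitors or ownable synchronizers
     * (for example ReentrantLock); the array is empty when there is no deadlock.
     */
    static ThreadInfo[] findDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when nothing is deadlocked
        if (ids == null) {
            return new ThreadInfo[0];
        }
        // Include locked monitors and locked synchronizers, similar to "jstack -l".
        return threads.getThreadInfo(ids, true, true);
    }

    public static void main(String[] args) {
        ThreadInfo[] deadlocked = findDeadlocks();
        System.out.println(deadlocked.length + " deadlocked thread(s)");
        for (ThreadInfo info : deadlocked) {
            System.out.print(info);
        }
    }
}
```

This only detects Java-level lock cycles; threads stuck in native code or in a busy loop still need the jstack and core file techniques listed above.
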
  39. Example: native stack trace
        ----------------- 11628 -----------------
        0x0000003fe900af59 ????????
        0x00002ac4ae68f258 _ZN7Monitor5ILockEP6Thread + 0x1b8
        0x00002ac4ae68f48f _ZN7Monitor28lock_without_safepoint_checkEv + 0x2f
        0x00002ac4ae2ce713 _ZN9CodeCache18largest_free_blockEv + 0x33
        0x00002ac4ae2f97a5 _ZN13CompileBroker14compile_methodE12methodHandleiiS0_iPKcP6Thread + 0x3b5
        0x00002ac4ae1443cd _ZN23AdvancedThresholdPolicy14submit_compileE12methodHandlei9CompLevelP10JavaThread + 0x8d
        0x00002ac4ae77fab0 _ZN21SimpleThresholdPolicy5eventE12methodHandleS0_ii9CompLevelP7nmethodP10JavaThread + 0x250
        0x00002ac4ae4a39d7 _ZN18InterpreterRuntime32frequency_counter_overflow_innerEP10JavaThreadPh + 0x157
        0x00002ac4ae4a79d6 _ZN18InterpreterRuntime26frequency_counter_overflowEP10JavaThreadPh + 0x16
        0x00002aaaab1a347a * javax.management.MBeanInfo.getClassName() bci:0 line:289 (Interpreted frame)
  40. Concurrent Locks
        java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for <0x00000006e95c8088> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
            at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
            at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
            at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
            at oracle.jbo.mom.DefinitionContext.lock(DefinitionContext.java:550)
            at oracle.jbo.mom.DefinitionManager.getSiteLock(DefinitionManager.java:4710)
            at oracle.jbo.mom.DefinitionManager.lockDefinitionContext(DefinitionManager.java:4667)
            at oracle.jbo.mom.DefinitionContextAgeable.cleanNullRefs(DefinitionContextAgeable.java:253)
  41. Concurrent locks information: jstack -l
        Thread 9409: (state = IN_NATIVE)
         - java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0
         - java.net.SocketInputStream.read(byte[], int, int, int) @bci=87, line=152 (Compiled frame)
         - java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=122 (Compiled frame)
         - oracle.net.nt.MetricsEnabledInputStream.read(byte[], int, int) @bci=38, line=730 (Compiled frame)
         - oracle.net.ns.Packet.receive() @bci=180, line=302 (Compiled frame)
         - oracle.net.ns.DataPacket.receive() @bci=1, line=108 (Compiled frame)
         - oracle.net.ns.NetInputStream.getNextPacket() @bci=48, line=325 (Compiled frame)
         - oracle.net.ns.NetInputStream.read(byte[], int, int) @bci=33, line=269 (Compiled frame)
         - oracle.net.ns.NetInputStream.read(byte[]) @bci=5, line=191 (Compiled frame)
         ………
         - weblogic.timers.internal.TimerImpl.run() @bci=91, line=284 (Compiled frame)
         - weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run() @bci=4, line=550 (Compiled frame)
         - weblogic.work.ExecuteThread.execute(java.lang.Runnable) @bci=34, line=263 (Compiled frame)
         - weblogic.work.ExecuteThread.run() @bci=42, line=221 (Interpreted frame)

        Locked ownable synchronizers:
         - <0x00000006e95c8088>, (a java/util/concurrent/locks/ReentrantLock$NonfairSync)
  42. High CPU Usage
  43. High CPU Usage
      • Java Flight Recordings
      • Find the thread that is responsible for consuming CPU cycles
        – prstat -L
        – ps, top (Shift+H)
      • Get the thread ID from the above output
      • Get a stack trace with jstack
      • Find the stack trace of the thread consuming CPU
  44. prstat -L
        bash-3.2$ prstat -L
          PID USERNAME  SIZE   RSS   STATE  PRI NICE  TIME      CPU   PROCESS/LWPID
          704 pobajaj   505M   36M   cpu18    0    0  0:01:50   3.1%  java/48
        21592 root      71M    42M   sleep   60    0  16:26:30  1.7%  obndmpd/1
         9990 root      2016K  1672K sleep   60    0  0:32:23   0.5%  find/1
        21590 root      45M    15M   sleep   59    0  0:38:41   0.0%  obndmpd/1
          725 pobajaj   4264K  3912K cpu5    39    0  0:00:00   0.0%  prstat/1
          704 pobajaj   505M   36M   sleep   45    0  0:00:00   0.0%  java/46
  45. jstack output
        bash-3.2$ jstack 704
        2016-08-14 07:43:15
        Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode):

        "Attach Listener" #24 daemon prio=9 os_prio=64 tid=0x0000000100428000 nid=0x31 waiting on condition [0x0000000000000000]
           java.lang.Thread.State: RUNNABLE

        "DestroyJavaVM" #23 prio=5 os_prio=64 tid=0x000000010010f000 nid=0x2 waiting on condition [0x0000000000000000]
           java.lang.Thread.State: RUNNABLE

        "Thread2" #22 prio=5 os_prio=64 tid=0x000000010045c000 nid=0x30 runnable [0xfffffffe52aff000]
           java.lang.Thread.State: RUNNABLE
                at Thread2.run(TestCPU.java:25)
                at java.lang.Thread.run(Thread.java:745)

        "Service Thread" #20 daemon prio=9 os_prio=64 tid=0x0000000100410000 nid=0x2d runnable [0x0000000000000000]
           java.lang.Thread.State: RUNNABLE
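
The link between the two outputs is the thread ID: prstat reports the busy thread as decimal LWP 48 (`java/48`), while jstack prints the native thread ID in hexadecimal, so the matching entry is the one with `nid=0x30` ("Thread2"). A small sketch of that conversion follows; the class and method names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LwpToNid {
    private static final Pattern NID = Pattern.compile("nid=(0x[0-9a-fA-F]+)");

    /** Converts a decimal LWP/thread ID from prstat or top into jstack's nid notation. */
    static String toNid(long lwpId) {
        return "0x" + Long.toHexString(lwpId);
    }

    /** Returns true if a jstack thread header line belongs to the given LWP/thread ID. */
    static boolean headerMatches(String threadHeader, long lwpId) {
        Matcher m = NID.matcher(threadHeader);
        return m.find() && m.group(1).equalsIgnoreCase(toNid(lwpId));
    }

    public static void main(String[] args) {
        String header = "\"Thread2\" #22 prio=5 os_prio=64 tid=0x000000010045c000 "
            + "nid=0x30 runnable [0xfffffffe52aff000]";
        System.out.println(toNid(48) + " matches Thread2: " + headerMatches(header, 48));
    }
}
```
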
  46. Application Crashes
  47. Application Crashes: understand cause
      • Crash could be in:
        – Native code
        – Compiled code or HotSpot compiler thread
        – VM thread
        – GC threads
      • First thing to look at is the hs_err log file
        #
        # A fatal error has been detected by the Java Runtime Environment:
        #
        # SIGSEGV (0xb) at pc=0x00007f438b2c86ad, pid=8359, tid=139928047384320
        #
        # JRE version: Java(TM) SE Runtime Environment (8.0_40-b19) (build 1.8.0_40-ea-b19)
        # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.40-b23 mixed mode linux-amd64 compressed oops)
        # Problematic frame:
        # V [libjvm.so+0x5b86ad] G1ParScanThreadState::copy_to_survivor_space(oopDesc*)+0x3d
        #

        Current thread (0x00036960): JavaThread "main" [_thread_in_vm, id=2896]
        Stack: [0x00040000,0x00080000), sp=0x0007f9f8, free space=254k
        Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
        V [jvm.dll+0x83d77]
        C [TestApp.dll+0x1047]
        j Test.methodA()V+0
        j Test.main([Ljava/lang/String;)V+0
        v ~StubRoutines::call_stub
        V [jvm.dll+0x80f13]
  48. Crashes: collect diagnostic data
      • hs_err log file
      • Core files
      • Java Flight Recording (DumpJFR)
      • GC logs
      • -XX:+VerifyBeforeGC -XX:+VerifyAfterGC -XX:+VerifyDuringGC -XX:VerifySubset="subset string"
      • Transported core files
        – Need to get system libraries from the system where the crash happened
        – Tell native debuggers where to look for those libs
      • -XX:OnError="command", e.g. -XX:OnError="jcmd %p JFR.dump recording=1"
  49. Crashes: options and tools
      • -Xcheck:jni
      • Remove -Xverify:none or -noverify
      • Java Mission Control
      • Serviceability Agent
      • Native debugging tools: dbx, gdb, windbg etc.
      • If the crash is in the JVM or a JRE library, submit a bug report
  50. HotSpot Serviceability Agent
      • Debugger for HotSpot JVM
      • Platform-independent tool, can attach to live Java processes and core files
      • Understands the Java objects and VM structures
      • HSDB, CLHSDB and some other useful utilities
      • bin/jhsdb coming up in JDK 9
      • java -cp $JAVA_HOME/lib/sa-jdi.jar sun.jvm.hotspot.CLHSDB
        hsdb> inspect 0x787696d68
        instance of Oop for sun/reflect/GeneratedSerializationConstructorAccessor924844 @ 0x0000000787696d68 @ 0x0000000787696d68 (size = 16)
        _mark: 5
        _metadata._compressed_klass: InstanceKlass for sun/reflect/GeneratedSerializationConstructorAccessor924844
        hsdb> mem 0x787696d68 10
        0x0000000787696d68: 0x0000000000000005
        0x0000000787696d70: 0x00000000f8d15005
        hsdb> universe
        Heap Parameters:
        ParallelScavengeHeap [
          PSYoungGen [
            eden = [0x0000000780000000,0x00000007a29ec538,0x00000007a9500000] ,
            from = [0x00000007a9500000,0x00000007a9500000,0x00000007b4a80000] ,
            to   = [0x00000007b4a80000,0x00000007b4a80000,0x00000007c0000000] ]
          PSOldGen [ [0x0000000700000000,0x000000077ffb2ae8,0x0000000780000000] ]
        ]
  51.
        c:\Java\jdk-9\bin>jhsdb
            clhsdb   command line debugger
            hsdb     ui debugger
            jstack --help   to get more information
            jmap --help     to get more information
            jinfo --help    to get more information
            jsnap --help    to get more information

        c:\Java\jdk-9\bin>jhsdb jmap --help
            <no option>      to print same info as Solaris pmap
            --heap           to print java heap summary
            --binaryheap     to dump java heap in hprof binary format
            --dumpfile       name of the dump file
            --histo          to print histogram of java object heap
            --clstats        to print class loader statistics
            --finalizerinfo  to print information on objects awaiting finalization
            --exe            executable image name
            --core           path to coredump
            --pid            pid of process to attach

        c:\Java\jdk-9\bin>jhsdb jmap --pid 6472 --clstats
        Attaching to process ID 6472, please wait...
        Debugger attached successfully.
        Server compiler detected.
        JVM version is 9-ea+124
        finding class loader instances ..done.
        computing per loader stat ..done.
        please wait.. computing liveness.............................................................done.
        class_loader        classes  bytes    parent_loader       alive?  type
        <bootstrap>         744      2283344  null                live    <internal>
        0x000000008c885848  1        529      null                live    java/lang/invoke/MethodHandles$LookupHelper$1@0x00000000242d5a00
        0x000000008c9c8c68  0        0        null                live    jdk/internal/loader/ClassLoaders$BootClassLoader@0x00000000242d0048
        0x000000008c9c9048  2        2552     0x000000008c9c8c68  live    jdk/internal/loader/ClassLoaders$PlatformClassLoader@0x0000000024290150
        0x000000008c9c9430  1        972      0x000000008c9c9048  live    jdk/internal/loader/ClassLoaders$AppClassLoader@0x000000002428fd18
        total = 5           748      2287397  N/A                 alive=5, dead=0  N/A
  52. Working around crashes
      • Compiled code or JIT compiler crash
        – Try excluding the method
        – e.g. -XX:CompileCommand=exclude,java/lang/String.indexOf
      • GC crash
        – Try an alternate GC
  53. Java Exceptions
  54. Java Exceptions
      • Exception message
      • Class-related exceptions/errors, e.g. InvalidClassException
      • -XX:AbortVMOnException=<exception>
      • -XX:AbortVMOnExceptionMessage=<exception message>
      • Java Flight Recordings
  55. Best Practices for Troubleshooting HotSpot JVM
      • Enable core files
        – ulimit -c unlimited
        – -XX:+CreateMinidumpOnCrash
      • -XX:+HeapDumpOnOutOfMemoryError
      • -XX:GCTimeLimit / -XX:GCHeapFreeLimit with the Parallel collector
      • Continuous Java Flight Recording
        – -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:FlightRecorderOptions=defaultrecording=true
      • Enable GC logging
      • Enable JMX for remote monitoring
        – -Dcom.sun.management.jmxremote=true
        – -Dcom.sun.management.jmxremote.port=n
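
Whether a running JVM was actually started with these options can be checked from inside the process via the HotSpotDiagnostic MBean. This sketch assumes a HotSpot-based JDK; the class and method names are illustrative, and the flags queried in `main` are just examples (options that a given JDK version does not know are reported as absent).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class TroubleshootingFlags {
    /** Returns the current value of a -XX option, or null if this JVM does not know it. */
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return null; // the option does not exist in this JVM version
        }
    }

    public static void main(String[] args) {
        String[] flags = {"HeapDumpOnOutOfMemoryError", "HeapDumpPath", "CreateMinidumpOnCrash"};
        for (String flag : flags) {
            String value = flagValue(flag);
            System.out.println(flag + " = " + (value == null ? "(not available)" : value));
        }
    }
}
```

Running a check like this at application startup and logging the result makes it easy to confirm that the diagnostic options are in place before a problem occurs.
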
  56. Summary of Tools
      • Java Mission Control and Java Flight Recorder
      • jcmd
      • jmap
      • jconsole
      • jinfo
      • jstack
      • Java VisualVM
      • Serviceability Agent
      • Native debuggers
  57. Troubleshooting
      Three simple steps:
      1. Understand the problem/error
      2. Collect the required diagnostic data
      3. Analyze the collected data
  58. Useful Resources
      • Troubleshooting Guide
      • http://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html
      • http://openjdk.java.net/groups/hotspot/docs/Serviceability.html
      • https://blogs.oracle.com/poonam
