Profiling distributed Java applications

Sometimes our Java applications run slower than we would like. Sometimes we even attach a profiler to see where the time goes. But what if the system consists of dozens or hundreds of JVMs, databases, and other components? We try a profiler, yet cannot find the bottleneck.

In this talk we discuss:
1. How to profile a single JVM
- what a profiler is and how it works under the hood
- writing a simple profiler using a Java agent and byte-code modification
2. How to profile a complex distributed system
- how engineers at Google do this
- commercial and open-source solutions: Dynatrace and Zipkin
- connecting to a demo system to see live demos

https://github.com/kslisenko/java-performance

  1. CONFIDENTIAL. PROFILING DISTRIBUTED JAVA APPLICATIONS. KANSTANTSIN SLISENKA, LEAD SOFTWARE ENGINEER. MAY 25, 2017
  2. ABOUT ME: Kanstantsin Slisenka, Java Backend Developer, speaker at Tech Talks and IT Week. Skype: kslisenko. Email: kslisenko@gmail.com, kanstantsin_slisenka@epam.com
  3. WHAT IS COMMON?
  4. AGENDA: 1. Profiling single JVM: how profilers work; Java agents (live demo). 2. Profiling distributed systems: Google experience; Dynatrace, Zipkin (live demo).
  5. PROFILING SINGLE JVM
  6. “You can’t measure the performance of Java code without interfering with the JVM”
  7. https://zeroturnaround.com/rebellabs/top-5-java-profilers-revealed-real-world-data-with-visualvm-jprofiler-java-mission-control-yourkit-and-custom-tooling/
  9. Is the profiler honest? NO!* *Measured performance = (app performance + profiler overhead) * profiler accuracy
  10. MEASURING TIME: System.currentTimeMillis() and System.nanoTime() take time themselves and are not always accurate. 1. Use benchmarks: http://openjdk.java.net/projects/code-tools/jmh/ 2. Warm up your JVM, … See https://shipilev.net/talks/jpoint-April2014-benchmarking.pdf and https://shipilev.net/blog/2014/nanotrusting-nanotime/
  13. CPU VS WALL-CLOCK TIME
      public void main() throws InterruptedException {
          a();                // 100 ms
          Thread.sleep(200);
          b();                // 100 ms
          // GC is running: 50 ms
          c();                // 100 ms
      }
      Wall-clock time (as long as it takes to execute): 100 + 200 + 100 + 50 + 100 = 550 ms
      CPU time (time the CPU was busy): 100 + 100 + 100 = 300 ms
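
The difference between the two clocks on the slide above can be observed from inside a JVM. The following is a minimal sketch, not code from the talk: it assumes the JVM supports per-thread CPU time (typical for HotSpot), and the class and helper names (`CpuVsWallClock`, `measure`, `sleepQuietly`) are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuVsWallClock {
    /** Returns {wallNanos, cpuNanos} consumed by the current thread while running the task. */
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long wallStart = System.nanoTime();
        // Assumption: thread CPU time is supported and enabled; otherwise this
        // call throws UnsupportedOperationException or returns -1.
        long cpuStart = threads.getCurrentThreadCpuTime();
        task.run();
        long cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;
        long wallNanos = System.nanoTime() - wallStart;
        return new long[] {wallNanos, cpuNanos};
    }

    /** Sleeps without forcing callers to handle the checked exception. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // A sleeping thread advances the wall clock but uses almost no CPU.
        long[] t = measure(() -> sleepQuietly(200));
        System.out.printf("wall = %d ms, cpu = %d ms%n", t[0] / 1_000_000, t[1] / 1_000_000);
    }
}
```

Replacing the sleep with a busy loop should bring the two numbers close together, which is the distinction the slide draws between a(), b(), c() and Thread.sleep(200).
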
  14. JVM DIAGNOSTIC INTERFACES: JVMTI (native C++ API); Attach API; jstack, jmap, jps, …; performance counters; heap dumps; Flight Recorder; JMX (java.lang.management, custom MBeans); Java agents (java.lang.instrument). See github.com/aragozin/jvm-tools
  15. JAVA.LANG.MANAGEMENT: github.com/kslisenko/java-performance/tree/master/java-agent-monitoring
  18. import java.lang.management.ManagementFactory;
      import java.lang.management.ThreadInfo;
      import java.lang.management.ThreadMXBean;
      ThreadMXBean threadMBean = ManagementFactory.getThreadMXBean();
      System.out.println("Thread count = " + threadMBean.getThreadCount());
      // Dump all live threads, including locked monitors and ownable synchronizers
      ThreadInfo[] threads = threadMBean.dumpAllThreads(true, true);
      for (ThreadInfo thread : threads) {
          System.out.println(thread);
      }
  23. SAMPLING vs INSTRUMENTATION (diagram labels: c(), b(), a(), main())
      SAMPLING: thread dumps at regular intervals
      - Overhead depends on the sampling interval
      - Advantages: relatively small overhead; can be used for unknown code
      - Drawbacks: accuracy (probability-based approach); triggers JVM safe-points
      INSTRUMENTATION: injection of measurement code
      - Overhead depends on the speed of the measurement code
      - Advantages: accuracy (we measure each execution); we can also modify the code
      - Drawbacks: relatively big overhead; we must know the code we are instrumenting
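
The sampling approach can be sketched with nothing but the JDK. This is an illustrative toy, not the profiler from the talk: `SamplingProfilerSketch`, `busy`, and `sample` are hypothetical names, and `Thread.getStackTrace()` on another thread is used as the simplest way to take a sample (like a thread dump, it requires the target thread to reach a safe-point).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SamplingProfilerSketch {
    static volatile long sink;

    /** Burns CPU for roughly the given time so the sampler has something to observe. */
    static void busy(long millis) {
        long end = System.nanoTime() + millis * 1_000_000L;
        long x = 1;
        while (System.nanoTime() < end) {
            x = x * 31 + 7;
        }
        sink = x; // keep the result alive so the loop is not optimized away
    }

    /**
     * Takes a stack trace of the target thread every intervalMillis until it dies.
     * Counts, per method, the number of samples in which that method was on the stack.
     */
    static Map<String, Integer> sample(Thread target, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        while (target.isAlive()) {
            Set<String> seen = new HashSet<>();
            for (StackTraceElement frame : target.getStackTrace()) {
                seen.add(frame.getClassName() + "." + frame.getMethodName());
            }
            for (String method : seen) {
                hits.merge(method, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Thread worker = new Thread(() -> busy(300), "worker");
        worker.start();
        sample(worker, 10).forEach((method, count) ->
                System.out.println(count + " samples: " + method));
    }
}
```

A shorter interval gives more samples and better accuracy at the cost of more overhead, which is the trade-off listed on the slide.
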
  26. SAMPLING: how to capture a thread dump
      1. jstack -l JAVA_PID
      2. ManagementFactory.getThreadMXBean().dumpAllThreads(true, true);
      3. JVMTI AsyncGetCallTrace
      Options 1 and 2: the JVM goes to a safe-point. Application threads are paused, and we never see the code where a safe-point never happens.
      Option 3: does not trigger safe-points. See github.com/jvm-profiling-tools/honest-profiler
  27. Safe-points
      > jstack -l JAVA_PID
      Total time for which application threads were stopped: 0.0132329 seconds, Stopping threads took: 0.0007617 seconds
      Total time for which application threads were stopped: 0.0002887 seconds, Stopping threads took: 0.0000385 seconds
      JVM flags: -XX:+PrintGCApplicationStoppedTime -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
  35. INSTRUMENTATION: where measurement code can be injected
      - Source code (.java): metric libraries (dropwizard metrics, Perf4J) or manual timing:
            long start = System.currentTimeMillis();
            // Your code goes here
            long finish = System.currentTimeMillis();
            System.out.println(finish - start); // elapsed milliseconds
      - Compilation (.java to .class byte code): AspectJ compiler
      - Loading (byte code at runtime): class loaders are bootstrap (rt.jar), extension (lib/ext), application (classpath), and custom; the options are a custom ClassLoader or Java agents
      - Frameworks: ASM, Javassist, CGLib, AspectJ, BCEL
      - Proxy classes generation
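
Of the options on the slide above, proxy class generation is the one that can be shown with the JDK alone. The sketch below uses `java.lang.reflect.Proxy` to wrap an interface with timing code; the `Greeter` interface and the `timed` helper are hypothetical names invented for this illustration, and the approach only works for calls made through an interface.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class TimingProxySketch {
    /** Hypothetical service interface, used only for this illustration. */
    interface Greeter {
        String greet(String name);
    }

    /** Total nanoseconds spent per method name. */
    static final Map<String, LongAdder> TOTAL_NANOS = new ConcurrentHashMap<>();

    /** Returns a proxy that measures every call before delegating to the target. */
    @SuppressWarnings("unchecked")
    static <T> T timed(Class<T> type, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the target's own exception
            } finally {
                TOTAL_NANOS.computeIfAbsent(method.getName(), k -> new LongAdder())
                           .add(System.nanoTime() - start);
            }
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    public static void main(String[] args) {
        Greeter greeter = timed(Greeter.class, name -> "Hello, " + name);
        System.out.println(greeter.greet("JVM"));
        TOTAL_NANOS.forEach((method, nanos) ->
                System.out.println(method + ": " + nanos.sum() + " ns"));
    }
}
```

Unlike the manual timing snippet, the measured code stays untouched, but the reflective call adds its own overhead to every invocation.
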
  36. JAVA AGENTS
      Native agents (C++): > java -agentpath:agent.dll -jar app.jar (or -agentlib:agent, which takes the library name without the extension)
      - Use for a deep dive into the JVM
      - Have access to the JVM state and can receive JVMTI events
      - Independent of the JVM (not interrupted by GC, can collect debug information between safe-points, etc.)
      - API: JVMTI (C++ native interface of the JVM)
      Java agents: > java -javaagent:agent.jar -jar app.jar
      - Use for byte-code modification
      - Allow transforming byte code before it is loaded by the ClassLoader
      - Follow the JVM lifecycle (suspended by GC, etc.)
      - API: java.lang.instrument, java.lang.management
  37. AGENT EXAMPLES
      - HPROF, Java profiler: -agentlib:hprof[=options] ToBeProfiledClass
      - JDWP, Java debugger: -agentlib:jdwp=transport=dt_socket,address=localhost:9009,server=y,suspend=y
      - JRebel/XRebel: https://zeroturnaround.com/software/jrebel/
  41. import java.lang.instrument.ClassFileTransformer;
      import java.lang.instrument.IllegalClassFormatException;
      import java.lang.instrument.Instrumentation;
      import java.security.ProtectionDomain;
      public class DemoAgent {
          // Called by the JVM before the application's main() when started with -javaagent
          public static void premain(String args, Instrumentation instr) {
              instr.addTransformer(new ClassLoadingLogger());
          }
      }
      public class ClassLoadingLogger implements ClassFileTransformer {
          @Override
          public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                  ProtectionDomain protectionDomain, byte[] classfileBuffer)
                  throws IllegalClassFormatException {
              System.out.println(className); // log every class as it is loaded
              return classfileBuffer;        // byte code is returned unchanged
          }
      }
      Manifest (META-INF/MANIFEST.MF):
      Manifest-Version: 1.0
      Agent-Class: com.example.DemoAgent
      Premain-Class: com.example.DemoAgent
      > java -javaagent:agent.jar -jar app.jar
  48. Diagram: JVM, ClassLoader (Class A, Class B, Class C), Agent, ClassFileTransformer, byte-code manipulation library
      1. premain
      2. addTransformer
      3. load class
      4. transform Class A
      5. modify byte code
      6. redefine class (Class A becomes Class A*)
  50. JAVASSIST: high-level, object-oriented API. github.com/jboss-javassist/javassist
  51. JAVA AGENT + JAVASSIST LIVE DEMO: https://github.com/kslisenko/java-performance/tree/master/java-agent
  52. PROFILING DISTRIBUTED SYSTEM
  53. DISTRIBUTED SYSTEM: Server 1, Server 2, and Server 3 are connected over HTTP; Server 2 and Server 3 each have a DB.
  54. LOOKING GOOD. Responses: HTTP 200 150 ms; HTTP 200 150 ms.
  55. SOMETHING WENT WRONG. Responses: HTTP 200 150 ms; HTTP 200 270 ms; HTTP 200 270 ms; HTTP 200 150 ms.
  56. FAIL. Responses: HTTP 500 timeout; HTTP 200 150 ms; HTTP 200 270 ms; HTTP 200 270 ms; HTTP 200 150 ms. Frustrated user.
  58. IDENTIFYING PERFORMANCE PROBLEM: trace propagation
      - Every HTTP call between Server 1, Server 2, and Server 3 carries the header req-id: 1
      - The servers record the request id with a timestamp and a duration: Req-1 12:45:31.000 150 ms; Req-1 12:45:31.010 130 ms; Req-1 12:45:31.020 120 ms
      - Responses seen by the frustrated user: HTTP 500 timeout; HTTP 200 150 ms; HTTP 200 270 ms; HTTP 200 270 ms; HTTP 200 150 ms
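
The idea of trace propagation can be sketched as a small in-memory simulation. Everything here is an assumption made for illustration: method calls stand in for the HTTP calls, a plain map stands in for the HTTP headers, and `handle` and `LOG` are hypothetical names. Only the `req-id` header name comes from the slide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

class TracePropagationSketch {
    static final String HEADER = "req-id";

    /** One record per handled call: request id, service name, duration. */
    static final List<String> LOG = new ArrayList<>();

    /** Stand-in for an HTTP handler: reads the header, calls downstream services, logs itself. */
    static void handle(String service, Map<String, String> headers, String... downstream) {
        // Reuse the incoming id, or start a new trace at the edge of the system.
        String reqId = headers.computeIfAbsent(HEADER, k -> UUID.randomUUID().toString());
        long start = System.nanoTime();
        for (String next : downstream) {
            // Copy the id into the headers of every outgoing request.
            Map<String, String> outgoing = new HashMap<>();
            outgoing.put(HEADER, reqId);
            handle(next, outgoing);
        }
        long micros = (System.nanoTime() - start) / 1_000;
        LOG.add(reqId + " " + service + " " + micros + " us");
    }

    public static void main(String[] args) {
        Map<String, String> incoming = new HashMap<>();
        incoming.put(HEADER, "1");
        handle("server1", incoming, "server2", "server3");
        LOG.forEach(System.out::println);
    }
}
```

Because every record carries the same id, the records written by different servers can later be joined into one trace for the request.
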
  59. TRACE EXAMPLE 1
      Request answered with HTTP 200 in 150 ms:
      - http://server1/service: 150 ms
      - http://server2/service: 120 ms (server2 to DB: 80 ms; business logic: 30 ms)
      - http://server3/service: 130 ms (server3 to DB: 100 ms; business logic: 20 ms)
      Request answered with HTTP 200 in 270 ms:
      - http://server1/service: 270 ms
      - http://server2/service: 120 ms (server2 to DB: 80 ms; business logic: 30 ms)
      - http://server3/service: 130 ms (server3 to DB: 100 ms; business logic: 20 ms)
  60. TRACE EXAMPLE 2
      Request answered with HTTP 200 in 150 ms:
      - http://server1/service: 150 ms
      - http://server2/service: 120 ms (server2 to DB: 80 ms; business logic: 30 ms)
      - http://server3/service: 130 ms (server3 to DB: 100 ms; business logic: 20 ms)
      Request answered with HTTP 500 timeout:
      - http://server1/service: 500 ms timeout
      - http://server2/service: 120 ms (server2 to DB: 80 ms; business logic: 30 ms)
      - http://server3/service: 370 ms (server3 to DB: 350 ms)
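
Once spans are collected into a tree, finding the bottleneck becomes a simple traversal. The sketch below encodes the failing request from Trace Example 2 and looks for the slowest leaf span; the `Span` class and the `slowestLeaf` heuristic are assumptions made for illustration, not the data model of any particular tracing tool.

```java
import java.util.Arrays;
import java.util.List;

class TraceAnalysisSketch {
    /** A timed operation with the nested operations it triggered. */
    static final class Span {
        final String name;
        final long durationMs;
        final List<Span> children;

        Span(String name, long durationMs, Span... children) {
            this.name = name;
            this.durationMs = durationMs;
            this.children = Arrays.asList(children);
        }
    }

    /** Returns the leaf span with the largest duration: the deepest place where time is spent. */
    static Span slowestLeaf(Span root) {
        if (root.children.isEmpty()) {
            return root;
        }
        Span best = null;
        for (Span child : root.children) {
            Span candidate = slowestLeaf(child);
            if (best == null || candidate.durationMs > best.durationMs) {
                best = candidate;
            }
        }
        return best;
    }

    /** The failing request from Trace Example 2. */
    static Span failingTrace() {
        return new Span("http://server1/service", 500,
                new Span("http://server2/service", 120,
                        new Span("server2 to DB", 80),
                        new Span("business logic", 30)),
                new Span("http://server3/service", 370,
                        new Span("server3 to DB", 350)));
    }

    public static void main(String[] args) {
        Span slowest = slowestLeaf(failingTrace());
        System.out.println(slowest.name + ": " + slowest.durationMs + " ms"); // server3 to DB: 350 ms
    }
}
```

For the failing request, the traversal points at the database call made by server3, which matches what the trace on the slide shows.
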
  61. “When systems involve not just dozens of subsystems but dozens of engineering teams, even our best and most experienced engineers routinely guess wrong about the root cause of poor end-to-end performance.” Google Dapper, https://research.google.com/pubs/pub36356.html
  62. GOOGLE DAPPER
      Use cases: 1. Identify performance problems across multiple teams and services. 2. Build a dynamic environment map.
      Requirements: 1. Low overhead: no impact on running services. 2. Application-level transparency*: programmers should not need to be aware of the tracing system. 3. Scalability.
      *They instrumented Google Search almost without modifications.
  63. GOOGLE DAPPER: TRACES AND SPANS
  64. GOOGLE DAPPER: ARCHITECTURE
  65. GOOGLE DAPPER: TECHNICAL DETAILS
      Technical facts: 1. Adaptive sampling. 2. 1 TB/day to BigTable. 3. API + MapReduce. 4. Instrumentation of common Google libraries.
      Issues and limitations: 1. Request buffering. 2. Batch jobs. 3. Queued requests. 4. Relative latency.
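
Sampling keeps tracing overhead low by recording only a fraction of requests. The sketch below is a deliberately simplified, fixed-rate sampler and not Dapper's implementation: it assumes numeric trace ids and derives the decision from the id itself, so every service reaches the same verdict without coordination. An adaptive sampler would additionally adjust `rate` at runtime; the class name and the bit-mixing step are choices made for this illustration.

```java
class TraceSamplerSketch {
    private final double rate; // fraction of traces to keep, from 0.0 to 1.0

    TraceSamplerSketch(double rate) {
        if (rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException("rate must be between 0.0 and 1.0");
        }
        this.rate = rate;
    }

    /** The same trace id always yields the same decision. */
    boolean isSampled(long traceId) {
        // Mix the bits (SplitMix64-style finalizer) so sequential ids spread evenly.
        long z = traceId;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        z ^= z >>> 31;
        // Map the top 53 bits to a double in [0, 1) and compare with the rate.
        double unit = (z >>> 11) * 0x1.0p-53;
        return unit < rate;
    }

    public static void main(String[] args) {
        TraceSamplerSketch sampler = new TraceSamplerSketch(0.1);
        int kept = 0;
        for (long id = 0; id < 100_000; id++) {
            if (sampler.isSampled(id)) {
                kept++;
            }
        }
        System.out.println("kept " + kept + " of 100000 traces");
    }
}
```

With a rate of 0.1 roughly one trace in ten is kept, and a trace is either recorded by every service it touches or by none.
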
  66. WANT IT LIKE AT GOOGLE?
  67. COMMERCIAL: Magic Quadrant for Application Performance Monitoring Suites (21 December 2016), https://www.gartner.com/doc/reprints?id=1-3OGTPY9&ct=161221
      OPEN-SOURCE: Java Performance Monitoring: 5 Open Source Tools You Should Know (19 January 2017), https://dzone.com/articles/java-performance-monitoring-5-open-source-tools-you-should-know
      Tools: www.stagemonitor.org, github.com/naver/pinpoint, www.moskito.org, glowroot.org, kamon.io, zipkin.io
  68. https://university.dynatrace.com/education/appmon/913/10859
  71. ZIPKIN (SPRING CLOUD SLEUTH): Server 1 and Server 2 communicate over HTTP and pass the trace id with each call. Instrumented libraries send traces and spans through the transport to Zipkin, which provides storage, an API, and a user interface. http://zipkin.io/pages/architecture.html
  76. DEMO APPLICATION (github.com/kslisenko/java-performance): browser, Frontend, and Backend services (Spring Boot), chat queue (JMS), MySQL, connected over HTTP, JMS, and a custom TCP/IP protocol
      Demo cases:
      1. HTTP calls
      2. JMS
      3. Custom protocol (TCP/IP)
      4. DB, JDBC, Hibernate
      5. Exceptions
      6. Async invocations: new threads, ExecutorService, CompletableFuture
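
Async invocations are a demo case of their own because a trace id kept in a `ThreadLocal` does not follow the work onto another thread. The sketch below shows one common remedy, assuming the trace id lives in a thread-local holder: capture it when the task is submitted and restore it inside the worker. `TRACE_ID`, `traced`, and `traceIdSeenByWorker` are hypothetical names for this illustration, not the API of Zipkin or Dynatrace.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncTraceContextSketch {
    /** Trace id of the request handled by the current thread. */
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /** Captures the caller's trace id now and restores it inside the worker thread. */
    static <T> Callable<T> traced(Callable<T> task) {
        String captured = TRACE_ID.get();
        return () -> {
            String previous = TRACE_ID.get();
            TRACE_ID.set(captured);
            try {
                return task.call();
            } finally {
                TRACE_ID.set(previous); // do not leak the id into the next pooled task
            }
        };
    }

    /** Reads the trace id from a pool thread, with or without the wrapper. */
    static String traceIdSeenByWorker(boolean wrap) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = TRACE_ID::get;
            Callable<String> toRun = wrap ? traced(task) : task;
            return pool.submit(toRun).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        TRACE_ID.set("req-1");
        System.out.println("unwrapped: " + traceIdSeenByWorker(false)); // unwrapped: null
        System.out.println("wrapped:   " + traceIdSeenByWorker(true));  // wrapped:   req-1
    }
}
```

The same capture-and-restore wrapper applies to `Runnable` tasks and to the functions passed to `CompletableFuture`.
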
  77. DYNATRACE + ZIPKIN LIVE DEMO: github.com/kslisenko/java-performance
  78. CONCLUSION: 1. Make it work 2. Make it right 3. Make it fast
  79. REFERENCES
      Metric libraries
      - Perf4J: https://github.com/perf4j/perf4j
      - Metrics: http://metrics.dropwizard.io
      - Servo: https://github.com/Netflix/servo
      Byte-code modification with JAVASSIST
      - https://blog.newrelic.com/2014/09/29/diving-bytecode-manipulation-creating-audit-log-asm-javassist
      - https://www.youtube.com/watch?v=39kdr1mNZ_s
      Java Agents
      - https://www.slideshare.net/arhan/oredev-2015-taming-java-agents
      - http://www.barcelonajug.org/2015/04/java-agents.html
      Profiling
      - https://blog.codecentric.de/en/2011/10/measure-java-performance-sampling-or-instrumentation/
      - https://blog.codecentric.de/en/2014/10/profiler-tell-truth-javaone/
      - https://www.youtube.com/watch?v=YCC-CpTE2LU&t=2312s
      - https://www.slideshare.net/aragozin/java-black-box-profiling
      - https://www.slideshare.net/aragozin/java-profiling-diy-jugmskru-2016
      Safe-points
      - http://blog.ragozin.info/2012/10/safepoints-in-hotspot-jvm.html
      - https://www.cberner.com/2015/05/24/debugging-jvm-safepoint-pauses/
  80. QUESTIONS? THANK YOU! KANSTANTSIN_SLISENKA@EPAM.COM
