The document discusses performance tuning for Grails applications. It outlines that performance aspects include latency, throughput, and quality of operations. Performance tuning optimizes costs and ensures systems meet requirements under high load. Amdahl's law states that parallelization cannot speed up non-parallelizable tasks. The document recommends measuring and profiling, making single changes in iterations, and setting up feedback cycles for development and production environments. Common pitfalls in profiling Grails applications are also discussed.
2. Agenda
• What is performance and what are we
optimising for?
• How do you do performance tuning and
optimisation?
• Common missteps, tips and tricks related to
profiling and tuning Grails applications
3. Performance aspects
• Latency of operations
• Throughput of operations
• Quality of operations - correctness, consistency,
resilience, security, usability ...
4. Why?
• Optimising costs to run your system -
operational efficiency
• Tuning your system to meet its performance
requirements with optimal cost
• Performance is a feature of your system:
keeping up the quality of the operations under
high load
6. How?
• Set up your own feedback cycle for tuning your own system
• Measure & profile!
• start with the tools you have available. You can add more tools and methods in the
next iteration.
• Think & learn, analyse and plan the next change
• find tools and methods to measure something in the next iteration you want to
know about more
• Implement a single change!
• Iterate: do a lot of iterations and optimally change one thing at a time - this will help you to
learn about your system's performance and operational aspects
• Set up a different feedback cycle for production environments. Don't forget that
usually it's irrelevant if the system performs well on your laptop. If you are not involved in
operations, use innovative means to set up a feedback cycle.
8. More
• In concurrent execution: Amdahl's law - you won't be able to speed up a single computation
task if you cannot parallelize it.
• In traditional synchronous Grails code this means that each request thread shouldn't
block other threads. It doesn't necessarily mean that you have to switch to asynchronous
handling of requests. However, that might be helpful for error handling reasons.
Asynchronous doesn't mean fast.
• Find the most limiting bottleneck and eliminate it, one by one
• re-measure after each change, because the behaviour of concurrent execution can be
different after a small change that reduces blocking - usually the next problem is not the
second one on the list from the previous measurement.
• "Mature optimization" - keep the clarity and consistency of the solution. Don't do things just
"because this is faster". Don't introduce accidental complexity.
• Also find out how your system gets saturated - the saturation point. How does latency
change when load is added? Can your system survive? What happens when it gets
overloaded?
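To make the Amdahl's law point above concrete, here is a minimal sketch of the usual formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. The class name and the 90% figure are illustrative assumptions, not measurements from any Grails application.

```java
// Sketch only: Amdahl's law for a task where a fraction p of the work can run in parallel.
final class AmdahlSketch {

    /** Theoretical speedup with n workers when fraction p (0..1) of the work is parallelizable. */
    static double speedup(double p, int n) {
        if (p < 0.0 || p > 1.0 || n < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and n >= 1");
        }
        return 1.0 / ((1.0 - p) + p / n);
    }

    /** Upper bound on the speedup as the number of workers grows without limit. */
    static double limit(double p) {
        return 1.0 / (1.0 - p); // infinite when p == 1.0
    }

    public static void main(String[] args) {
        double p = 0.90; // hypothetical: 90% of the work does not block on anything shared
        for (int n : new int[] {1, 2, 4, 8, 16, 64}) {
            System.out.printf("workers=%d speedup=%.2f%n", n, speedup(p, n));
        }
        System.out.printf("limit=%.2f%n", limit(p));
    }
}
```

Under these assumptions the remaining 10% of serialized (blocking) work caps the speedup at roughly 10x no matter how many threads are added, which is why reducing blocking tends to matter more than adding workers.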
9. JVM code profiler concepts
• Sampling
• takes snapshots of the execution state through the JVM
profiling interfaces at a given time interval, for example every 100
milliseconds. Statistical methods are used to calculate values based
on the samples.
• Unreliable results, but certainly useful in some cases since the
overhead of sampling is minimal compared to instrumentation
• Usually helps to get better understanding of the problem if you
learn to look past the numeric values returned from measurements.
• Instrumentation
• exact measurements of method execution details
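The sampling idea can be sketched in a few lines of plain Java. This is a toy illustration, not how any particular profiler is implemented: it polls `Thread.getAllStackTraces()` at a fixed interval and counts which method is on top of each stack. The class name, the `spin` workload and the thread name are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a toy sampling profiler based on periodic stack snapshots.
final class SamplingSketch {

    /**
     * Takes `samples` snapshots of all thread stacks, `intervalMillis` apart,
     * and counts how often each method was the top frame of some thread.
     */
    static Map<String, Integer> sampleTopFrames(int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        Thread sampler = Thread.currentThread();
        for (int i = 0; i < samples; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
                StackTraceElement[] stack = entry.getValue();
                if (entry.getKey() == sampler || stack.length == 0) {
                    continue; // skip the sampler itself and threads without Java frames
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    /** Hypothetical CPU-bound workload so that there is something to observe. */
    static void spin(long millis) {
        long end = System.nanoTime() + millis * 1_000_000L;
        double x = 0;
        while (System.nanoTime() < end) {
            x += Math.sqrt(x + 1);
        }
    }

    public static void main(String[] args) {
        Thread worker = new Thread(() -> spin(2_000), "worker");
        worker.setDaemon(true);
        worker.start();
        // 100 ms is the example interval mentioned on the slide.
        sampleTopFrames(15, 100).forEach((method, count) -> System.out.println(count + "  " + method));
    }
}
```

The counts are only estimates: a method that runs between two snapshots is never seen, and the JVM typically captures these stacks at safepoints, which skews where threads appear to be. That is one reason to read the output as a hint about where time goes rather than as exact numbers.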
10. Load generation tools
• Simple command line tools
• ab - apache bench
• wrk - modern HTTP benchmarking tool
• has lua scripting support for doing things like
checking the reply
• Load testing toolkits
• Support testing use cases and stateful flows
11. Common pitfalls in profiling
Groovy and Grails code
• Measuring wall clock time
• Measuring CPU time for a certain method
• Instrumentation usually provides false results
because of JIT compilation and other reasons
like spin locks
• Lack of proper JVM warmup
• Relying on gut feeling
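As a rough illustration of why a single timing number can mislead, the sketch below measures the same call both as wall-clock time and as CPU time of the current thread, after a warm-up phase. It uses the standard `ThreadMXBean` API; the class name, the workloads and the iteration counts are arbitrary assumptions chosen for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Sketch only: wall-clock time vs. CPU time for the same piece of work.
final class TimingSketch {

    /** Runs the task on the current thread; returns {wall-clock nanos, CPU nanos or -1 if unavailable}. */
    static long[] measure(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = bean.isCurrentThreadCpuTimeSupported();
        long cpuStart = cpuSupported ? bean.getCurrentThreadCpuTime() : -1;
        long wallStart = System.nanoTime();
        task.run();
        long wall = System.nanoTime() - wallStart;
        long cpu = (cpuSupported && cpuStart >= 0) ? bean.getCurrentThreadCpuTime() - cpuStart : -1;
        return new long[] {wall, cpu};
    }

    /** Stand-in for a thread that waits on I/O or a lock. */
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Hypothetical CPU-bound workload. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Warm-up: give the JIT a chance to compile the hot path before timing it.
        for (int i = 0; i < 20; i++) {
            sumOfSquares(1_000_000);
        }
        long[] busy = measure(() -> sumOfSquares(50_000_000));
        long[] waiting = measure(() -> sleepQuietly(200));
        System.out.printf("busy:    wall=%d ms cpu=%d ms%n", busy[0] / 1_000_000, busy[1] / 1_000_000);
        System.out.printf("waiting: wall=%d ms cpu=%d ms%n", waiting[0] / 1_000_000, waiting[1] / 1_000_000);
    }
}
```

A waiting thread shows a large wall-clock time with almost no CPU time, while a busy thread shows both. Looking at only one of the two numbers hides whether the code is computing or blocked.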
12. Ground your feet
• Find a way to review production performance graphs regularly,
especially after making changes to the system
• system utilisation over time (CPU load, IO load & wait, Memory
usage), system input workload (requests) over time, etc.
• In the Cloud, use tools like New Relic to get a view into operations
• CloudFoundry based Pivotal Web Services and IBM Bluemix
have New Relic available
• In the development environment, use a profiler and debugger to
gain understanding. You can use the grails-melody plugin to get
insight into the SQL that's executed.
13. Recommendations
• Concentrate on eliminating blocking because of Amdahl's law
• Look for low hanging fruit (next slide) if you are in a rush - it's
worth doing.
• Concentrate on constantly improving the performance tuning
feedback cycles you have in place for development and
production environments.
• Innovate to get iterations going: you don't necessarily need
expensive tools or toolkits. Continuous improvement is more
important than fancy tools.
• Take small steps.
14. Environment related
problems
• Improper JVM configuration for Grails apps
• out-of-the-box Tomcat parameters
• a single JVM running with a huge heap on a
big box
• If you have a big powerful box, it's better to
run multiple small JVMs and put a load
balancer in front of them
15. Low hanging fruit
• SQL and database related bottlenecks: learn how to profile SQL queries and tune your database queries and your database
• grails-melody plugin can be used to spot costly SQL queries in development and testing environments. Nothing
prevents its use in production; however, there is a risk that running it in a production environment has negative side effects.
• New Relic in CloudFoundry (works for production environments)
• Eliminate exceptions (stack traces) thrown in normal program flow - use a profiler or debugger to find out whether any are
thrown in normal program flow
• High CPU usage: Check regexps that are used a lot (use the profiler's CPU time measurement to spot those, then search the code
for candidate regexps). Also check the regexps with different input sizes. Make sure valid input doesn't trigger "catastrophic
backtracking". Understand what it is. Use a regexp analyser to find out the number of operations a certain input triggers in
handling the input.
• Check concurrency patterns like synchronised usage: using java.util.Hashtable/Properties is blocking
• these block: System.getProperty("some.config.value","some.default"), Boolean.getBoolean("some.feature.flag")
• In hot code paths, replace Groovy .each closures with plain for loops
• Cache implementation that serves stale information while entry is being updated (blocks only when there isn't information
available)
• Cache implementation that locks a certain key for updating to prevent cache storms
• "static transactional = true" in services that don't need transactions
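Two of the items above can be sketched in plain Java. The first reads a system property once instead of on every request; the second is a minimal cache that loads each key at most once, so concurrent requests for the same missing key wait for a single load instead of all hitting the backend. The class names are hypothetical, the property name is the one from the slide, and the cache deliberately leaves out expiry and the serve-stale-while-updating behaviour.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch only: read a flag once instead of calling Boolean.getBoolean(...) per request.
final class FeatureFlags {
    // Trade-off (assumption): changes to the property after class initialization are not seen.
    static final boolean SOME_FEATURE = Boolean.getBoolean("some.feature.flag");

    private FeatureFlags() {
    }
}

// Sketch only: a cache whose loader runs at most once per key.
final class PerKeyCacheSketch<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    PerKeyCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent applies the loader atomically for an absent key; callers asking
        // for the same key while it loads wait for that one result. The loader must not
        // modify this cache itself, and a null result is not cached.
        return entries.computeIfAbsent(key, loader);
    }

    /** Drops an entry so that the next get() reloads it. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

A possible usage, with a made-up loader: `new PerKeyCacheSketch<String, Integer>(String::length).get("grails")`. Callers for keys that are already cached do not block each other. A slow loader can still delay callers whose keys happen to share an internal bin of the map, so this is a starting point rather than a full cache implementation.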
16. Tool for getting insight in sudden
production performance problems
• kill -3 <PID>
• Makes a thread dump of all threads and outputs
it to System.out which ends up in catalina.out in
default Tomcat config.
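When sending a signal to the process is not an option, a similar dump can be produced from inside the JVM. The following is a minimal sketch using the standard `ThreadMXBean.dumpAllThreads` call; the class name is made up, and the output format is simplified compared to what `kill -3` prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

// Sketch only: an in-process thread dump, simplified compared to the JVM's own format.
final class ThreadDumpSketch {

    static String dump() {
        StringBuilder out = new StringBuilder();
        // false, false: do not collect lock details, which keeps the call widely supported.
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip threads that ended in the meantime
            }
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking a few dumps some seconds apart and comparing them usually shows which threads stay BLOCKED or WAITING in the same place.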
17. wrk http load testing tool
sample output
Running 10s test @ http://localhost:8080/empty-test-app/empty/index
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.46ms    4.24ms  17.41ms   93.28%
    Req/Sec     2.93k     0.90k    5.11k    85.67%
  Latency Distribution
     50%  320.00us
     75%  352.00us
     90%  406.00us
     99%   17.34ms
  249573 requests in 10.00s, 41.22MB read
  Socket errors: connect 1, read 0, write 0, timeout 5
Requests/sec:  24949.26
Transfer/sec:      4.12MB
Check latency: the max and
its distribution
Total throughput
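In the sample above the average latency is 1.46ms although half of the requests finish within 320us and the 99th percentile is 17.34ms, so the average alone hides the tail. A minimal sketch of how such percentiles can be computed from raw samples, using the nearest-rank method (an assumption; wrk may calculate them differently) and made-up sample values:

```java
import java.util.Arrays;

// Sketch only: nearest-rank percentiles over a set of latency samples.
final class LatencySketch {

    /** Nearest-rank percentile of the samples; pct must be in (0, 100]. */
    static double percentile(double[] samples, double pct) {
        if (samples.length == 0 || pct <= 0 || pct > 100) {
            throw new IllegalArgumentException("need samples and 0 < pct <= 100");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical latencies in milliseconds: mostly fast, with a slow tail.
        double[] latencies = new double[100];
        Arrays.fill(latencies, 0.3);
        latencies[98] = 17.0;
        latencies[99] = 17.4;
        System.out.println("avg = " + Arrays.stream(latencies).average().getAsDouble());
        System.out.println("p50 = " + percentile(latencies, 50));
        System.out.println("p99 = " + percentile(latencies, 99));
        System.out.println("max = " + percentile(latencies, 100));
    }
}
```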