With cloud computing we can add hardware resources on the fly. Considering how expensive load and stress testing can be, why don't we just add more power when needed?
This presentation explains why load and stress testing often falls short yet is still required, especially where cloud computing is available. It also shows how queuing theory provides a different approach that allows load and stress testers to add real value. Stakeholders and test managers can use the same theory to get a handle on the coverage and depth of the tests.
Key Takeaways:
- Why performance testing so often fails to accomplish what we want
- Why relying on cloud computing alone is not enough
- How queuing theory can provide a different approach to performance testing
- How queuing theory can help you assess the coverage and depth of your performance tests
www.eurostarconferences.com
www.testhuddle.com
2. You just woke up after a ten-year nap:
Team member:
"We can add extra processing power and memory on the fly.
An extra database has a lead time of two weeks."
3. Does this sound familiar?
Performance test: everything OK
Day 1 in production: we end up adding more than four times the hardware
4. 1. the tools simulate, but not quite accurately
2. load profiles are based on too many assumptions
3. we report more accurately than we can measure
4. long setup time → limited number of tests
5. we hide it all in complex reports
5. We send and accept the same requests and responses, but can't anticipate slight changes
In production, a lot more is going on than just our test
Did we really get a good response?
Similar hardware is expensive
6. Cloud computing: extra hardware can be added on the fly, at a moment's notice
Given the high cost of performance testing and how easily we can 'speed things up' if needed:
Why bother testing? The money is better spent on that extra hardware
7. Just start with an overkill of hardware and scale down to what is actually used!
10. Computers are either running or idling.
Queuing theory is an established model for performance engineers
It can describe the behavior of systems at every layer
14. Queuing center: a location in our system where waiting (queuing) occurs; a bottleneck, if you will
◦ They can exist anywhere: CPU, memory, network, IO, other systems
◦ There is always at least one queuing center
◦ The queuing center really determines the performance
◦ Queuing centers provide key information on scalability
◦ Service and wait time are the real components of performance
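One common way to locate a queuing center is the Utilization Law (utilization = throughput × service demand): the resource whose utilization is highest saturates first and caps throughput. A minimal sketch, with invented resource names and figures:

```python
# Sketch: finding the likely queuing center via the Utilization Law
# (U = throughput x service demand). All names and numbers here are
# hypothetical, for illustration only.

throughput = 50.0  # completed requests per second (measured)

# Average service demand per request at each resource, in seconds.
service_demand = {
    "cpu": 0.012,
    "disk_io": 0.018,
    "network": 0.004,
}

# Utilization of each resource: U = X * D.
utilization = {r: throughput * d for r, d in service_demand.items()}

# The resource with the highest utilization saturates first:
# that is the queuing center that limits scalability.
bottleneck = max(utilization, key=utilization.get)
print(bottleneck)                          # disk_io
print(round(utilization[bottleneck], 2))   # 0.9 (90% busy)

# Maximum throughput before saturation: X_max = 1 / D_bottleneck.
print(round(1.0 / service_demand[bottleneck], 1))  # 55.6 req/s
```

This also explains why "add more hardware" has limits: only relieving the actual queuing center raises the saturation point.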
15. Queuing models can describe anything: large connected systems, small embedded ones, ...
You can 'zoom in' and the model describes the behavior of the server
You can keep zooming in to CPU, network, etc.
16. Multiple zoom levels
Residence time = wait time + service time
There is always a queuing center
No queuing center found? Look harder
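The relation above (residence time = wait + service time) can be sketched with the simplest queuing model, the single-server M/M/1 queue, where residence time is R = S / (1 − ρ). The service time and arrival rates below are made-up figures; the point is how wait time explodes as the queuing center nears saturation:

```python
# Sketch: residence time = wait + service time, illustrated with the
# single-server (M/M/1) queue. All numbers are hypothetical.

def residence_time(service_time, arrival_rate):
    """R = S / (1 - rho) for an M/M/1 queue, with rho = arrival_rate * S."""
    rho = arrival_rate * service_time
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilization >= 100%")
    return service_time / (1.0 - rho)

service = 0.1  # seconds of service per request at the queuing center
for rate in (2, 5, 8, 9.5):  # arrival rates in requests per second
    r = residence_time(service, rate)
    wait = r - service  # the queuing component of the response time
    print(f"{rate:>4} req/s: residence {r:.2f}s (wait {wait:.2f}s)")
```

At 2 req/s the wait is negligible; at 9.5 req/s (95% utilization) the wait dominates the response time, which is exactly the behavior a queuing center shows in production.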
17. Cloud computing is not infinite:
Financial limits
Technical limits: IO/network/CPU speed per process
We don't build supercomputers to calculate a mortgage offer
18. Always find the queuing centers
Based on the results, judge: 'yes, we are likely to meet requirements X, Y and Z'
Show where the risks are: 'requirement X cannot feasibly be met for function Y'
Explore the risks
19. Explore identified resource-heavy components with stakeholders, developers and oracles
◦ Other uses of this component?
◦ Real frequency of usage?
◦ Validity of the (generic) requirement for this function?
Place the results in context:
◦ You may have a bigger issue than you thought
◦ Or it is actually OK for this usage
20. Define a set of key functions/use cases with stakeholders and experts (e.g. functional testers)
Per test, identify at least one queuing center
Compare with the generic requirements
◦ Can they be met?
◦ Risk exists → explore → place in context → define further tests
The model allows you to place real behavior in context and make a realistic assessment of risk
21. If no queuing center was found → monitoring was not sufficient
Queuing centers:
◦ Tell you about the risks to core functionality: performance and financial
◦ Tell you about the ability to scale
◦ Tell you how scaling up improves response time
22. Stakeholders don't (necessarily) understand queuing models
Explain in terms that matter to them: e.g. 'generating the offer takes 15 seconds'
Think of the systems as queuing systems and explain their behavior
23. Knowing what the behavior is can tell you:
◦ whether you can meet the requirements
◦ how to scale if needed
◦ whether performance can be achieved within budget
◦ whether you need to adapt your cloud (e.g. improve IO/network, CPU)
So yes: it still makes sense to do performance testing
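One simple tool for these judgments is Little's Law (N = throughput × residence time), which lets you translate a test measurement into the requirement's terms. A sketch with hypothetical figures:

```python
# Sketch: using Little's Law (N = X * R) to place a test measurement
# in the context of a requirement. All figures are hypothetical.

measured_throughput = 25.0   # requests per second observed in the test
measured_residence = 1.6     # seconds average response time observed

# Little's Law: average number of requests inside the system.
in_flight = measured_throughput * measured_residence
print(in_flight)  # 40.0 requests in flight during the test

# Suppose the requirement is 100 concurrent requests, each answered
# within 2 seconds. Little's Law gives the implied throughput:
required_throughput = 100 / 2.0   # 50 requests per second

# Compare against the saturation point of the queuing center,
# 1 / service demand (here an assumed 0.015 s of CPU per request):
max_throughput = 1 / 0.015        # ~66.7 requests per second
print(max_throughput >= required_throughput)  # True: likely feasible
```

This is the kind of back-of-the-envelope reasoning that turns a raw test result into a 'yes, we can likely meet requirement X' judgment for stakeholders.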
24. Batch process, tested to be run from multiple servers
The process needed to be faster
Risk: 'online' processes on the servers should not be impacted
Finding: three servers, three times as fast. But no queuing center found?
Deep diving into CPU monitoring revealed the queuing center: the process was pausing/waiting after each cycle
Conclusion: → online processes not impacted, as there was sufficient CPU time for other processes
25. Stress point found
Unclear where the queuing center was
Cause: Java memory management can be deceiving at the OS level
The rule that a queuing center must be found made us dig deeper. The absence of a queuing center makes you look further