3. What is CPU profiling?
A profile is a set of statistics that
describes how often and for how long
various parts of the program executed.
See output sample.
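A profile like the output sample can be produced with a few lines of cProfile; a minimal sketch (the `busy` function is just an illustrative workload):

```python
import cProfile
import pstats

def busy():
    # Deliberately CPU-bound work so it shows up in the profile
    total = 0
    for i in range(100_000):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
busy()
profiler.disable()

# Prints a table with columns like ncalls, tottime and cumtime per function
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```

The ncalls/tottime/cumtime columns in the printed table are exactly the "how often and for how long" statistics described above.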
4. How do CPU profilers work?
● Most profilers run inside your Python process.
● If you’re inside a Python program you generally have
pretty easy access to its stack.
● There are two types of profilers, which differ in their
triggers:
○ Tracing profilers - triggered on function/line called
○ Sampling profilers - triggered on a time interval
5. How do tracing profilers work?
● Python lets you specify a callback that gets run when
various interpreter events (like “calling a function” or
“executing a line of code”) happen.
● When the callback gets called, it records the stack for
later analysis.
● You can set up that callback with:
○ PyEval_SetProfile - triggered only when a function is called
○ PyEval_SetTrace - triggered when a function is called or a
line of code is executed
● cProfile uses PyEval_SetProfile
● line_profiler uses PyEval_SetTrace
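The same callback mechanism is exposed at the Python level via sys.setprofile (a thin wrapper over PyEval_SetProfile); a minimal sketch, with illustrative function names:

```python
import sys
from collections import Counter

call_counts = Counter()

def profile_callback(frame, event, arg):
    # "call" fires on every Python function entry, like PyEval_SetProfile
    if event == "call":
        call_counts[frame.f_code.co_name] += 1

def helper(i):
    return i * i

def work():
    return sum(helper(i) for i in range(3))

sys.setprofile(profile_callback)
work()
sys.setprofile(None)  # uninstall the callback

print(call_counts["helper"])  # helper was entered 3 times
```

sys.settrace works the same way but additionally delivers "line" events, which is what line_profiler builds on.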
6. Disadvantage of tracing profilers
● The main disadvantage of tracing profilers implemented
in this way is that they introduce a fixed amount of
latency for every function call / line of code executed.
● See example
● The documentation for cProfile says:
○ “The interpreted nature of Python tends to add so much
overhead to execution, that deterministic profiling tends to
only add small processing overhead in typical applications”
● This makes sense, since typical programs do not make
that many function calls.
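The fixed per-call cost is easy to see by timing a call-heavy loop with and without cProfile (the exact numbers vary by machine; this is only an illustration):

```python
import cProfile
import time

def tiny():
    return 1

def many_calls(n=50_000):
    # Lots of cheap function calls: the worst case for a tracing profiler
    for _ in range(n):
        tiny()

start = time.perf_counter()
many_calls()
plain = time.perf_counter() - start

profiler = cProfile.Profile()
profiler.enable()
start = time.perf_counter()
many_calls()
profiled = time.perf_counter() - start
profiler.disable()

print(f"plain: {plain:.4f}s  profiled: {profiled:.4f}s  "
      f"slowdown: {profiled / plain:.1f}x")
```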
7. How do sampling profilers work?
Well – let’s say you want to get a snapshot of a program’s
stack 50 times a second. A way to do that is:
● Ask the Linux kernel to send you a signal every 20
milliseconds (using the setitimer system call)
● Register a signal handler to record the stack every time
you get a signal.
● When you’re done profiling print the output!
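The three steps above can be sketched in pure Python (Unix only, since it relies on signals; `spin` is just an illustrative workload):

```python
import collections
import signal
import time

samples = collections.Counter()

def sample_handler(signum, frame):
    # Walk the interrupted stack and record it as "outer;...;inner"
    stack = []
    while frame is not None:
        stack.append(frame.f_code.co_name)
        frame = frame.f_back
    samples[";".join(reversed(stack))] += 1

# Step 1 + 2: fire SIGPROF every 20 ms (50 samples/second of CPU time)
signal.signal(signal.SIGPROF, sample_handler)
signal.setitimer(signal.ITIMER_PROF, 0.02, 0.02)

def spin(seconds):
    # Busy loop so CPU time (which ITIMER_PROF measures) advances
    end = time.time() + seconds
    while time.time() < end:
        pass

spin(0.5)

signal.setitimer(signal.ITIMER_PROF, 0, 0)  # stop sampling

# Step 3: print the output
for stack, count in samples.most_common(3):
    print(count, stack)
```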
8. Comparison
● A sampling profiler in 61 LOC
● A demo using it.
● A comparison of sampling vs. tracing
● Real Projects:
○ stacksampler
○ pyflame
○ python-flamegraph
12. Performance considerations in design
● Consider performance at design time: not every
business/API requirement can be met as-is, and some
compromises might need to be made.
● Do your best to understand the performance impact, but
no more (beware of analysis paralysis). Invest in a
testable, monitored environment instead.
13. Performance monitoring in vitro
After design and implementation we can check performance
using:
● CI and test suite:
○ Expose speed degradation
○ Have the base to run profiling on.
○ Note: test fakers might produce different data than the
actual data in production
● Staging environment:
○ Test scenarios on the same data as in production.
○ There might be some actions you will not know to
exercise here, yet which cause significant performance
issues in production
14. Performance monitoring in vivo
● Application Performance Monitoring (APM) tools such as
New Relic or Datadog allow you to:
○ Set alerts on certain metrics
○ Analyze real transactions
○ Add custom instrumentation
● Users are the best QA 😱
17. Reference ● Juila Evans blog post -
https://jvns.ca/blog/2017/12/17/how-do-ruby---python-
profilers-work-/
● Nylas blog post -
https://www.nylas.com/blog/performance/
Editor's notes
Show:
ncalls
tottime
cumtime
PyEval_SetTrace is similar to PyEval_SetProfile, except the tracing function does receive line-number events.