SlideShare une entreprise Scribd logo
1  sur  44
Télécharger pour lire hors ligne
Brendan Gregg
Senior Performance Architect
27 Feb 2017
image:	h)p://makeitstranger.com
Observability
Best Possible Performance
&
Root Cause Analysis
Needed:
Observe Everything
In Production
Quickly
Enhanced BPF is in Linux
Linux	4.3	
Linux	4.7	 Linux	4.9	
Linux	4.9	
Linux	4.1	
BPF	stacks	
Linux	4.6	
BPF	output	
Linux	4.4	
Version	BPF	support	arrived
How do we
use these
superpowers?
Off-CPU Analysis
Thread State Analysis
e.t.c.
Pose Q's for tools to A
Methodologies
Current Tools
bcc: BPF Compiler Collection
https://github.com/iovisor/bcc
Single Purpose Tools
Multi-Tools
Single purpose vs Multi-tools
# opensnoop
PID COMM FD ERR PATH
10085 sshd 3 0 /lib/x86_64-linux-gnu/libresolv.so.2
10085 sshd 3 0 /lib/x86_64-linux-gnu/libgpg-error.so.0
10085 sshd 3 0 /dev/urandom
10085 sshd -1 2 /lib/x86_64-linux-gnu/.libcrypto.so.1.0.0.hmac
10085 sshd -1 2 /proc/sys/crypto/fips_enabled
# trace 'do_sys_open "%s", arg2' 'r::do_sys_open "ret:%d", retval'
PID TID COMM FUNC -
1651 1651 redis-server do_sys_open /proc/1651/stat
1968 1968 redis-server do_sys_open /proc/1968/stat
1651 1651 redis-server do_sys_open ret:5
1968 1968 redis-server do_sys_open ret:5
2218 2218 snmp-pass do_sys_open /proc/cpuinfo
2218 2218 snmp-pass do_sys_open ret:4
2218 2218 snmp-pass do_sys_open /proc/stat
2218 2218 snmp-pass do_sys_open ret:4
Single purpose tool usage
# biolatency -h
usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count]
Summarize block device I/O latency as a histogram
positional arguments:
interval output interval, in seconds
count number of outputs
optional arguments:
-h, --help show this help message and exit
-T, --timestamp include timestamp on output
-Q, --queued include OS queued time in I/O time
-m, --milliseconds millisecond histogram
-D, --disks print a histogram per disk device
examples:
./biolatency # summarize block I/O latency as a histogram
[...]
CLI
Tool Design
Template 1: Per Event Output
# opensnoop
PID COMM FD ERR PATH
10085 sshd 3 0 /lib/x86_64-linux-gnu/libkeyutils.so.1
10085 sshd 3 0 /lib/x86_64-linux-gnu/libresolv.so.2
10085 sshd 3 0 /lib/x86_64-linux-gnu/libgpg-error.so.0
10085 sshd 3 0 /dev/urandom
10085 sshd -1 2 /lib/x86_64-linux-gnu/.libcrypto.so.1.0.0.hmac
10085 sshd -1 2 /proc/sys/crypto/fips_enabled
10085 sshd 3 0 /proc/filesystems
10085 sshd 3 0 /dev/null
10085 sshd 3 0 /proc/10085/fd
10085 sshd 3 0 /usr/lib/ssl/openssl.cnf
10085 sshd 3 0 /etc/gai.conf
10085 sshd 3 0 /etc/nsswitch.conf
10085 sshd 3 0 /etc/ld.so.cache
10085 sshd 3 0 /lib/x86_64-linux-gnu/libnss_compat.so.2
10085 sshd 3 0 /etc/ld.so.cache
10085 sshd 3 0 /lib/x86_64-linux-gnu/libnss_nis.so.2
[…]
Template 2: Filtered Event Output
# ext4slower 1
Tracing ext4 operations slower than 1 ms
TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME
06:49:17 bash 3616 R 128 0 7.75 cksum
06:49:17 cksum 3616 R 39552 0 1.34 [
06:49:17 cksum 3616 R 96 0 5.36 2to3-2.7
06:49:17 cksum 3616 R 96 0 14.94 2to3-3.4
06:49:17 cksum 3616 R 10320 0 6.82 411toppm
06:49:17 cksum 3616 R 65536 0 4.01 a2p
06:49:17 cksum 3616 R 55400 0 8.77 ab
06:49:17 cksum 3616 R 36792 0 16.34 aclocal-1.14
06:49:17 cksum 3616 R 15008 0 19.31 acpi_listen
06:49:17 cksum 3616 R 6123 0 17.23 add-apt-repository
06:49:17 cksum 3616 R 6280 0 18.40 addpart
06:49:17 cksum 3616 R 27696 0 2.16 addr2line
06:49:17 cksum 3616 R 58080 0 10.11 ag
06:49:17 cksum 3616 R 906 0 6.30 ec2-meta-data
06:49:17 cksum 3616 R 6320 0 10.00 animate.im6
[…]
Template 3: Interval Summary
# dcstat
TIME REFS/s SLOW/s MISS/s HIT%
08:11:47: 2059 141 97 95.29
08:11:48: 79974 151 106 99.87
08:11:49: 192874 146 102 99.95
08:11:50: 2051 144 100 95.12
08:11:51: 73373 17239 17194 76.57
08:11:52: 54685 25431 25387 53.58
08:11:53: 18127 8182 8137 55.12
08:11:54: 22517 10345 10301 54.25
08:11:55: 7524 2881 2836 62.31
08:11:56: 2067 141 97 95.31
08:11:57: 2115 145 101 95.22
[…]
Template 4: Count Summary
# funccount 'vfs_*'
Tracing... Ctrl-C to end.
^C
ADDR FUNC COUNT
ffffffff811efe81 vfs_create 1
ffffffff811f24a1 vfs_rename 1
ffffffff81215191 vfs_fsync_range 2
ffffffff81231df1 vfs_lock_file 30
ffffffff811e8dd1 vfs_fstatat 152
ffffffff811e8d71 vfs_fstat 154
ffffffff811e4381 vfs_write 166
ffffffff811e8c71 vfs_getattr_nosec 262
ffffffff811e8d41 vfs_getattr 262
ffffffff811e3221 vfs_open 264
ffffffff811e4251 vfs_read 470
Detaching...
Template 5: Histogram Summary
# biolatency
Tracing block device I/O... Hit Ctrl-C to end.
^C
usecs : count distribution
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 1 | |
128 -> 255 : 12 |******** |
256 -> 511 : 15 |********** |
512 -> 1023 : 43 |******************************* |
1024 -> 2047 : 52 |**************************************|
2048 -> 4095 : 47 |********************************** |
4096 -> 8191 : 52 |**************************************|
8192 -> 16383 : 36 |************************** |
16384 -> 32767 : 15 |********** |
32768 -> 65535 : 2 |* |
65536 -> 131071 : 2 |* |
Template 6: Heatmap Summary
Template 7: Folded stack output for flame graphs
offcputime -f
offwaketime -f
wakeuptime -f
profile -f
| flamegraph.pl
> out.svg
Valuable
Know what already exists
and what doesn't
Documented
code comments
man pages
example files
Concise, intuitive
self-explanatory
# iolatency
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.
>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 4381 |######################################|
1 -> 2 : 9 |# |
2 -> 4 : 5 |# |
4 -> 8 : 0 | |
8 -> 16 : 1 |# |
[…]
# ./biolatency -h
usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count]
Summarize block device I/O latency as a histogram
positional arguments:
interval output interval, in seconds
count number of outputs
optional arguments:
-h, --help show this help message and exit
-T, --timestamp include timestamp on output
-Q, --queued include OS queued time in I/O time
-m, --milliseconds millisecond histogram
-D, --disks print a histogram per disk device
examples:
./biolatency # summarize block I/O latency as a histogram
./biolatency 1 10 # print 1 second summaries, 10 times
./biolatency -mT 1 # 1s summaries, milliseconds, and timestamps
./biolatency -Q # include OS queued time in I/O time
./biolatency -D # show each disk device separately
.POSIX-style.
arguments
Op>on	 Alternate	 Expecta>on	
-a	 --all	 all	events	
-c	CMD	 --cmd	…	 run	this	command	
-d	SECONDS	 --duraAon	…	 duraAon	of	tool	execuAon	
-h	 --help	 help	
-i	FILE	 --input	…	 input	file	
-i	SECONDS	 --interval	…	 summary	interval	
-n	name	 --name	…	 this	process	name	only	
-o	FILE	 --output	…	 output	file	
-p	PID	 --pid	…	 this	process	ID	only	
-P	 --by-process	 per-process	ID	breakdown	
-P	PORT	 --port	…	 this	TCP	port	only	
-t	or	-T	 --[no]Amestamp	 include	or	exclude	Amestamps	
-v	 --verbose	 verbose	output	
-x	 --extended,	--errors	 extended	output,	or	only	failures	
[interval	[count]]	 -	 summary	interval,	and	#	of	outputs
Tested
If you can't write the workload,
you can't write the tool
Future
Challenges
State of BPF, Feb 2017
1.  Dynamic	tracing,	kernel-level	(BPF	support	for	kprobes)	
2.  Dynamic	tracing,	user-level	(BPF	support	for	uprobes)	
3.  StaAc	tracing,	kernel-level	(BPF	support	for	tracepoints)	
4.  Timed	sampling	events	(BPF	with	perf_event_open)	
5.  PMC	events	(BPF	with	perf_event_open)	
6.  Filtering	(via	BPF	programs)	
7.  Debug	output	(bpf_trace_printk())	
8.  Per-event	output	(bpf_perf_event_output())	
9.  Basic	variables	(global	&	per-thread	variables,	via	BPF	maps)	
10.  AssociaAve	arrays	(via	BPF	maps)	
11.  Frequency	counAng	(via	BPF	maps)	
12.  Histograms	(power-of-2,	linear,	and	custom,	via	BPF	maps)	
13.  Timestamps	and	Ame	deltas	(bpf_kAme_get_()	and	BPF)	
14.  Stack	traces,	kernel	(BPF	stackmap)	
15.  Stack	traces,	user	(BPF	stackmap)	
16.  Overwrite	ring	buffers	
17.  String	factory	(stringmap)	
18.  OpAonal:	bounded	loops,	<	and	<=,	…	
1.  StaAc	tracing,	user-level	(USDT	probes	via	uprobes)	
2.  StaAc	tracing,	dynamic	USDT	(needs	library	support)	
3.  Debug	output	(Python	with	BPF.trace_pipe()	and	
BPF.trace_fields())	
4.  Per-event	output	(BPF_PERF_OUTPUT	macro	and	
BPF.open_perf_buffer())	
5.  Interval	output	(BPF.get_table()	and	table.clear())	
6.  Histogram	prinAng	(table.print_log2_hist())	
7.  C	struct	navigaAon,	kernel-level	(maps	to	bpf_probe_read())	
8.  Symbol	resoluAon,	kernel-level	(ksym(),	ksymaddr())	
9.  Symbol	resoluAon,	user-level	(usymaddr())	
10.  BPF	tracepoint	support	(via	TRACEPOINT_PROBE)	
11.  BPF	stack	trace	support	(incl.	walk	method	for	stack	frames)	
12.  Examples	(under	/examples)	
13.  Many	tools	(/tools)	
14.  Tutorials	(/docs/tutorial*.md)	
15.  Reference	guide	(/docs/reference_guide.md)	
16.  Open	issues:	(h)ps://github.com/iovisor/bcc/issues)	
State	of	bcc,	Feb	2017	
done	
not	yet
Dynamic tracing stability
need those smoke tests
switch tools to static tracepoints
Invalid Tools
Overhead
Especially current uprobes
Ease of Coding
bcc/BPF
bcc	examples/tracing/bitehist.py	
en>re	program
ply/BPF
h)ps://github.com/wkz/ply/blob/master/README.md	
en>re	program
Visualizations
Visualizations and GUIs
Flame Graphs
Tracing Reports
…
Eg,	Nejlix	self-service	UI:
Ancient Linux
Linux 3.18
Linux 3.10
Linux 3.2
Linux 2.6.x
(Some) More Tools
Finish porting my old DTrace tools
Links & References
iovisor bcc:
•  https://github.com/iovisor/bcc
•  https://github.com/iovisor/bcc/tree/master/docs
•  http://www.brendangregg.com/blog/ (search for "bcc")
•  http://blogs.microsoft.co.il/sasha/2016/02/14/two-new-ebpf-tools-memleak-and-argdist/
•  I'll change your view of Linux tracing: https://www.youtube.com/watch?v=GsMs3n8CB6g
•  On designing tracing tools: https://www.youtube.com/watch?v=uibLwoVKjec
BPF:
•  https://www.kernel.org/doc/Documentation/networking/filter.txt
•  https://github.com/iovisor/bpf-docs
•  https://suchakra.wordpress.com/tag/bpf/
Flame Graphs:
•  http://www.brendangregg.com/flamegraphs.html
•  http://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html
•  http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html
Linux Performance: http://www.brendangregg.com/linuxperf.html
Thanks
Discussion?
iovisor bcc: https://github.com/iovisor/bcc
http://www.brendangregg.com
http://slideshare.net/brendangregg
bgregg@netflix.com
@brendangregg
Thanks to Alexei Starovoitov (Facebook), Brenden Blanco
(PLUMgrid/VMware), Sasha Goldshtein (Sela), Daniel
Borkmann (Cisco), Wang Nan (Huawei), and other BPF
and bcc contributors!

Contenu connexe

Tendances

BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and more
Brendan Gregg
 
Linux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - WonokaerunLinux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - Wonokaerun
idsecconf
 
Kernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPFKernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPF
Brendan Gregg
 
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
Brendan Gregg
 

Tendances (20)

Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
Performance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting Started
 
BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and more
 
Performance Analysis Tools for Linux Kernel
Performance Analysis Tools for Linux KernelPerformance Analysis Tools for Linux Kernel
Performance Analysis Tools for Linux Kernel
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPF
 
Linux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - WonokaerunLinux kernel-rootkit-dev - Wonokaerun
Linux kernel-rootkit-dev - Wonokaerun
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
 
LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
 
re:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
 
Kernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPFKernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPF
 
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
 
Spying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profitSpying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profit
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
 
Linux Performance Profiling and Monitoring
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and Monitoring
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
 
Introduction to eBPF and XDP
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDP
 
ATO Linux Performance 2018
ATO Linux Performance 2018ATO Linux Performance 2018
ATO Linux Performance 2018
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
 

En vedette

CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016]
IO Visor Project
 
email intruction style guide
email intruction style guideemail intruction style guide
email intruction style guide
cameron thomas
 

En vedette (20)

IO Visor Summit 2017: Welcome & Overview via Pere Monclus
IO Visor Summit 2017: Welcome & Overview via Pere MonclusIO Visor Summit 2017: Welcome & Overview via Pere Monclus
IO Visor Summit 2017: Welcome & Overview via Pere Monclus
 
Using IO Visor to Secure Microservices Running on CloudFoundry [OpenStack Sum...
Using IO Visor to Secure Microservices Running on CloudFoundry [OpenStack Sum...Using IO Visor to Secure Microservices Running on CloudFoundry [OpenStack Sum...
Using IO Visor to Secure Microservices Running on CloudFoundry [OpenStack Sum...
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016]
 
An Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux ContainersAn Updated Performance Comparison of Virtual Machines and Linux Containers
An Updated Performance Comparison of Virtual Machines and Linux Containers
 
Sensu Monitoring
Sensu MonitoringSensu Monitoring
Sensu Monitoring
 
email intruction style guide
email intruction style guideemail intruction style guide
email intruction style guide
 
Demystifying datastores
Demystifying datastoresDemystifying datastores
Demystifying datastores
 
Introduction to concurrent programming with akka actors
Introduction to concurrent programming with akka actorsIntroduction to concurrent programming with akka actors
Introduction to concurrent programming with akka actors
 
Redis overview
Redis overviewRedis overview
Redis overview
 
Evolving Virtual Networking with IO Visor [OpenStack Summit Austin | April 2016]
Evolving Virtual Networking with IO Visor [OpenStack Summit Austin | April 2016]Evolving Virtual Networking with IO Visor [OpenStack Summit Austin | April 2016]
Evolving Virtual Networking with IO Visor [OpenStack Summit Austin | April 2016]
 
Golang 101 (Concurrency vs Parallelism)
Golang 101 (Concurrency vs Parallelism)Golang 101 (Concurrency vs Parallelism)
Golang 101 (Concurrency vs Parallelism)
 
Common problems with email
Common problems with emailCommon problems with email
Common problems with email
 
Introduction of Mesosphere DCOS
Introduction of Mesosphere DCOSIntroduction of Mesosphere DCOS
Introduction of Mesosphere DCOS
 
Сергей Радзыняк ".NET Microservices in Real Life"
Сергей Радзыняк ".NET Microservices in Real Life"Сергей Радзыняк ".NET Microservices in Real Life"
Сергей Радзыняк ".NET Microservices in Real Life"
 
CloudConf2017 - Deploy, Scale & Coordinate a microservice oriented application
CloudConf2017 - Deploy, Scale & Coordinate a microservice oriented applicationCloudConf2017 - Deploy, Scale & Coordinate a microservice oriented application
CloudConf2017 - Deploy, Scale & Coordinate a microservice oriented application
 
Introduction To Anypoint CloudHub With Mulesoft
Introduction To Anypoint CloudHub With MulesoftIntroduction To Anypoint CloudHub With Mulesoft
Introduction To Anypoint CloudHub With Mulesoft
 
Redis原生命令介绍
Redis原生命令介绍Redis原生命令介绍
Redis原生命令介绍
 
In out system
In out systemIn out system
In out system
 
The end of embedded Linux (as we know it)
The end of embedded Linux (as we know it)The end of embedded Linux (as we know it)
The end of embedded Linux (as we know it)
 
Running .NET on Docker
Running .NET on DockerRunning .NET on Docker
Running .NET on Docker
 

Similaire à bcc/BPF tools - Strategy, current tools, future challenges

Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Anne Nicolas
 
HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
Linaro
 
InstructionsInstructions for numberguessernumberGuesser.html.docx
InstructionsInstructions for numberguessernumberGuesser.html.docxInstructionsInstructions for numberguessernumberGuesser.html.docx
InstructionsInstructions for numberguessernumberGuesser.html.docx
dirkrplav
 
Crash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_TizenCrash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_Tizen
Lex Yu
 

Similaire à bcc/BPF tools - Strategy, current tools, future challenges (20)

Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
Kernel Recipes 2017 - Performance analysis Superpowers with Linux BPF - Brend...
 
Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
 
Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
 
Linux Capabilities - eng - v2.1.5, compact
Linux Capabilities - eng - v2.1.5, compactLinux Capabilities - eng - v2.1.5, compact
Linux Capabilities - eng - v2.1.5, compact
 
C&C Botnet Factory
C&C Botnet FactoryC&C Botnet Factory
C&C Botnet Factory
 
SOFA Tutorial
SOFA TutorialSOFA Tutorial
SOFA Tutorial
 
Reverse engineering Swisscom's Centro Grande Modem
Reverse engineering Swisscom's Centro Grande ModemReverse engineering Swisscom's Centro Grande Modem
Reverse engineering Swisscom's Centro Grande Modem
 
HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
 
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015: Linux Performance Profiling and Monitoring by Werner Fischer
 
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner FischerOSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
OSMC 2015 | Linux Performance Profiling and Monitoring by Werner Fischer
 
InstructionsInstructions for numberguessernumberGuesser.html.docx
InstructionsInstructions for numberguessernumberGuesser.html.docxInstructionsInstructions for numberguessernumberGuesser.html.docx
InstructionsInstructions for numberguessernumberGuesser.html.docx
 
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoringOSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
OSDC 2017 - Werner Fischer - Linux performance profiling and monitoring
 
Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing Landscape
 
Debugging Ruby
Debugging RubyDebugging Ruby
Debugging Ruby
 
Crash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_TizenCrash_Report_Mechanism_In_Tizen
Crash_Report_Mechanism_In_Tizen
 
Processes And Job Control
Processes And Job ControlProcesses And Job Control
Processes And Job Control
 
The New Systems Performance
The New Systems PerformanceThe New Systems Performance
The New Systems Performance
 
test
testtest
test
 
LISA2019 Linux Systems Performance
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems Performance
 
Network Adapter Deep dive
Network Adapter Deep diveNetwork Adapter Deep dive
Network Adapter Deep dive
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 

bcc/BPF tools - Strategy, current tools, future challenges

  • 1. Brendan Gregg Senior Performance Architect 27 Feb 2017 image: h)p://makeitstranger.com
  • 5.
  • 6. Enhanced BPF is in Linux
  • 8. How do we use these superpowers?
  • 9. Off-CPU Analysis Thread State Analysis e.t.c. Pose Q's for tools to A Methodologies
  • 11. bcc: BPF Compiler Collection https://github.com/iovisor/bcc
  • 12.
  • 14. Single purpose vs Multi-tools # opensnoop PID COMM FD ERR PATH 10085 sshd 3 0 /lib/x86_64-linux-gnu/libresolv.so.2 10085 sshd 3 0 /lib/x86_64-linux-gnu/libgpg-error.so.0 10085 sshd 3 0 /dev/urandom 10085 sshd -1 2 /lib/x86_64-linux-gnu/.libcrypto.so.1.0.0.hmac 10085 sshd -1 2 /proc/sys/crypto/fips_enabled # trace 'do_sys_open "%s", arg2' 'r::do_sys_open "ret:%d", retval' PID TID COMM FUNC - 1651 1651 redis-server do_sys_open /proc/1651/stat 1968 1968 redis-server do_sys_open /proc/1968/stat 1651 1651 redis-server do_sys_open ret:5 1968 1968 redis-server do_sys_open ret:5 2218 2218 snmp-pass do_sys_open /proc/cpuinfo 2218 2218 snmp-pass do_sys_open ret:4 2218 2218 snmp-pass do_sys_open /proc/stat 2218 2218 snmp-pass do_sys_open ret:4
  • 15. Single purpose tool usage # biolatency -h usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count] Summarize block device I/O latency as a histogram positional arguments: interval output interval, in seconds count number of outputs optional arguments: -h, --help show this help message and exit -T, --timestamp include timestamp on output -Q, --queued include OS queued time in I/O time -m, --milliseconds millisecond histogram -D, --disks print a histogram per disk device examples: ./biolatency # summarize block I/O latency as a histogram [...]
  • 17. Template 1: Per Event Output # opensnoop PID COMM FD ERR PATH 10085 sshd 3 0 /lib/x86_64-linux-gnu/libkeyutils.so.1 10085 sshd 3 0 /lib/x86_64-linux-gnu/libresolv.so.2 10085 sshd 3 0 /lib/x86_64-linux-gnu/libgpg-error.so.0 10085 sshd 3 0 /dev/urandom 10085 sshd -1 2 /lib/x86_64-linux-gnu/.libcrypto.so.1.0.0.hmac 10085 sshd -1 2 /proc/sys/crypto/fips_enabled 10085 sshd 3 0 /proc/filesystems 10085 sshd 3 0 /dev/null 10085 sshd 3 0 /proc/10085/fd 10085 sshd 3 0 /usr/lib/ssl/openssl.cnf 10085 sshd 3 0 /etc/gai.conf 10085 sshd 3 0 /etc/nsswitch.conf 10085 sshd 3 0 /etc/ld.so.cache 10085 sshd 3 0 /lib/x86_64-linux-gnu/libnss_compat.so.2 10085 sshd 3 0 /etc/ld.so.cache 10085 sshd 3 0 /lib/x86_64-linux-gnu/libnss_nis.so.2 […]
  • 18. Template 2: Filtered Event Output # ext4slower 1 Tracing ext4 operations slower than 1 ms TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME 06:49:17 bash 3616 R 128 0 7.75 cksum 06:49:17 cksum 3616 R 39552 0 1.34 [ 06:49:17 cksum 3616 R 96 0 5.36 2to3-2.7 06:49:17 cksum 3616 R 96 0 14.94 2to3-3.4 06:49:17 cksum 3616 R 10320 0 6.82 411toppm 06:49:17 cksum 3616 R 65536 0 4.01 a2p 06:49:17 cksum 3616 R 55400 0 8.77 ab 06:49:17 cksum 3616 R 36792 0 16.34 aclocal-1.14 06:49:17 cksum 3616 R 15008 0 19.31 acpi_listen 06:49:17 cksum 3616 R 6123 0 17.23 add-apt-repository 06:49:17 cksum 3616 R 6280 0 18.40 addpart 06:49:17 cksum 3616 R 27696 0 2.16 addr2line 06:49:17 cksum 3616 R 58080 0 10.11 ag 06:49:17 cksum 3616 R 906 0 6.30 ec2-meta-data 06:49:17 cksum 3616 R 6320 0 10.00 animate.im6 […]
  • 19. Template 3: Interval Summary # dcstat TIME REFS/s SLOW/s MISS/s HIT% 08:11:47: 2059 141 97 95.29 08:11:48: 79974 151 106 99.87 08:11:49: 192874 146 102 99.95 08:11:50: 2051 144 100 95.12 08:11:51: 73373 17239 17194 76.57 08:11:52: 54685 25431 25387 53.58 08:11:53: 18127 8182 8137 55.12 08:11:54: 22517 10345 10301 54.25 08:11:55: 7524 2881 2836 62.31 08:11:56: 2067 141 97 95.31 08:11:57: 2115 145 101 95.22 […]
  • 20. Template 4: Count Summary # funccount 'vfs_*' Tracing... Ctrl-C to end. ^C ADDR FUNC COUNT ffffffff811efe81 vfs_create 1 ffffffff811f24a1 vfs_rename 1 ffffffff81215191 vfs_fsync_range 2 ffffffff81231df1 vfs_lock_file 30 ffffffff811e8dd1 vfs_fstatat 152 ffffffff811e8d71 vfs_fstat 154 ffffffff811e4381 vfs_write 166 ffffffff811e8c71 vfs_getattr_nosec 262 ffffffff811e8d41 vfs_getattr 262 ffffffff811e3221 vfs_open 264 ffffffff811e4251 vfs_read 470 Detaching...
  • 21. Template 5: Histogram Summary # biolatency Tracing block device I/O... Hit Ctrl-C to end. ^C usecs : count distribution 4 -> 7 : 0 | | 8 -> 15 : 0 | | 16 -> 31 : 0 | | 32 -> 63 : 0 | | 64 -> 127 : 1 | | 128 -> 255 : 12 |******** | 256 -> 511 : 15 |********** | 512 -> 1023 : 43 |******************************* | 1024 -> 2047 : 52 |**************************************| 2048 -> 4095 : 47 |********************************** | 4096 -> 8191 : 52 |**************************************| 8192 -> 16383 : 36 |************************** | 16384 -> 32767 : 15 |********** | 32768 -> 65535 : 2 |* | 65536 -> 131071 : 2 |* |
  • 23. Template 7: Folded stack output for flame graphs offcputime -f offwaketime -f wakeuptime -f profile -f | flamegraph.pl > out.svg
  • 24. Valuable Know what already exists and what doesn't
  • 26. Concise, intuitive self-explanatory # iolatency Tracing block I/O. Output every 1 seconds. Ctrl-C to end. >=(ms) .. <(ms) : I/O |Distribution | 0 -> 1 : 4381 |######################################| 1 -> 2 : 9 |# | 2 -> 4 : 5 |# | 4 -> 8 : 0 | | 8 -> 16 : 1 |# | […]
  • 27. # ./biolatency -h usage: biolatency [-h] [-T] [-Q] [-m] [-D] [interval] [count] Summarize block device I/O latency as a histogram positional arguments: interval output interval, in seconds count number of outputs optional arguments: -h, --help show this help message and exit -T, --timestamp include timestamp on output -Q, --queued include OS queued time in I/O time -m, --milliseconds millisecond histogram -D, --disks print a histogram per disk device examples: ./biolatency # summarize block I/O latency as a histogram ./biolatency 1 10 # print 1 second summaries, 10 times ./biolatency -mT 1 # 1s summaries, milliseconds, and timestamps ./biolatency -Q # include OS queued time in I/O time ./biolatency -D # show each disk device separately .POSIX-style. arguments
  • 28. Op>on Alternate Expecta>on -a --all all events -c CMD --cmd … run this command -d SECONDS --duraAon … duraAon of tool execuAon -h --help help -i FILE --input … input file -i SECONDS --interval … summary interval -n name --name … this process name only -o FILE --output … output file -p PID --pid … this process ID only -P --by-process per-process ID breakdown -P PORT --port … this TCP port only -t or -T --[no]Amestamp include or exclude Amestamps -v --verbose verbose output -x --extended, --errors extended output, or only failures [interval [count]] - summary interval, and # of outputs
  • 29. Tested If you can't write the workload, you can't write the tool
  • 31. State of BPF, Feb 2017 1.  Dynamic tracing, kernel-level (BPF support for kprobes) 2.  Dynamic tracing, user-level (BPF support for uprobes) 3.  StaAc tracing, kernel-level (BPF support for tracepoints) 4.  Timed sampling events (BPF with perf_event_open) 5.  PMC events (BPF with perf_event_open) 6.  Filtering (via BPF programs) 7.  Debug output (bpf_trace_printk()) 8.  Per-event output (bpf_perf_event_output()) 9.  Basic variables (global & per-thread variables, via BPF maps) 10.  AssociaAve arrays (via BPF maps) 11.  Frequency counAng (via BPF maps) 12.  Histograms (power-of-2, linear, and custom, via BPF maps) 13.  Timestamps and Ame deltas (bpf_kAme_get_() and BPF) 14.  Stack traces, kernel (BPF stackmap) 15.  Stack traces, user (BPF stackmap) 16.  Overwrite ring buffers 17.  String factory (stringmap) 18.  OpAonal: bounded loops, < and <=, … 1.  StaAc tracing, user-level (USDT probes via uprobes) 2.  StaAc tracing, dynamic USDT (needs library support) 3.  Debug output (Python with BPF.trace_pipe() and BPF.trace_fields()) 4.  Per-event output (BPF_PERF_OUTPUT macro and BPF.open_perf_buffer()) 5.  Interval output (BPF.get_table() and table.clear()) 6.  Histogram prinAng (table.print_log2_hist()) 7.  C struct navigaAon, kernel-level (maps to bpf_probe_read()) 8.  Symbol resoluAon, kernel-level (ksym(), ksymaddr()) 9.  Symbol resoluAon, user-level (usymaddr()) 10.  BPF tracepoint support (via TRACEPOINT_PROBE) 11.  BPF stack trace support (incl. walk method for stack frames) 12.  Examples (under /examples) 13.  Many tools (/tools) 14.  Tutorials (/docs/tutorial*.md) 15.  Reference guide (/docs/reference_guide.md) 16.  Open issues: (h)ps://github.com/iovisor/bcc/issues) State of bcc, Feb 2017 done not yet
  • 32. Dynamic tracing stability need those smoke tests switch tools to static tracepoints
  • 39. Visualizations and GUIs Flame Graphs Tracing Reports … Eg, Nejlix self-service UI:
  • 40. Ancient Linux Linux 3.18 Linux 3.10 Linux 3.2 Linux 2.6.x
  • 42. Finish porting my old DTrace tools
  • 43. Links & References iovisor bcc: •  https://github.com/iovisor/bcc •  https://github.com/iovisor/bcc/tree/master/docs •  http://www.brendangregg.com/blog/ (search for "bcc") •  http://blogs.microsoft.co.il/sasha/2016/02/14/two-new-ebpf-tools-memleak-and-argdist/ •  I'll change your view of Linux tracing: https://www.youtube.com/watch?v=GsMs3n8CB6g •  On designing tracing tools: https://www.youtube.com/watch?v=uibLwoVKjec BPF: •  https://www.kernel.org/doc/Documentation/networking/filter.txt •  https://github.com/iovisor/bpf-docs •  https://suchakra.wordpress.com/tag/bpf/ Flame Graphs: •  http://www.brendangregg.com/flamegraphs.html •  http://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html •  http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html Linux Performance: http://www.brendangregg.com/linuxperf.html
  • 44. Thanks Discussion? iovisor bcc: https://github.com/iovisor/bcc http://www.brendangregg.com http://slideshare.net/brendangregg bgregg@netflix.com @brendangregg Thanks to Alexei Starovoitov (Facebook), Brenden Blanco (PLUMgrid/VMware), Sasha Goldshtein (Sela), Daniel Borkmann (Cisco), Wang Nan (Huawei), and other BPF and bcc contributors!