Using eBPF to Measure the k8s Cluster Health

  1. Using eBPF to Measure the k8s Cluster Health. Henrik Rexed, Cloud Native Advocate
  2. Who is this speaker? ■ Henrik Rexed ■ 15 years of Performance engineering ■ Cloud Native Advocate @Dynatrace ■ Lego Fan ■ Focus on SRE and Observability
  3. Abstract (Hidden) As a k8s cluster admin, your app teams expect your cluster to be available for deploying services at any time without problems. While there is no shortage of metrics in k8s, it's important to have the right metrics to alert on issues and enough data to react to potential availability issues. Prometheus has become a standard and sheds light on the inner behaviour of Kubernetes clusters and workloads. Many of the KPIs (CPU, IO, network, etc.) that we rely on in our on-premise environments are less precise once we start working in a cloud environment. eBPF is well suited to fill that gap, as it gives us information down to the kernel level. In 2018 Cloudflare shared an open source project to expose custom eBPF metrics in Prometheus. Join this session and learn about: • What is eBPF? • What types of metrics can we collect? • How to expose those metrics in a K8s environment. This session will try to deliver a step-by-step guide on how to take advantage of the eBPF exporter.
  4. If you stay with me you will learn... ■ What is eBPF? ■ What is Prometheus? ■ How to expose eBPF metrics in Prometheus?
  5. Detect nodes running out of resources in our Kubernetes cluster ■ Pod status: Evicted
  6. There are techniques to minimize the impact on critical components ■ Define resource ● limits ● requests ■ Define a PriorityClass on your Pods ■ Define quotas on your Nodes and Namespaces
  7. « CPU Utilization is wrong » Brendan Gregg ■ We are all using CPU utilization to monitor the health of our clusters/servers… but what looks like ● Busy ● Idle could really mean ● Busy ● Waiting ("stalled") ● Waiting ("idle") ■ What are we waiting for? ● IO ● Memory ● Network? ■ So, is CPU utilization wrong? It’s time to collect data that will help us understand what is really happening on our systems.
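To make the first two bullets concrete, here is a minimal sketch that builds a Pod manifest carrying resource requests, limits and a priority class. The field names (`resources.requests`, `resources.limits`, `priorityClassName`) are the standard Pod spec fields; the pod name, image and priority class name are hypothetical placeholders, and the PriorityClass object itself is assumed to exist already in the cluster.

```python
import json


def pod_manifest(name, image, priority_class, cpu, memory):
    """Build a Pod manifest whose requests equal its limits."""
    resources = {"cpu": cpu, "memory": memory}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Assumption: a PriorityClass with this name was created beforehand.
            "priorityClassName": priority_class,
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ],
        },
    }


# Hypothetical names, used only for illustration.
manifest = pod_manifest(
    name="critical-api",
    image="example.invalid/critical-api:1.0",
    priority_class="cluster-critical-apps",
    cpu="500m",
    memory="256Mi",
)
print(json.dumps(manifest, indent=2))
```

Because Kubernetes accepts JSON manifests as well as YAML, the printed document could be saved and applied with the usual tooling.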
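One way to tell "busy" from "stalled" is the instructions-per-cycle (IPC) ratio that Brendan Gregg's article proposes: as a rule of thumb, an IPC below 1.0 suggests the CPU is mostly stalled on memory, while an IPC above 1.0 suggests it is actually executing instructions. The sketch below only applies that rule of thumb; the counter values are made up, and on a real system they would come from hardware performance counters.

```python
def ipc(instructions, cycles):
    """Instructions retired per CPU cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def interpret(instructions, cycles, threshold=1.0):
    """Apply the rule of thumb: low IPC means stalled, high IPC means busy."""
    if ipc(instructions, cycles) < threshold:
        return "likely stalled (memory-bound)"
    return "likely instruction-bound"


# Made-up counter samples for two nodes that both report high CPU utilization.
samples = {
    "node-a": (5_000_000, 10_000_000),   # IPC 0.5
    "node-b": (18_000_000, 10_000_000),  # IPC 1.8
}
for node, (instructions, cycles) in samples.items():
    print(node, round(ipc(instructions, cycles), 2), interpret(instructions, cycles))
```

The 1.0 threshold is a starting point rather than a hard limit; the right value depends on the processor and the workload.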
  8. What is eBPF? ■ Liz Rice, Isovalent ■ Brendan Gregg, Netflix ■ Thomas Graf, Isovalent, Cilium
  9. Let’s recap the Linux kernel architecture ■ User space: Process, Process ■ Linux kernel: ● File syscalls: write(), read() ● Socket syscalls: sendmsg(), recvmsg() ● Block device ● Network device ■ Hardware: Storage, Network
  10. BPF programs are attached to a syscall event ■ User space: Process ■ Linux kernel: Syscall write(), vfs_read, Block device
  11. How to report back metrics collected by the BPF program? ■ BPF maps ● Hash map, array ● Stack trace ● …etc ■ With a BPF map you can store: ● Program state ● Metrics ● Statistics ● Traces ■ A BPF map can be shared with user space programs ■ Diagram: User space: Process ● Linux kernel: Syscall write(), vfs_read, BPF map, Block device
  12. How do you create a BPF program with bcc? ■ User space: Process, BCC, BPF program ■ Linux kernel: Verifier, JIT compiler, Syscall write(), BPF map, Block device
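The map pattern can be illustrated without a kernel at all. The sketch below is a pure-Python stand-in, not real BPF: `FakeBpfHash` is a hypothetical class that plays the role of a hash map, `on_io_completed` plays the role of the kernel-side program that updates it on each event, and the final loop plays the role of the user space reader. Latencies are grouped into power-of-two buckets, a common layout for latency histograms.

```python
from collections import defaultdict


def log2_bucket(value):
    """Power-of-two bucket index: floor(log2(value)), with 0 mapped to bucket 0."""
    return max(value, 1).bit_length() - 1


class FakeBpfHash:
    """Hypothetical stand-in for a BPF hash map shared with user space."""

    def __init__(self):
        self._table = defaultdict(int)

    def increment(self, key):
        self._table[key] += 1

    def items(self):
        return sorted(self._table.items())


latency_hist = FakeBpfHash()


def on_io_completed(latency_us):
    """'Kernel side': runs on every event and only updates the map."""
    latency_hist.increment(log2_bucket(latency_us))


# Made-up latency samples in microseconds.
for sample in (3, 5, 6, 130, 900, 1000):
    on_io_completed(sample)

# 'User space side': read the map and print a histogram.
for bucket, count in latency_hist.items():
    low, high = 2 ** bucket, 2 ** (bucket + 1) - 1
    print(f"{low:>5} .. {high:<5} us : {count}")
```

Keeping only counters in the map means the event path stays cheap, and the reader can poll at whatever interval suits it.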
  13. What is Prometheus? TechWorld with Nana
  14. Prometheus architecture ■ Metric sources: kube-state-metrics, Node exporter, cAdvisor, App, Third party, etc. ■ Scrape ■ Prometheus server ■ PromQL ■ Alertmanager
  15. Prometheus is becoming a standard ■ Database: CouchDB, MySQL, Oracle, PostgreSQL, MongoDB, … ■ Hardware: Netgear, Windows, IBM Z, Nvidia, …etc ■ Messaging: MQ, Kafka, MQTT, RabbitMQ, …etc ■ Storage: Tivoli, Hadoop, NetApp, ScaleIO ■ Others: Jira, Jenkins, GitHub, Fluentd, Nagios, …etc
  16. Let’s bring automation ■ Prometheus & Grafana can be configured directly from configuration files: ● Prometheus.yaml ● Alertmanager.yaml ● Dashboard.json
  17. Scraping configuration in K8s ■ A ServiceMonitor maps all the services that expose Prometheus metrics. The ServiceMonitor has a label selector to select services and the information required to scrape the metrics: ● Port ● Interval
  18. Let’s use an existing project that exposes metrics
  19. The Cloudflare eBPF exporter ■ Its purpose is to run bcc tools to collect metrics and expose them in Prometheus ■ This exporter can be used on: ● Physical servers ● VMs ● Containers ● Kubernetes ■ The GitHub repository has several examples to collect: ● Bio latency ● Cachestat ● CPU instructions ● …etc ■ cloudflare/ebpf_exporter: Prometheus exporter for custom eBPF metrics
  20. What does the deployment look like in k8s? ■ ConfigMap ■ DaemonSet ■ Service ■ Scraping annotation ■ BPF program ■ ServiceMonitor
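What the Prometheus server scrapes from each source is plain text in the Prometheus exposition format: a `# HELP` line, a `# TYPE` line, then one line per labelled sample. The sketch below renders a counter in that format. `render_counter` is a hypothetical helper written for this illustration, and the label value is a made-up example.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline, as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name, help_text, samples):
    """Render (labels, value) pairs as a counter in the text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


text = render_counter(
    "timer_start_total",
    "Timers fired in the kernel",
    [({"function": "tcp_keepalive_timer"}, 12)],  # made-up sample
)
print(text, end="")
```

An exporter serves this text over HTTP, conventionally on a `/metrics` path, and Prometheus pulls it at each scrape interval.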
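The label selector works by equality: a Service is selected only when it carries every label listed under `matchLabels` with the same value. A minimal sketch of that matching rule follows, assuming simple `matchLabels` semantics only (no `matchExpressions`); the Service names and the `frontend` labels are hypothetical, while `app: bpf-exporter` is the label used later in this deck.

```python
def selector_matches(match_labels, labels):
    """True when every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# Hypothetical Services and their labels.
services = {
    "bpf-exporter": {"app": "bpf-exporter"},
    "frontend": {"app": "frontend", "tier": "web"},
}

selector = {"app": "bpf-exporter"}
selected = [
    name for name, labels in services.items() if selector_matches(selector, labels)
]
print(selected)
```

A selector that matches nothing is a common reason for a target not showing up in Prometheus, so comparing the Service labels with the selector is a useful first check.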
  21. Your BPF program will be defined in the ConfigMap ■ Data (config.yaml) ● Program ● List of metrics ● Kprobe ● Kretprobe ● Raw tracepoints ● Perf events ● Code (C language)
      config.yaml: |
        programs:
          - name: timers
            metrics:
              counters:
                - name: timer_start_total
                  help: Timers fired in the kernel
                  table: counts
                  labels:
                    - name: function
                      size: 8
                      decoders:
                        - name: ksym
            raw_tracepoints:
              timer_start: raw_tracepoint__timer_start
            code: |
              #include <linux/timer.h>
              BPF_HASH(counts, u64);
              RAW_TRACEPOINT_PROBE(timer_start) {
                  // TP_PROTO(struct timer_list *timer,
                  //          unsigned long expires,
                  //          unsigned int flags),
                  struct timer_list *timer = (struct timer_list *)ctx->args[0];
                  void * function = timer->function;
                  counts.increment((u64) function);
                  return 0;
              }
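In this config the map key is a raw kernel address (the timer callback), and the `ksym` decoder turns it into a readable `function` label. The sketch below shows one plausible way such a lookup could work, using a symbol table in the `/proc/kallsyms` layout; it is an illustration under that assumption, not the exporter's actual implementation, and the addresses and map contents are invented.

```python
import bisect

# Invented excerpt in the /proc/kallsyms layout: "<hex address> <type> <symbol>".
SAMPLE_KALLSYMS = """\
ffffffff81100000 T process_timeout
ffffffff81100400 T delayed_work_timer_fn
ffffffff81100900 T tcp_keepalive_timer
"""


def parse_kallsyms(text):
    """Return a sorted list of (address, symbol) pairs."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return the symbol with the highest start address not above the given one."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    return symbols[index][1] if index >= 0 else None


symbols = parse_kallsyms(SAMPLE_KALLSYMS)

# Invented contents of the 'counts' map: address -> number of timers started.
counts = {0xFFFFFFFF81100000: 7, 0xFFFFFFFF81100900: 2}
for address, value in sorted(counts.items()):
    print(f'timer_start_total{{function="{resolve(symbols, address)}"}} {value}')
```

Doing the translation in user space keeps the BPF program small: it only stores an 8-byte key, which matches the `size: 8` setting on the label.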
  22. The cgnet project ■ Its purpose is to collect metrics related to the cgroups of the various pods running in our cluster ■ The GitHub repository has all the examples needed to deploy cgnet in your cluster: kinvolk-archives/cgnet
  23. What does the deployment look like in k8s? ■ DaemonSet ■ Service ■ ServiceMonitor
  24. Step by step approach ■ Step 1: Let’s create a ServiceMonitor to collect the ebpf-exporter metrics ■ Step 2: Prometheus ■ Step 3: Create a dashboard
      apiVersion:
      kind: ServiceMonitor
      metadata:
        name: tracer-servicemon
        labels:
          release: prometheus
      spec:
        endpoints:
          - interval: 10s
            port: metrics
        selector:
          matchLabels:
            app: bpf-exporter
  25. Brought to you by Henrik Rexed @hrexed