Implementing Observability for Kubernetes
José Manuel Ortega (@jmortegac)
Agenda
● Introducing the concept of observability
● Implementing Kubernetes observability
● Observability stack in K8s
● Integrating Prometheus with OpenTelemetry
Introducing the concept of observability
● Software architectures are becoming more complex.
● Pillars of observability: logs, metrics, and traces.
● Observability is now a top priority for DevOps teams.
Introducing the concept of observability
● Monitoring
● Logging
● Tracing
Introducing the concept of observability
● Log: time="2019-12-23T01:27:38-04:00" level=debug msg="Application starting" environment=dev
● Metric: http_requests_total=100
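The contrast between a log and a metric on this slide can be sketched in a few lines of Python. This is only an illustration: the helper name `parse_logfmt` is hypothetical, and the loop of 100 simulated requests is an assumption made to reproduce the counter value shown on the slide.

```python
import shlex
from collections import Counter

# The key=value log line from the slide.
LOG_LINE = 'time="2019-12-23T01:27:38-04:00" level=debug msg="Application starting" environment=dev'


def parse_logfmt(line: str) -> dict[str, str]:
    """Split a key=value log line into a dict, honouring double-quoted values."""
    fields = {}
    for token in shlex.split(line):  # shlex keeps quoted values together
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


# A log describes one event together with its context.
event = parse_logfmt(LOG_LINE)

# A metric is a number aggregated over many events.
metrics = Counter()
for _ in range(100):  # simulate 100 handled requests
    metrics["http_requests_total"] += 1

print(event["level"], event["msg"])
print(metrics["http_requests_total"])
```

The log keeps the detail of a single event, while the metric discards that detail and keeps only the running count.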
Implementing Kubernetes observability
1. Node status. Current health status and availability of the
node.
2. Node resource usage metrics. Disk and memory
utilization, CPU and network bandwidth.
3. Deployment status. Current and desired state of the
deployments in the cluster.
4. Number of pods. Kubernetes internal components and
processes use this information to manage the workload and
schedule the pods.
Implementing Kubernetes observability
1. Kubernetes metrics. These metrics cover the number
and types of resources within a pod. They include
resource limit tracking to avoid running out of system
resources.
2. Container metrics. These metrics capture the utilization of
container-level resources, such as CPU, memory, and
network usage.
3. Application metrics. Such metrics include the number of
active or online users and response times.
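As a rough illustration of how application metrics such as these might be exposed, the sketch below renders samples in the Prometheus text exposition format using only the standard library. The function `render_metric` and the metric name `active_users` are hypothetical, the values are made up for the example, and label-value escaping is deliberately left out.

```python
def render_metric(name, metric_type, help_text, samples):
    """Render one metric family in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    Label values are not escaped here; this is only a sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical application metrics: a gauge and a labelled counter.
text = render_metric("active_users", "gauge", "Users currently online.", [({}, 42)])
text += render_metric(
    "http_requests_total",
    "counter",
    "HTTP requests served.",
    [({"code": "200"}, 100), ({"code": "500"}, 3)],
)

print(text)
```

In practice an official Prometheus client library would normally produce this output; the sketch only shows what the scraped text looks like.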
Observability stack in K8s
● Kubewatch is an open-source Kubernetes monitoring
tool that sends notifications about changes in a
Kubernetes cluster to various communication channels,
such as Slack, Microsoft Teams, or email.
● It monitors Kubernetes resources, such as deployments,
services, and pods, and alerts users in real-time when
changes occur.
https://github.com/vmware-archive/kubewatch
Observability stack in K8s
https://github.com/salesforce/sloop
Observability stack in K8s
● Jaeger is an open-source distributed tracing system
● The tool is designed to monitor and troubleshoot
distributed microservices, mostly focusing on:
○ Distributed context propagation
○ Distributed transaction monitoring
○ Root cause analysis
○ Service dependency analysis
○ Performance/latency optimization
https://www.jaegertracing.io
Observability stack in K8s
https://www.jaegertracing.io/docs/1.46/operator
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simplest
Observability stack in K8s
● Fluentd is an open-source data collector for
unified logging layers.
● It runs on Kubernetes as a DaemonSet, which
ensures that every node runs one copy of the
Fluentd pod.
https://www.fluentd.org
Observability stack in K8s
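# apps/v1 replaces extensions/v1beta1, which Kubernetes 1.16 stopped serving
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd  # illustrative label; must match the template labels
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
        - name: fluentd
          image: quay.io/fluent/fluentd-kubernetes-daemonset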
Observability stack in K8s
● Prometheus is a cloud-native time series data
store with a rich built-in query language for
metrics.
● Collecting data with Prometheus opens up many
possibilities for increasing the observability of
your infrastructure and of the containers running
in your Kubernetes cluster.
https://prometheus.io
Observability stack in K8s
● Multi-dimensional data model
● Prometheus query language (PromQL)
● Data collection
● Storage
● Visualization (Grafana)
https://prometheus.io
Observability stack in K8s
● Most of these metrics can be exported with node_exporter
(https://github.com/prometheus/node_exporter) and cAdvisor
(https://github.com/google/cadvisor):
○ Resource utilization and saturation. The containers' resource
consumption and allocation.
○ The number of failing pods and errors within a specific
namespace.
○ Kubernetes resource capacity. The total number of
nodes, CPU cores, and memory available.
Observability stack in K8s
● Service dependencies & communication map
○ What services are communicating with each other?
○ What HTTP calls are being made?
● Operational monitoring & alerting
○ Is any network communication failing?
○ Is the communication broken on layer 4 (TCP) or layer 7 (HTTP)?
● Application monitoring
○ What is the rate of 5xx or 4xx HTTP response codes for a
particular service or across all clusters?
● Security observability
○ Which services had connections blocked due to network policy?
https://github.com/cilium/hubble
Observability stack in K8s
● Hubble views (screenshots in the original slides):
○ Service Dependency Graph
○ Networking Behavior
○ HTTP Request/Response Rate & Latency
https://github.com/cilium/hubble
Integrating Prometheus with OpenTelemetry
● Receivers: the entry points through which observability
data reaches the Collector.
● Processors: transform the received data before it
is exported to the different backends.
● Exporters: send the data to the different backends,
such as Jaeger or Kafka.
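The receiver, processor, and exporter roles can be pictured with a toy Python pipeline. This is a conceptual sketch only and not the Collector's actual API: the names `otlp_receiver`, `batch_processor`, and `MemoryExporter` are hypothetical, and the spans are hard-coded dictionaries.

```python
from typing import Iterable, Iterator

Span = dict  # a trace span reduced to a plain dict for this sketch


def otlp_receiver() -> Iterator[Span]:
    """Receiver: where telemetry enters the pipeline (hard-coded here)."""
    for i in range(5):
        yield {"trace_id": "t1", "span_id": i, "name": f"op-{i}"}


def batch_processor(spans: Iterable[Span], size: int) -> Iterator[list]:
    """Processor: group spans into batches before they are exported."""
    batch = []
    for span in spans:
        batch.append(span)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the last, possibly smaller, batch
        yield batch


class MemoryExporter:
    """Exporter: hands batches to a backend (here, an in-memory list)."""

    def __init__(self):
        self.sent = []

    def export(self, batch):
        self.sent.append(batch)


# Wire the stages together: receiver -> processor -> exporter.
exporter = MemoryExporter()
for batch in batch_processor(otlp_receiver(), size=2):
    exporter.export(batch)

print([len(b) for b in exporter.sent])  # [2, 2, 1]
```

Each stage only knows about the data it is handed, which is what lets a pipeline swap one backend for another by changing the exporter alone.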
Integrating Prometheus with OpenTelemetry
https://github.com/open-telemetry/opentelemetry-collector
otel-collector:
  image: otel/opentelemetry-collector:latest
  command: [ "--config=/etc/otel-collector-config.yaml" ]
  volumes:
    - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml:Z
  ports:
    - "13133:13133"
    - "4317:4317"
    - "4318:4318"
  depends_on:
    - jaeger
Integrating Prometheus with OpenTelemetry
otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: otel-collector:4317
processors:
  batch:
exporters:
  # Newer Collector releases no longer ship the jaeger exporter;
  # Jaeger accepts OTLP directly.
  jaeger:
    endpoint: jaeger:14250
    tls:
      insecure: true
extensions:
  health_check:
service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]
Integrating Prometheus with OpenTelemetry
https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor
Integrating Prometheus with OpenTelemetry
receivers:
  ..
  prometheus:
    config:
      scrape_configs:
        - job_name: 'service-a'
          scrape_interval: 2s
          metrics_path: '/metrics/prometheus'
          static_configs:
            - targets: [ 'service-a:8080' ]
        - job_name: 'service-b'
          scrape_interval: 2s
          metrics_path: '/actuator/prometheus'
          static_configs:
            - targets: [ 'service-b:8081' ]
        - job_name: 'service-c'
          scrape_interval: 2s
Integrating Prometheus with OpenTelemetry
exporters:
  …
  prometheusremotewrite:
    endpoint: http://prometheus:9090/api/v1/write
    tls:
      insecure: true
● Enable the remote-write receiver in Prometheus with the --web.enable-remote-write-receiver flag.
Integrating Prometheus with OpenTelemetry
https://github.com/open-telemetry/opentelemetry-demo
Conclusions
● Lean on the native capabilities of Kubernetes to
collect and use metrics, so that you know the health
of your pods and of your cluster as a whole.
● Use these metrics to create alerts that proactively
notify you of errors, or even let you anticipate
issues in your applications or infrastructure.
Thank you!
@jmortegac
https://www.linkedin.com/in/jmortega1
https://jmortega.github.io

  • 3. Introducing the concept of observability ● Software architecture is more complex. ● Pillars of observability—logs, metrics, and traces. ● Observability is now a top priority for DevOps teams.
  • 4. Introducing the concept of observability ● Monitoring ● Logging ● Tracing
  • 5. Introducing the concept of observability ● Log: time="2019-12-23T01:27:38-04:00" level=debug msg="Application starting" environment=dev ● Metric: http_requests_total=100
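The difference between the two signal types on this slide can be sketched in a few lines of Python: a log line is a set of key/value fields describing one event, while a metric is a named numeric sample. This is only an illustrative sketch; the helper names `parse_logfmt` and `parse_metric` are hypothetical, and the `name=value` metric notation follows the slide rather than any real exposition format.

```python
import shlex


def parse_logfmt(line: str) -> dict[str, str]:
    """Split a key=value log line into fields, honouring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def parse_metric(sample: str) -> tuple[str, float]:
    """Split a 'name=value' sample (the slide's notation) into a name and a number."""
    name, _, value = sample.partition("=")
    return name.strip(), float(value)


log_line = 'time="2019-12-23T01:27:38-04:00" level=debug msg="Application starting" environment=dev'
log_fields = parse_logfmt(log_line)            # one event, many descriptive fields
metric_name, metric_value = parse_metric("http_requests_total=100")  # one number
```

The log keeps free-form context (`msg`, `environment`), whereas the metric reduces to a value that can be aggregated over time.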
  • 6. Introducing the concept of observability
  • 7. Implementing Kubernetes observability 1. Node status. Current health status and availability of the node. 2. Node resource usage metrics. Disk and memory utilization, CPU and network bandwidth. 3. Deployment status. Current and desired state of the deployments in the cluster. 4. Number of pods. Kubernetes internal components and processes use this information to manage the workload and schedule the pods.
  • 8. Implementing Kubernetes observability 1. Kubernetes metrics. These metrics cover the number and types of resources within a pod, including resource limit tracking to avoid running out of system resources. 2. Container metrics. These metrics capture the utilization of container-level resources, such as CPU, memory, and network usage. 3. Application metrics. These metrics include the number of active or online users and response times.
  • 12. Observability stack in K8s ● Kubewatch is an open-source Kubernetes monitoring tool that sends notifications about changes in a Kubernetes cluster to various communication channels, such as Slack, Microsoft Teams, or email. ● It monitors Kubernetes resources, such as deployments, services, and pods, and alerts users in real-time when changes occur. https://github.com/vmware-archive/kubewatch
  • 13. Observability stack in K8s https://github.com/salesforce/sloop
  • 14. Observability stack in K8s ● Jaeger is an open-source distributed tracing system ● The tool is designed to monitor and troubleshoot distributed microservices, mostly focusing on: ○ Distributed context propagation ○ Distributed transaction monitoring ○ Root cause analysis ○ Service dependency analysis ○ Performance/latency optimization https://www.jaegertracing.io
  • 17. Observability stack in K8s https://www.jaegertracing.io/docs/1.46/operator
      apiVersion: jaegertracing.io/v1
      kind: Jaeger
      metadata:
        name: simplest
  • 18. Observability stack in K8s ● Fluentd is an open-source data collector for a unified logging layer. ● In Kubernetes it runs as a DaemonSet, which ensures that every node runs one copy of the Fluentd pod. https://www.fluentd.org
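  • 19. Observability stack in K8s
      apiVersion: apps/v1   # extensions/v1beta1 no longer serves DaemonSet
      kind: DaemonSet
      metadata:
        name: fluentd
        namespace: kube-system
      spec:
        selector:
          matchLabels:
            name: fluentd
        template:
          metadata:
            labels:
              name: fluentd
          spec:
            containers:
              - name: fluentd
                image: quay.io/fluent/fluentd-kubernetes-daemonset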
  • 21. Observability stack in K8s ● Prometheus is a cloud native time series data store with built-in rich query language for metrics. ● Collecting data with Prometheus opens up many possibilities for increasing the observability of your infrastructure and the containers running in Kubernetes cluster. https://prometheus.io
  • 22. Observability stack in K8s ● Multi-dimensional data model ● Prometheus query language(PromQL) ● Data collection ● Storage ● Visualization(Grafana) https://prometheus.io
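To make the "multi-dimensional data model" bullet concrete, here is a small Python sketch: each series is a metric name plus a set of labels, and a query selects by label and aggregates. The sample data and the `select` and `sum_by` helpers are hypothetical and are not Prometheus APIs; in PromQL the aggregation would be written as `sum by (service) (http_requests_total)`.

```python
# Hypothetical samples: one metric name, several label combinations.
samples = [
    {"name": "http_requests_total", "labels": {"service": "cart", "code": "200"}, "value": 90.0},
    {"name": "http_requests_total", "labels": {"service": "cart", "code": "500"}, "value": 10.0},
    {"name": "http_requests_total", "labels": {"service": "checkout", "code": "200"}, "value": 50.0},
]


def select(series, name, **matchers):
    """Keep the series with this metric name whose labels equal every matcher."""
    return [
        s for s in series
        if s["name"] == name
        and all(s["labels"].get(k) == v for k, v in matchers.items())
    ]


def sum_by(selected, label):
    """Sum values grouped by one label, similar in spirit to PromQL's sum by (...)."""
    totals = {}
    for s in selected:
        key = s["labels"].get(label, "")
        totals[key] = totals.get(key, 0.0) + s["value"]
    return totals


by_service = sum_by(select(samples, "http_requests_total"), "service")
errors = select(samples, "http_requests_total", code="500")
```

Labels are what allow the same metric to be sliced by service, status code, or any other dimension without defining a new metric name for each combination.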
  • 24. Observability stack in K8s ● Most of the metrics can be exported using node_exporter https://github.com/prometheus/node_exporter and cAdvisor https://github.com/google/cadvisor ○ Resource utilization and saturation. The containers’ resource consumption and allocation. ○ The number of failing pods and errors within a specific namespace. ○ Kubernetes resource capacity. The total number of nodes, CPU cores, and memory available.
  • 26. Observability stack in K8s ● Service dependencies & communication map ○ What services are communicating with each other? ○ What HTTP calls are being made? ● Operational monitoring & alerting ○ Is any network communication failing? ○ Is the communication broken on layer 4 (TCP) or layer 7 (HTTP)? ● Application monitoring ○ What is the rate of 5xx or 4xx HTTP response codes for a particular service or across all clusters? ● Security observability ○ Which services had connections blocked due to network policy? https://github.com/cilium/hubble
  • 27. Observability stack in K8s https://github.com/cilium/hubble
  • 28. Observability stack in K8s https://github.com/cilium/hubble Service Dependency Graph
  • 29. Observability stack in K8s https://github.com/cilium/hubble Networking Behavior
  • 30. Observability stack in K8s https://github.com/cilium/hubble HTTP Request/Response Rate & Latency
  • 33. Integrating Prometheus with OpenTelemetry ● Receivers: the data sources of observability information. ● Processors: they process the received information before it is exported to the different backends. ● Exporters: they export the information to the different backends, such as Jaeger or Kafka.
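The receiver, processor, and exporter roles described above can be sketched as a tiny Python pipeline. This is a conceptual model only, not the OpenTelemetry Collector's implementation: `receiver`, `batch_processor`, `ListExporter`, and `run_pipeline` are hypothetical names, and the spans are made-up dictionaries.

```python
from typing import Iterable, Iterator


def receiver() -> Iterator[dict]:
    """Data source: yields five made-up spans."""
    for i in range(5):
        yield {"trace_id": i, "name": f"span-{i}"}


def batch_processor(spans: Iterable[dict], size: int) -> Iterator[list]:
    """Processor: groups incoming spans into batches of at most `size`."""
    batch = []
    for span in spans:
        batch.append(span)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, partially filled batch
        yield batch


class ListExporter:
    """Exporter: stands in for a backend by storing each batch in memory."""

    def __init__(self) -> None:
        self.exported = []

    def export(self, batch: list) -> None:
        self.exported.append(list(batch))


def run_pipeline(source: Iterable[dict], size: int, exporter: ListExporter) -> None:
    """Wire receiver -> processor -> exporter, as a pipeline definition would."""
    for batch in batch_processor(source, size):
        exporter.export(batch)


exporter = ListExporter()
run_pipeline(receiver(), 2, exporter)
```

Keeping the three roles separate is what lets a pipeline swap one backend for another without touching how data is received or processed.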
  • 35. Integrating Prometheus with OpenTelemetry https://github.com/open-telemetry/opentelemetry-collector
      otel-collector:
        image: otel/opentelemetry-collector:latest
        command: [ "--config=/etc/otel-collector-config.yaml" ]
        volumes:
          - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml:Z
        ports:
          - "13133:13133"
          - "4317:4317"
          - "4318:4318"
        depends_on:
          - jaeger
  • 36. Integrating Prometheus with OpenTelemetry otel-collector-config.yaml
      receivers:
        otlp:
          protocols:
            grpc:
              endpoint: otel-collector:4317
      processors:
        batch:
      exporters:
        jaeger:
          endpoint: jaeger:14250
          tls:
            insecure: true
      extensions:
        health_check:
      service:
        extensions: [health_check]
        pipelines:
          traces:
            receivers: [otlp]
            processors: [batch]
            exporters: [jaeger]
  • 37. Integrating Prometheus with OpenTelemetry https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor
  • 39. Integrating Prometheus with OpenTelemetry
      receivers:
        ..
        prometheus:
          config:
            scrape_configs:
              - job_name: 'service-a'
                scrape_interval: 2s
                metrics_path: '/metrics/prometheus'
                static_configs:
                  - targets: [ 'service-a:8080' ]
              - job_name: 'service-b'
                scrape_interval: 2s
                metrics_path: '/actuator/prometheus'
                static_configs:
                  - targets: [ 'service-b:8081' ]
              - job_name: 'service-c'
                scrape_interval: 2s
  • 40. Integrating Prometheus with OpenTelemetry
      exporters:
        …
        prometheusremotewrite:
          endpoint: http://prometheus:9090/api/v1/write
          tls:
            insecure: true
      ● Prometheus must be started with the --web.enable-remote-write-receiver flag so that it accepts remote writes.
  • 41. Integrating Prometheus with OpenTelemetry https://github.com/open-telemetry/opentelemetry-demo
  • 45. Conclusions ● Lean on the native capabilities of Kubernetes to collect and use metrics so that you know the health of your pods and of your cluster in general. ● Use these metrics to create alerts that proactively notify you of errors, or even let you anticipate issues in your applications or infrastructure.
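As a rough illustration of the second conclusion, the sketch below shows one way an alert might be derived from a metric: it fires only after the value has stayed above a threshold for several consecutive samples, which avoids alerting on a single spike. The `is_firing` function, the threshold, and the sample values are all hypothetical; a real setup would normally express this as an alerting rule in the monitoring system rather than in application code.

```python
def is_firing(values, threshold, for_samples):
    """Return True once `values` stays above `threshold` for `for_samples` samples in a row."""
    streak = 0
    for value in values:
        streak = streak + 1 if value > threshold else 0
        if streak >= for_samples:
            return True
    return False


# Hypothetical memory utilisation ratios scraped at a fixed interval.
sustained = [0.50, 0.95, 0.97, 0.99]
spiky = [0.95, 0.50, 0.95, 0.96]

alert_sustained = is_firing(sustained, threshold=0.9, for_samples=3)
alert_spiky = is_firing(spiky, threshold=0.9, for_samples=3)
```

Requiring a sustained breach trades a little detection latency for fewer false alarms.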