Talk by Tom Wilkie at KubeCon 2016.
Abstract:
The rise of “cloud native” applications (many targeting Kubernetes) has had knock-on effects on the way development teams operate and on the complexity of monitoring your application. An order-of-magnitude increase in the number of moving parts and in the rate of change of applications requires us to reassess traditional techniques.
In this talk we will discuss some of the challenges raised and different approaches to solving them. We’ll discuss:
- the necessity of a common set of monitoring paradigms across all your microservices, and how this can reduce cognitive load and improve time to recover from failure.
- the increased importance of automated, interactive visualization tools now that responsibility for system architecture has been devolved to individuals and teams.
- exciting new ideas in distributed tracing, and how these can help deploy tracing across an increasingly heterogeneous production environment.
This talk will be hands-on at the command line, using open source tools, and will include anecdotes from real-world experience.
Outline:
- "monitoring" - i.e. timeseries metrics, considerations, etc.
- "visualisation" - interactive visualisation of the architecture of running systems
- "tracing" - distributed tracing, its importance, and how to deploy it
10. USE Method* - for every resource, check:
• utilization,
• saturation, and
• errors

RED Method - for every service, check request:
• rate,
• error (rate), and
• duration (distributions)

* http://www.brendangregg.com/usemethod.html

An alternative view
11. Okay, but how?
var rpcDurations = prometheus.NewHistogramVec(prometheus.HistogramOpts{
	Name:    "rpc_durations_histogram_microseconds",
	Help:    "RPC latency distributions.",
	Buckets: prometheus.LinearBuckets(0, 100, 20),
}, []string{"method"})

func init() {
	prometheus.MustRegister(rpcDurations)
}

func handleRequest(w http.ResponseWriter, r *http.Request) {
	begin := time.Now()
	// ...
	rpcDurations.WithLabelValues(r.Method).Observe(
		float64(time.Since(begin)) / float64(time.Microsecond))
}
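The RED signals then fall out of queries against this histogram. A sketch in PromQL (metric names follow the code above; an error rate would additionally need a status-code label the snippet doesn't record, so only rate and duration are shown):

```promql
# Rate: requests per second, per method.
sum by (method) (rate(rpc_durations_histogram_microseconds_count[1m]))

# Duration: e.g. the 99th-percentile latency across all methods.
histogram_quantile(0.99,
  sum by (le) (rate(rpc_durations_histogram_microseconds_bucket[1m])))
```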
13. There must be a better way…
[diagram: incoming traffic from other services flowing through Kubeproxy to a set of Replicas]
28. Not a new topic
• Lots of literature
• Existing open source projects, e.g. Zipkin, originally from Twitter
29. • Challenge: detecting causality between incoming and outgoing requests
• Existing solutions require propagation of some unique ID (Dapper, Zipkin)
• This requires application-specific modifications
[diagram: some service, with an incoming request and outgoing requests whose causal link is unknown]
30. Can this be done without application modifications?
31. By intercepting system calls, build up a data structure of:
• which threads are reading from / writing to which FDs
• which FDs are talking to which IPs & ports

Use this to infer causality between incoming and outgoing connections.
[diagram: a service's system calls intercepted at the kernel boundary to infer the link]