End-to-end Monitoring
with the Prometheus Operator
By @mxinden
Max Inden
Test-Engineer at CoreOS
@mxinden
Max.Inden@CoreOS.com
Secure, simplify and automate container
infrastructure
Why Monitoring?
● Alerting
● Long-term trends
What is Prometheus?
● Open Source Monitoring
● Built by SoundCloud
● Inspired by Borgmon
● Pull-based
● Multi-Dimensional
● Metrics, not logging, not tracing
● No magic!
[Diagram: Prometheus scrapes the /metrics endpoint of each Target every 15s]
Target /metrics
# HELP http_requests_total Total number of HTTP requests made.
# TYPE http_requests_total counter
http_requests_total{code="200",path="/status"} 8
Metric name: http_requests_total
Labels: code="200", path="/status"
Value: 8
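As a rough illustration of how a sample line such as `http_requests_total{code="200",path="/status"} 8` breaks down into metric name, labels, and value, here is a minimal Python sketch. The function name `parse_sample` and both regular expressions are assumptions made for this example, not part of Prometheus; the sketch ignores optional timestamps and the `# HELP` / `# TYPE` comment lines.

```python
import re

# One sample line of the text exposition format: name{label="value",...} value
# Simplified assumption: no trailing timestamp, no comment lines.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (metric name, label dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample('http_requests_total{code="200",path="/status"} 8')
print(name, labels, value)
```

Each distinct combination of label values is its own time series, which is what the "Multi-Dimensional" bullet refers to.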
[Diagram: PromQL queries the data Prometheus has scraped from the targets]
Current percentage of HTTP errors across all service instances?
sum by(path) (rate(http_requests_total{status="500"}[5m]))
/ sum by(path) (rate(http_requests_total[5m]))
{path="/status"} 0.0039
{path="/"} 0.0011
{path="/api/v1/topics/:topic"} 0.087
{path="/api/v1/topics"} 0.0342
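The error-ratio query divides the per-path rate of 500 responses by the per-path rate of all responses. The Python sketch below mimics that arithmetic on hypothetical counter readings. It is a simplification: the real PromQL `rate()` also handles counter resets and extrapolation, and the sample numbers and the helper names `rate` and `error_ratio_by_path` are invented for illustration.

```python
from collections import defaultdict


def rate(first, last, seconds):
    """Per-second increase of a counter between two observations."""
    return (last - first) / seconds


def error_ratio_by_path(samples, window_seconds):
    """samples maps (path, status) to (counter at window start, counter at window end)."""
    errors = defaultdict(float)
    total = defaultdict(float)
    for (path, status), (first, last) in samples.items():
        per_second = rate(first, last, window_seconds)
        total[path] += per_second          # denominator: all requests for this path
        if status == "500":
            errors[path] += per_second     # numerator: only server errors
    return {path: errors[path] / total[path] for path in total if total[path] > 0}


# Hypothetical counter readings taken 5 minutes (300 s) apart.
samples = {
    ("/status", "200"): (1000.0, 1900.0),
    ("/status", "500"): (10.0, 110.0),
    ("/", "200"): (500.0, 800.0),
}
ratios = error_ratio_by_path(samples, 300)
print(ratios)
```

The result is a ratio between 0 and 1 per path; multiply by 100 to read it as a percentage.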
[Diagram: PromQL serves the Web UI and dashboards]
Alert Definition
ALERT DiskWillFillIn4Hours
  IF predict_linear(node_filesystem_free[1h], 4*3600) < 0
Is any disk about to run full within 4 hours?
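The alert condition fits a straight line through the last hour of free-space samples and asks whether that line drops below zero four hours ahead. A minimal Python sketch of the idea follows, using ordinary least squares. The function below is a stand-in written for this example, not the Prometheus implementation, and the sample data is hypothetical.

```python
def predict_linear(samples, seconds_ahead):
    """Fit a least-squares line through (timestamp, value) pairs and
    evaluate it seconds_ahead after the last timestamp."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("all samples share one timestamp")
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = covariance / variance
    intercept = mean_v - slope * mean_t
    return slope * (samples[-1][0] + seconds_ahead) + intercept


# Hypothetical data: free bytes sampled every 10 minutes over one hour,
# starting at 10 GB and shrinking by 1 GB per sample.
samples = [(i * 600, 10e9 - i * 1e9) for i in range(7)]
predicted = predict_linear(samples, 4 * 3600)
will_fill = predicted < 0
print(predicted, will_fill)
```

With this data the disk loses 1 GB every 10 minutes, so the extrapolated value four hours out is far below zero and the alert condition holds.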
[Diagram: Prometheus evaluates the alert definitions every 1m and sends alerts to the Alertmanager]
Alertmanager
● Deduplicates
● Groups
● Routes (Team A, Team B, Team C)
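The three Alertmanager steps (deduplicate, group, route) can be sketched as a small pipeline over alerts represented as label dictionaries. This is a toy model under stated assumptions: the function names, the label names `team` and `instance`, the alert name `HighErrorRate`, and the routing table are all hypothetical, and the real Alertmanager is configured through its own routing tree rather than Python.

```python
from collections import defaultdict


def deduplicate(alerts):
    """Keep a single alert per identical label set."""
    unique = {}
    for alert in alerts:
        unique.setdefault(frozenset(alert.items()), alert)
    return list(unique.values())


def group(alerts, by):
    """Bundle alerts that share the same values for the `by` labels."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[tuple(alert.get(label, "") for label in by)].append(alert)
    return dict(groups)


def route(labels, routes, default_receiver):
    """Return the first receiver whose matchers all equal the alert's labels."""
    for matchers, receiver in routes:
        if all(labels.get(k) == v for k, v in matchers.items()):
            return receiver
    return default_receiver


def notifications(alerts, by, routes, default_receiver):
    """Run all three steps and collect the alert groups per receiver."""
    result = defaultdict(list)
    for members in group(deduplicate(alerts), by).values():
        result[route(members[0], routes, default_receiver)].append(members)
    return dict(result)


alerts = [
    {"alertname": "DiskWillFillIn4Hours", "team": "a", "instance": "node-1"},
    {"alertname": "DiskWillFillIn4Hours", "team": "a", "instance": "node-1"},  # duplicate
    {"alertname": "DiskWillFillIn4Hours", "team": "a", "instance": "node-2"},
    {"alertname": "HighErrorRate", "team": "b", "instance": "api-1"},
    {"alertname": "HighErrorRate", "instance": "api-2"},  # no team label
]
routes = [({"team": "a"}, "Team A"), ({"team": "b"}, "Team B")]
result = notifications(alerts, ("alertname", "team"), routes, "Team C")
for receiver, groups in sorted(result.items()):
    print(receiver, [len(g) for g in groups])
```

Grouping means Team A receives one notification covering both affected nodes instead of one per node.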
[Diagram: Prometheus scrapes the /metrics endpoint of each target and sends alerts to the Alertmanager]
Monitoring
● Application Monitoring
● Cluster Monitoring
Cluster Monitoring
What is Kubernetes?
Platform for running
containerized applications
Announced 2014 by Google
Influenced by Borg & Omega
v1.0 in July 2015
Kubernetes joins the CNCF
Master
API-Server
etcd
Controller-Manager
Scheduler
Kube-DNS
...
Worker
Kubelet
Kube-Proxy
...
Application Monitoring
[Diagram: Location, User and AppX instances grouped behind Services. Prometheus learns which targets to scrape from the K8s-API-Server.]
Service Discovery
● Static target list
● DNS discovery
● Kubernetes discovery
● ...
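To make the Kubernetes discovery option a little more concrete, the sketch below turns a list of pods into scrape targets by matching labels. It assumes a hypothetical in-memory pod list in place of a real call to the Kubernetes API server; the field names `labels` and `ip`, the port, and the helper `discover_targets` are illustrative only.

```python
def discover_targets(pods, selector, port, path="/metrics"):
    """Return scrape URLs for every pod whose labels match the selector."""
    targets = []
    for pod in pods:
        if all(pod["labels"].get(k) == v for k, v in selector.items()):
            targets.append(f"http://{pod['ip']}:{port}{path}")
    return sorted(targets)


# Hypothetical pods, shaped like a trimmed-down API server response.
pods = [
    {"labels": {"app": "user"}, "ip": "10.0.0.4"},
    {"labels": {"app": "user"}, "ip": "10.0.0.5"},
    {"labels": {"app": "location"}, "ip": "10.0.0.9"},
]
targets = discover_targets(pods, {"app": "user"}, 8080)
print(targets)
```

Because the pod list is re-read rather than hard-coded, targets that appear or disappear are picked up without editing a static target list.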
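[Diagram: Cluster-Monitoring covers the Master (API-Server, etcd, Controller-Manager, Scheduler, Kube-DNS, ...) and Worker (Kubelet, Kube-Proxy, ...) components. Application-Monitoring covers the Location, User and AppX instances and their Services, which Prometheus finds via the K8s-API-Server.]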
Problem
Prometheus is stateful and difficult to
configure!
Introducing the
Prometheus Operator
What is a K8s Operator?
Application specific
operational knowledge
</>
Operator
Prometheus Operator
● Kubernetes native configuration
● Automated management and upgrades
of Prometheus & Alertmanager
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-app
spec:
  ...
apiVersion: monitoring.coreos.com/v1alpha1
kind: Prometheus
metadata:
  name: prometheus-k8s
spec:
  ...
Kube-Prometheus
Single command to install:
● Prometheus & Alertmanager Cluster
● Alerting rules
● Dashboarding
Demo
Recap
What is Prometheus?
● Pull-based
● Multi-Dimensional
● Metrics, not logging, not tracing
● No magic!
[Diagram: Prometheus scrapes the /metrics endpoint of each target every 15s]
[Diagram: Prometheus evaluates the alert definitions every 1m and sends alerts to the Alertmanager]
Prometheus-Operator & Kube-Prometheus
</>
Operator
Where to go from here?
Prometheus.io
/coreos/prometheus-operator
San Francisco, New York & Berlin
We are hiring!
Max Inden
Test-Engineer at CoreOS
@mxinden
Max.Inden@CoreOS.com
Presentation by Max Inden of CoreOS at Paris Container Day

End-to-end Monitoring with the Prometheus Operator - Max Inden

  1. End-to-end Monitoring with the Prometheus Operator By @mxinden
  2. Max Inden Test-Engineer at CoreOS @mxinden Max.Inden@CoreOS.com
  3. Secure, simplify and automate container infrastructure
  4. Secure, simplify and automate container infrastructure
  5. Secure, simplify and automate container infrastructure
  6. Secure, simplify and automate container infrastructure
  7. Why Monitoring?
  8. Why Monitoring? Alerting
  9. Why Monitoring? Long-term trends, Alerting
  10. What is Prometheus? ● Open Source Monitoring ● Built by SoundCloud ● Inspired by Borgmon
  11. What is Prometheus? ● Pull-based
  12. What is Prometheus? ● Pull-based ● Multi-Dimensional
  13. What is Prometheus? ● Pull-based ● Multi-Dimensional ● Metrics, not logging, not tracing
  14. What is Prometheus? ● Pull-based ● Multi-Dimensional ● Metrics, not logging, not tracing ● No magic!
  15. Target Target Target
  16. Target /metrics Target /metrics Target /metrics
  17. Prometheus Target /metrics Target /metrics Target /metrics
  18. Prometheus Target /metrics Target /metrics Target /metrics 15s
  19. Target /metrics # HELP http_requests_total Total number of HTTP requests made. # TYPE http_requests_total counter http_requests_total{code="200",path="/status"} 8
  20. Target /metrics # HELP http_requests_total Total number of HTTP requests made. # TYPE http_requests_total counter http_requests_total{code="200",path="/status"} 8 Metric name
  21. Target /metrics # HELP http_requests_total Total number of HTTP requests made. # TYPE http_requests_total counter http_requests_total{code="200",path="/status"} 8 Label
  22. Target /metrics # HELP http_requests_total Total number of HTTP requests made. # TYPE http_requests_total counter http_requests_total{code="200",path="/status"} 8 Value
  23. Prometheus Target /metrics Target /metrics Target /metrics
  24. Prometheus Target /metrics Target /metrics Target /metrics PromQL
  25. Current percentage of HTTP errors across all service instances?
  26. Current percentage of HTTP errors across all service instances? sum by(path) (rate(http_requests_total{status="500"}[5m])) / sum by(path) (rate(http_requests_total[5m]))
  27. Current percentage of HTTP errors across all service instances? {path="/status"} 0.0039 {path="/"} 0.0011 {path="/api/v1/topics/:topic"} 0.087 {path="/api/v1/topics"} 0.0342 sum by(path) (rate(http_requests_total{status="500"}[5m])) / sum by(path) (rate(http_requests_total[5m]))
  28. Prometheus Target /metrics Target /metrics Target /metrics PromQL
  29. Prometheus Target /metrics Target /metrics Target /metrics PromQL Web UI Dashboard
  30. Prometheus Target /metrics Target /metrics Target /metrics
  31. Prometheus Target /metrics Target /metrics Target /metrics Alert Definition
  32. ALERT DiskWillFillIn4Hours IF predict_linear(node_filesystem_free[1h], 4*3600) < 0 Is any disk about to run full within 4 hours? 0 now-1h +4h
  33. Prometheus Target /metrics Target /metrics Target /metrics Alert Definition
  34. Prometheus Target /metrics Target /metrics Target /metrics Alert Definition 1m
  35. Prometheus Target /metrics Target /metrics Target /metrics Alert Definition 1m
  36. Prometheus Target /metrics Target /metrics Target /metrics Alert Definition Alertmanager 1m
  37. Alertmanager Deduplicates Alert Alert Alert Alert Alert Alert Alert
  38. Alertmanager Deduplicates Alert Alert Alert Alert Alert Alert Alert Groups Alert Alert Alert Alert Alert Alert
  39. Alertmanager Deduplicates Alert Alert Alert Alert Alert Alert Alert Groups Alert Alert Alert Alert Alert Alert Routes Alert Alert Alert Alert Alert Team A Team B Team C
  40. Alertmanager Deduplicates Alert Alert Alert Alert Alert Alert Alert Groups Alert Alert Alert Alert Alert Alert Routes Alert Alert Alert Alert Alert Team A Team B Team C
  41. Prometheus Target /metrics Target /metrics Target /metrics Alertmanager
  42. Prometheus Target /metrics Target /metrics Target /metrics Alertmanager
  43. Monitoring
  44. Application Cluster Monitoring
  45. Cluster Monitoring
  46. What is Kubernetes? Platform for running containerized applications
  47. What is Kubernetes? Announced 2014 by Google Influenced by Borg & Omega v1.0 in July 2015 Kubernetes joins the CNCF
  48. Master
  49. Master API-Server etcd Controller-Manager Scheduler Kube-DNS ...
  50. Master API-Server etcd Controller-Manager Scheduler Kube-DNS ... Worker
  51. Master API-Server etcd Controller-Manager Scheduler Kube-DNS ... Worker Kubelet Kube-Proxy ...
  52. Application Monitoring
  53. Location User AppX
  54. Location User AppX User AppX Location
  55. Location User AppX User AppX Location Service Service Service
  56. Location User AppX User AppX Location Service Service Service Prometheus
  57. Location User AppX User AppX Location Service Service Service Prometheus ?
  58. K8s-API-Server Location User AppX User AppX Location Service Service Service Prometheus
  59. Location User AppX User AppX Location Service Service Service Prometheus K8s-API-Server
  60. Service Discovery ● Static target list ● DNS discovery ● Kubernetes discovery ● ...
  61. Master API-Server etcd Controller-Manager Scheduler Kube-DNS ... Worker Kubelet Kube-Proxy ... Location User AppX User AppX Location Service Service Service Prometheus K8s-API-Server Application-Monitoring Cluster-Monitoring
  62. Problem Prometheus is stateful and difficult to configure!
  63. Introducing the Prometheus Operator
  64. What is a K8s Operator?
  65. What is a K8s Operator? Application specific operational knowledge
  66. What is a K8s Operator?
  67. What is a K8s Operator? </>
  68. What is a K8s Operator? </>
  69. What is a K8s Operator? </> Operator
  70. Prometheus Operator ● Kubernetes native configuration ● Automated management and upgrades of Prometheus & Alertmanager
  71. apiVersion: extensions/v1beta1 kind: Deployment metadata: name: my-app spec: ...
  72. apiVersion: monitoring.coreos.com/v1alpha1 kind: Prometheus metadata: name: prometheus-k8s spec: ...
  73. Kube-Prometheus Single command to install: ● Prometheus & Alertmanager Cluster ● Alerting rules ● Dashboarding
  74. Demo
  75. Recap
  76. What is Prometheus? ● Pull-based ● Multi-Dimensional ● Metrics, not logging, not tracing ● No magic!
  77. Prometheus Target /metrics Target /metrics Target /metrics 15s
  78. Prometheus Target /metrics Target /metrics Target /metrics Alert Definition Alertmanager 1m
  79. Prometheus-Operator & Kube-Prometheus </> Operator
  80. Where to go from here? Prometheus.io /coreos/prometheus-operator
  81. San Francisco, New York & Berlin We are hiring!
  82. Max Inden Test-Engineer at CoreOS @mxinden Max.Inden@CoreOS.com
