Automating the Configuration of
Monitoring on Large Infrastructures
How monitoring of dynamic infrastructures at scale can be made easier with
Uyuni, Prometheus and Grafana
João Cavalheiro, Engineering Manager – jcavalheiro@suse.com
Johannes Renner, Engineering Manager – jrenner@suse.com
Managing IT Infrastructures is hard
● In most companies, the IT landscape is diverse and complex
● ...And nearly impossible to manage beyond a certain scale without
automation
● Modern application stacks are multi-modal: VMs and containers
spread across private and public clouds
● Different operating systems have different requirements
● Many companies require reporting and compliance
● Security is a concern
Enter Uyuni
Uyuni is an open-source solution for managing Linux infrastructure
● Can save you time and headaches when you have to manage and
update tens, hundreds or even thousands of machines
● Mass-deploy patches and packages based on software channels
● Consistent and repeatable provisioning and configuration of bare
metal, VMs and containers
● Automates configuration of monitoring with Prometheus and
Grafana
Origins: Spacewalk
● Open-source systems management solution
● Upstream for Red Hat Satellite 5, around since 2008
● Supported management of Fedora, CentOS and Debian systems
● Adopted by SUSE as upstream for SUSE Manager
● Satellite 6 was built on different technologies:
∙ Spacewalk entered maintenance mode
∙ Only bugfixes, no plans for the future
∙ Many patches pending to implement modernizations!
Uyuni
/uju:ˈni/
“Salar de Uyuni” is the world's largest salt flat*
Image: https://www.flickr.com/photos/madeleine_h/9468953452/
Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0)
* https://en.wikipedia.org/wiki/Salar_de_Uyuni
What is Salt?
● Open-source software for remote task execution and declarative
configuration management
● Works on almost any platform - only Python is needed
● Typically requires an agent (minion) that connects to a master
● ZeroMQ used as default transport
● Event-driven architecture supporting automation
● Scalable, extensible and customizable
Salt Concepts
Uyuni: An Opinionated Fork of Spacewalk
● New backend based on Salt
● Modernized codebase (React.js, Python 3, JDK11)
● Content lifecycle management
● Container image building and Kubernetes integration
● Improved virtualization management
● Monitoring automation based on Prometheus & Grafana
Monitoring 101
Getting started with metrics
Main data source for alerting and visualization:
● Starting point for troubleshooting
∙ "Something looks wrong on this dashboard"
∙ Used as Service Level Indicators
● How available are we to the outside world?
∙ What are our customers experiencing?
Good metrics help you eliminate hypotheses before you have to investigate them.
About Prometheus
● Originally built at SoundCloud
● Has its own time-series database
● Data collection via pull model over HTTP
● Targets are set via static configuration or service discovery
● Metrics have a name, a set of labels, a timestamp and a value
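The data model in the last bullet can be sketched in a few lines of Python. This is only an illustration: the Sample class and the render helper are hypothetical names that do not come from Prometheus, and the output is assumed to follow the Prometheus text exposition format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Sample:
    """One data point: metric name, value, label set, optional timestamp (ms)."""
    name: str
    value: float
    labels: Dict[str, str] = field(default_factory=dict)
    timestamp_ms: Optional[int] = None


def render(sample: Sample) -> str:
    """Render a sample as a single line of the text exposition format."""
    label_str = ""
    if sample.labels:
        # Sort labels so the output is stable regardless of insertion order.
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(sample.labels.items()))
        label_str = "{" + pairs + "}"
    line = f"{sample.name}{label_str} {sample.value}"
    if sample.timestamp_ms is not None:
        line += f" {sample.timestamp_ms}"
    return line
```

For example, render(Sample("up", 1.0, {"instance": "web-server-1"})) yields the line up{instance="web-server-1"} 1.0. Each distinct combination of name and labels identifies its own time series.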
Exposing Metrics
● Each application/system we want to monitor must expose metrics
● Instrumentation vs. exporters
∙ Instrumentation: the metrics endpoint is embedded in the application itself
∙ Exporter: a separate process that exposes metrics on behalf of an existing
system
● Extensive list of Prometheus exporters
∙ https://prometheus.io/docs/instrumenting/exporters/
∙ Node exporter is one of the most widely used
● Easy to build your own exporters
∙ You can monitor almost anything
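To give an idea of how little an exporter needs, here is a minimal sketch that uses only the Python standard library. The metric names (demo_uptime_seconds, demo_scrapes_total), the serve helper and the port are assumptions made up for this example; a real exporter would more likely be built on an official Prometheus client library.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics(uptime_seconds: float, scrapes: int) -> str:
    """Return two hypothetical metrics in the text exposition format."""
    lines = [
        "# HELP demo_uptime_seconds Seconds since the exporter started.",
        "# TYPE demo_uptime_seconds gauge",
        f"demo_uptime_seconds {uptime_seconds:.3f}",
        "# HELP demo_scrapes_total Number of scrapes served.",
        "# TYPE demo_scrapes_total counter",
        f"demo_scrapes_total {scrapes}",
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    scrapes = 0  # shared counter, incremented on every scrape

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        MetricsHandler.scrapes += 1
        body = render_metrics(
            time.time() - START_TIME, MetricsHandler.scrapes
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9999) -> None:
    """Serve /metrics until interrupted. The port is an arbitrary choice."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; Prometheus would then be pointed at that host and port as a scrape target.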
Querying Metrics
● Prometheus has its own query language - PromQL
∙ PromQL is a functional expression language
∙ Makes it easy to filter multidimensional time series
● Example: HTTP internal server errors per second, as of an hour ago
∙ rate(api_http_requests_total{status="500"}[5m] offset 1h)
● Regex matching
∙ up{instance=~"web-server-.*"} == 0
● Used in all interactions with Prometheus (visualization, alerts)
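PromQL expressions like the ones above can also be sent from a script through the Prometheus HTTP API. The following sketch assumes a server reachable at a base URL such as http://localhost:9090 (an assumption for this example) and uses the instant query endpoint /api/v1/query; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL of an instant query, URL-encoding the PromQL expression."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def instant_query(base_url: str, promql: str, timeout: float = 5.0) -> dict:
    """Run the query and return the decoded JSON response."""
    with urlopen(instant_query_url(base_url, promql), timeout=timeout) as resp:
        return json.load(resp)
```

A call such as instant_query("http://localhost:9090", 'up{instance=~"web-server-.*"} == 0') would return the matching series, provided a Prometheus server is actually listening at that address.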
Alerts
● Prometheus has its own alerting system – Alertmanager
∙ Takes care of deduplication, grouping, and routing
● Alerting rules are written in PromQL
● Supports HA setups
● Integration with email, PagerDuty and OpsGenie
● HTTP API and CLI tool: amtool
∙ Can be “plugged” into your existing scripts
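As an illustration of plugging Alertmanager into an existing script, the sketch below builds an alert and posts it to the Alertmanager HTTP API. It assumes the v2 endpoint /api/v2/alerts and an Alertmanager URL supplied by the caller; the function names and the example labels are hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List, Optional
from urllib.request import Request, urlopen


def build_alert(
    labels: Dict[str, str],
    annotations: Dict[str, str],
    starts_at: Optional[datetime] = None,
) -> dict:
    """Build one alert object; labels identify the alert, annotations describe it."""
    starts_at = starts_at or datetime.now(timezone.utc)
    return {
        "labels": labels,
        "annotations": annotations,
        "startsAt": starts_at.isoformat(),  # RFC 3339 timestamp
    }


def post_alerts(alertmanager_url: str, alerts: List[dict], timeout: float = 5.0) -> int:
    """POST a list of alerts and return the HTTP status code."""
    req = Request(
        alertmanager_url.rstrip("/") + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status
```

A backup script could, for instance, call post_alerts with build_alert({"alertname": "BackupFailed", "severity": "warning"}, {"summary": "Nightly backup failed"}) and leave deduplication, grouping and routing to Alertmanager.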
Grafana
● Used to query and visualize metrics
● Works with Prometheus, but not only with Prometheus
∙ Grafana supports multiple backends
∙ It is possible to combine data from different sources in the same
dashboard
● Fully customizable
∙ Each panel has a wide variety of styling and formatting options
∙ Supports templates
∙ Collection of add-ons and pre-built dashboards
How to Get Started?
● Which components do I need to install?
● How to configure Prometheus and Grafana?
● How to configure my systems to expose their metrics?
● How do I get started with building dashboards?
Monitoring at Scale
Data centers commonly run thousands of machines or more
● Different system types (physical, VMs, containers)
● Different operating systems
● A lot of different metrics from different sources
● What can be automated?
It’s not practical to manually maintain configuration files for all this
diversity!
Putting the Pieces Together
Uyuni Meets Monitoring
Automate Prometheus Monitoring with Uyuni
Uyuni Meets Monitoring
Single Pane of Glass for Monitoring Configuration
● Provisioning and configuration of Prometheus and Grafana
● Pre-built Grafana dashboards
● Enable exporters on managed clients using Salt Formulas
● Group systems to create common configurations
● Prometheus service discovery
● Reproducible setups
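The slides do not show how Uyuni feeds its system inventory into Prometheus, so the following is only a sketch of the general idea, using Prometheus' file-based service discovery (file_sd_configs) as a stand-in. The build_file_sd helper, the "group" label and the example host names are assumptions; port 9100 is the node exporter default.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_file_sd(systems: Iterable[Tuple[str, str]], port: int = 9100) -> List[Dict]:
    """Turn (hostname, group) pairs into file-based service discovery entries."""
    by_group = defaultdict(list)
    for hostname, group in systems:
        by_group[group].append(f"{hostname}:{port}")
    # One target group per system group, sorted for reproducible output.
    return [
        {"targets": sorted(targets), "labels": {"group": group}}
        for group, targets in sorted(by_group.items())
    ]


def write_file_sd(path: str, systems: Iterable[Tuple[str, str]]) -> None:
    """Write the target list as JSON for Prometheus to pick up."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_file_sd(systems), fh, indent=2)
```

Regenerating such a file whenever systems are added, removed or regrouped keeps the scrape targets in sync with the inventory without hand-editing the Prometheus configuration.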
Live Demo
Coming next
● Support for Prometheus federation
● Improve the existing automation (e.g. more exporters), including:
∙ cAdvisor for Docker containers
∙ libvirt exporter for KVM hypervisors
∙ Kubernetes
∙ blackbox exporter
● Alerting templates
● Authentication and TLS encryption
● Automated firewall configuration
Questions?
https://www.uyuni-project.org/
github.com/uyuni-project
@UyuniProject
uyuni-announce+subscribe@opensuse.org
#uyuni @ irc.freenode.org
Thank you!

OSMC 2019 | Automating the Configuration of Monitoring on Large Infrastructures by João Cavalheiro
