Efficient IT operations using monitoring systems and standardized tools - Icinga Camp Zurich 2019
KMG Group GmbH, http://www.kmggroup.ch
Magnus Lübeck, Zürich, 2019-11-12
http://kmg.group
Icinga Day Zürich 2019
2
Sysadmin since the 90’s
Unix/Oracle at Volvo
Pre sales, Sun Microsystems reseller
Oracle DBA at CERN
IT Operations manager at Accarda
IT Operations manager at Kanton LU
Owner of KMG Group GmbH
Built infrastructure and operations at
Swisscom
peaq
Serafe
This is me
3
Where I work
4
What we do
5
Quick overview
People, tools and processes
The four fielder
Telemetry and health
Desire lines
OSS and Free software in modern operations environments
Tool landscape
Icinga’s part in the Meccano
Outline
6
The stack – game of Tetris
7
dennisadams.net
Metrics
Operational tools
Processes
Standards
MOPS
8
Telemetry is part of good systems design
Measurement points should be a mandatory part of EVERY system
This has been known for many years, across many industries
Metrics - /status, /health
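As an illustration of a measurement point built into a system, here is a minimal sketch of a /health and /status endpoint using only the Python standard library. The payload fields, the port and the function names are assumptions made for this example; the slides do not prescribe a format.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def health_payload():
    """Build the JSON document served at /health and /status (fields are illustrative)."""
    return {"status": "ok", "uptime_seconds": round(time.time() - START_TIME, 1)}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer only the two telemetry paths; everything else is a 404.
        if self.path in ("/health", "/status"):
            body = json.dumps(health_payload()).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def serve(port=8080):
    """Serve the endpoint on localhost until interrupted (port is an assumption)."""
    HTTPServer(("127.0.0.1", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the listener; a monitoring system can then poll the path and alert on a non-200 answer or on the reported status.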
9
The use of waveforms to diagnose broken things is far from new.
The triangular form is particularly useful.
Can be used in many ways
Very useful for repetitive patterns.
Metrics - /status, /health
10
Metrics - /status, /health
11
Metrics - /status, /health
12
A fool with a tool is still a fool.
Get smart people
Use tools
Integrate the tools with your environment.
Tools can cost money
But they do not have to
Operational tools
13
Implement simple processes
Use the right tools, and don’t make the processes complicated.
Processes - desire lines
14
Morning check OK
Processes – desire lines
15
Naming conventions
No servers named after porn stars
Baseline installations
Mini OS install
Automation / Infrastructure as code
Ansible, Chef, Puppet
Coding guidelines
Standards
16
Inception on so many levels
Deals with ”less than 24/7” SLAs
You can service check your SLA
Shameless plug – SLA check
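To make the "less than 24/7" idea concrete, here is a small sketch of how a check could decide whether a problem falls inside the agreed service hours. It is not the SLA check plugin mentioned on the slide; the service window (Monday to Friday, 07:00 to 19:00) and all names are hypothetical.

```python
from datetime import datetime, time

# Hypothetical service window: Monday to Friday, 07:00 to 19:00.
SERVICE_DAYS = {0, 1, 2, 3, 4}  # datetime.weekday(): Monday is 0
SERVICE_START = time(7, 0)
SERVICE_END = time(19, 0)


def in_service_hours(moment):
    """Return True if the given datetime lies inside the service window."""
    return (
        moment.weekday() in SERVICE_DAYS
        and SERVICE_START <= moment.time() < SERVICE_END
    )


def sla_relevant(state_is_problem, moment):
    """A problem counts against the SLA only inside the service window."""
    return state_is_problem and in_service_hours(moment)
```

With this window, an outage on Tuesday 2019-11-12 at 10:00 counts against the SLA, while the same outage at 22:00 or on a Saturday does not.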
17
The four fielder
Technical Monitoring
Telemetry
Operational tools
Inventories
Configuration Management
Admin GUI
Orchestration
IAM
Ticketing
Dashboarding
Documentation
Remote Access
Code repository
Artifact Repository
Application Specific Tools (four boxes)
SLA Monitoring
Audience
Spectrum
18
Remote access
Systems monitoring
Documentation
Identity management
Ticketing
Inventory (not CMDB)
Automation/Orchestration
Telemetry (Technical performance monitoring)
Dashboarding
Technical tools (sysadmin toolbox)
SLA monitoring
Tool landscape
19
A customer of mine had
8’500 Open Critical Alerts
15’300 Warnings
Typical “cry wolf” scenario
3 possible/allowed Actions
Solve the problem
Change the threshold (change the metric, template, standard)
Remove the alert
Monitoring theory:
Bad design reduces the value of your monitoring
20
Move the responsibility of delivering telemetry to the application designers and the application owners
Help them learn how to write service checks
A service delivery is not complete unless telemetry and monitoring packages are delivered
Application service check responsibility
devOps or stoneAgeOps?
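For application teams learning to write service checks, the following sketch shows the general shape of a check plugin as Icinga and other Nagios-compatible systems expect it: one line of output, optional performance data after a pipe character, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The metric `queue_length`, its thresholds and the function names are made up for illustration.

```python
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a plugin state (higher values are worse)."""
    if value is None:
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(name, value, warn, crit):
    """Return (exit code, output line) for one measurement."""
    state = evaluate(value, warn, crit)
    if state == UNKNOWN:
        return state, f"UNKNOWN - {name} could not be read"
    # Status text, then "|" and performance data as label=value;warn;crit
    line = f"{LABELS[state]} - {name} is {value} | {name}={value};{warn};{crit}"
    return state, line


def main():
    queue_length = 42  # placeholder; a real check would query the application
    state, line = format_output("queue_length", queue_length, warn=100, crit=500)
    print(line)
    sys.exit(state)
```

Running `main()` prints the status line and exits with the matching code, which is all the monitoring system needs to schedule the check and notify on it.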
21
Question from an auditor (ISO-27001 audit)
How do you ensure that all applications work after a patch run?
My answer:
We don’t
The big audit monster
22
Monitoring – Icinga
Service checks
23
Audience
24
One-stop shop: Icinga
Service checks
Applications
Scheduled tasks
Notifications
Signage
Raspberry Pi 4 with 2 screens
Dashboard
Smashing
Telemetry and logging
25
Backup slides
26
Manually edit config – use it when you learn Icinga
Good ways to do it
Automate an Icinga-centric configuration repository – Director
Icinga API – write the integration yourself
Automation via Ansible
Metamonitoring
By using your inventory, you know what you are monitoring
And, what you are not monitoring
Icinga client and service registration
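The metamonitoring idea above can be sketched as a comparison of two host lists: the inventory and the set of hosts known to the monitoring system. How each list is obtained is left open here; the function name and the host names in the test data are hypothetical.

```python
def coverage_gaps(inventory_hosts, monitored_hosts):
    """Compare the inventory with the monitoring system.

    Returns hosts that exist but are not monitored, and hosts that are
    monitored but missing from the inventory.
    """
    inventory = set(inventory_hosts)
    monitored = set(monitored_hosts)
    return {
        "not_monitored": sorted(inventory - monitored),
        "not_in_inventory": sorted(monitored - inventory),
    }
```

Either list being non-empty is itself worth a service check: the first shows blind spots, the second shows stale or undocumented systems.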
27
The layer cake is your monitoring standard grouped by common denominators.
Group service checks in layers (e.g. L0 – L5)
L0 – OS Level - (Linux admin)
CPU, disk usage, ssh, ping, fs usage {/, /var, /home}
L1 – Server type – shared OS resources (Linux Admin)
iops on db fs, fs usage on /app/ora
…
L5 – Application checks – (Application Managers)
Application specific checks
The Layered Cake
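One possible way to hold the layers as data is a plain mapping from layer name to scope, owner and checks. This is a sketch only: the check names are shortened from the slide, the layers the slide elides (L2 to L4) are left out rather than guessed, and the helper names are assumptions.

```python
# Layers as listed on the slide; L2 to L4 are not given there and are omitted.
LAYERS = {
    "L0": {
        "scope": "OS level",
        "owner": "Linux admin",
        "checks": ["cpu", "disk usage", "ssh", "ping",
                   "fs usage /", "fs usage /var", "fs usage /home"],
    },
    "L1": {
        "scope": "Server type, shared OS resources",
        "owner": "Linux admin",
        "checks": ["iops on db fs", "fs usage /app/ora"],
    },
    "L5": {
        "scope": "Application checks",
        "owner": "Application managers",
        "checks": ["application specific checks"],
    },
}


def checks_for(layer_names):
    """Collect the checks of the given layers, in the order requested."""
    result = []
    for name in layer_names:
        result.extend(LAYERS[name]["checks"])
    return result


def owner_of(check):
    """Return the owner of the layer that contains the check, or None."""
    for layer in LAYERS.values():
        if check in layer["checks"]:
            return layer["owner"]
    return None
```

A database server would then receive `checks_for(["L0", "L1"])`, and a failing check can be routed to the group returned by `owner_of`.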
28
The human brain is excellent at identifying harmonies and regularities.
Ingredient number 2: Sawtooth waveform

Contenu connexe

Tendances

Combinación de logs, métricas y rastreos para observabilidad unificada
Combinación de logs, métricas y rastreos para observabilidad unificadaCombinación de logs, métricas y rastreos para observabilidad unificada
Combinación de logs, métricas y rastreos para observabilidad unificadaElasticsearch
 
Paul Dix [InfluxData] | InfluxDays Keynote: Future of InfluxDB | InfluxDays N...
Paul Dix [InfluxData] | InfluxDays Keynote: Future of InfluxDB | InfluxDays N...Paul Dix [InfluxData] | InfluxDays Keynote: Future of InfluxDB | InfluxDays N...
Paul Dix [InfluxData] | InfluxDays Keynote: Future of InfluxDB | InfluxDays N...InfluxData
 
Turning Cloud Metrics into Results
Turning Cloud Metrics into ResultsTurning Cloud Metrics into Results
Turning Cloud Metrics into ResultsInfluxData
 
Building an event system on top MongoDB
Building an event system on top MongoDBBuilding an event system on top MongoDB
Building an event system on top MongoDBBigPanda
 
NetApp keynote for Openstack Silicon Valley 2015
NetApp keynote for Openstack Silicon Valley 2015NetApp keynote for Openstack Silicon Valley 2015
NetApp keynote for Openstack Silicon Valley 2015Val Bercovici
 
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)Rick Hwang
 
Top Considerations For Operating a Kubernetes Environment at Scale
Top Considerations For Operating a Kubernetes Environment at ScaleTop Considerations For Operating a Kubernetes Environment at Scale
Top Considerations For Operating a Kubernetes Environment at ScaleSignalFx
 
DevDay 2018: Martin Schurz - Aufbau einer Monitoringlösung für moderne Applik...
DevDay 2018: Martin Schurz - Aufbau einer Monitoringlösung für moderne Applik...DevDay 2018: Martin Schurz - Aufbau einer Monitoringlösung für moderne Applik...
DevDay 2018: Martin Schurz - Aufbau einer Monitoringlösung für moderne Applik...DevDay Dresden
 
Yannis Zarkadas. Enterprise data science workflows on kubeflow
Yannis Zarkadas. Enterprise data science workflows on kubeflowYannis Zarkadas. Enterprise data science workflows on kubeflow
Yannis Zarkadas. Enterprise data science workflows on kubeflowMarynaHoldaieva
 
APIdays Paris 2018 - Cloud computing - we went through every steps of the Gar...
APIdays Paris 2018 - Cloud computing - we went through every steps of the Gar...APIdays Paris 2018 - Cloud computing - we went through every steps of the Gar...
APIdays Paris 2018 - Cloud computing - we went through every steps of the Gar...apidays
 
PAM3: Machine Learning in the Railway Industry ( Predix Transform 2016)
PAM3: Machine Learning in the Railway Industry ( Predix Transform 2016)PAM3: Machine Learning in the Railway Industry ( Predix Transform 2016)
PAM3: Machine Learning in the Railway Industry ( Predix Transform 2016)Predix
 
Build A Better Way to Deliver IT
Build A Better Way to Deliver ITBuild A Better Way to Deliver IT
Build A Better Way to Deliver ITRackspace
 
10 Steps to Cloud Happiness
10 Steps to Cloud Happiness10 Steps to Cloud Happiness
10 Steps to Cloud HappinessAll Things Open
 
R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming AnalyticsR, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming AnalyticsKai Wähner
 
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...DevOpsDays Tel Aviv
 
3 reasons to pick a time series platform for monitoring dev ops driven contai...
3 reasons to pick a time series platform for monitoring dev ops driven contai...3 reasons to pick a time series platform for monitoring dev ops driven contai...
3 reasons to pick a time series platform for monitoring dev ops driven contai...DevOps.com
 
Supersonic, Subatomic, Kubernetes Native Java : Microservices Day Dallas
Supersonic, Subatomic, Kubernetes Native Java : Microservices Day DallasSupersonic, Subatomic, Kubernetes Native Java : Microservices Day Dallas
Supersonic, Subatomic, Kubernetes Native Java : Microservices Day DallasJeremy Davis
 

Tendances (20)

Combinación de logs, métricas y rastreos para observabilidad unificada
Combinación de logs, métricas y rastreos para observabilidad unificadaCombinación de logs, métricas y rastreos para observabilidad unificada
Combinación de logs, métricas y rastreos para observabilidad unificada
 
Paul Dix [InfluxData] | InfluxDays Keynote: Future of InfluxDB | InfluxDays N...
Paul Dix [InfluxData] | InfluxDays Keynote: Future of InfluxDB | InfluxDays N...Paul Dix [InfluxData] | InfluxDays Keynote: Future of InfluxDB | InfluxDays N...
Paul Dix [InfluxData] | InfluxDays Keynote: Future of InfluxDB | InfluxDays N...
 
Turning Cloud Metrics into Results
Turning Cloud Metrics into ResultsTurning Cloud Metrics into Results
Turning Cloud Metrics into Results
 
Building an event system on top MongoDB
Building an event system on top MongoDBBuilding an event system on top MongoDB
Building an event system on top MongoDB
 
NetApp keynote for Openstack Silicon Valley 2015
NetApp keynote for Openstack Silicon Valley 2015NetApp keynote for Openstack Silicon Valley 2015
NetApp keynote for Openstack Silicon Valley 2015
 
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
 
Top Considerations For Operating a Kubernetes Environment at Scale
Top Considerations For Operating a Kubernetes Environment at ScaleTop Considerations For Operating a Kubernetes Environment at Scale
Top Considerations For Operating a Kubernetes Environment at Scale
 
DevDay 2018: Martin Schurz - Aufbau einer Monitoringlösung für moderne Applik...
DevDay 2018: Martin Schurz - Aufbau einer Monitoringlösung für moderne Applik...DevDay 2018: Martin Schurz - Aufbau einer Monitoringlösung für moderne Applik...
DevDay 2018: Martin Schurz - Aufbau einer Monitoringlösung für moderne Applik...
 
Keynote
KeynoteKeynote
Keynote
 
Yannis Zarkadas. Enterprise data science workflows on kubeflow
Yannis Zarkadas. Enterprise data science workflows on kubeflowYannis Zarkadas. Enterprise data science workflows on kubeflow
Yannis Zarkadas. Enterprise data science workflows on kubeflow
 
APIdays Paris 2018 - Cloud computing - we went through every steps of the Gar...
APIdays Paris 2018 - Cloud computing - we went through every steps of the Gar...APIdays Paris 2018 - Cloud computing - we went through every steps of the Gar...
APIdays Paris 2018 - Cloud computing - we went through every steps of the Gar...
 
Opening Keynote
Opening KeynoteOpening Keynote
Opening Keynote
 
PAM3: Machine Learning in the Railway Industry ( Predix Transform 2016)
PAM3: Machine Learning in the Railway Industry ( Predix Transform 2016)PAM3: Machine Learning in the Railway Industry ( Predix Transform 2016)
PAM3: Machine Learning in the Railway Industry ( Predix Transform 2016)
 
Build A Better Way to Deliver IT
Build A Better Way to Deliver ITBuild A Better Way to Deliver IT
Build A Better Way to Deliver IT
 
10 Steps to Cloud Happiness
10 Steps to Cloud Happiness10 Steps to Cloud Happiness
10 Steps to Cloud Happiness
 
R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming AnalyticsR, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
R, Spark, Tensorflow, H20.ai Applied to Streaming Analytics
 
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
 
vSEC pro CISCO ACI
vSEC pro CISCO ACIvSEC pro CISCO ACI
vSEC pro CISCO ACI
 
3 reasons to pick a time series platform for monitoring dev ops driven contai...
3 reasons to pick a time series platform for monitoring dev ops driven contai...3 reasons to pick a time series platform for monitoring dev ops driven contai...
3 reasons to pick a time series platform for monitoring dev ops driven contai...
 
Supersonic, Subatomic, Kubernetes Native Java : Microservices Day Dallas
Supersonic, Subatomic, Kubernetes Native Java : Microservices Day DallasSupersonic, Subatomic, Kubernetes Native Java : Microservices Day Dallas
Supersonic, Subatomic, Kubernetes Native Java : Microservices Day Dallas
 

Similaire à Efficient IT operations using monitoring systems and standardized tools - Icinga Camp Zurich 2019

How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...InfluxData
 
Presentation predictive maintenance solution with IoT and machine learning_SE...
Presentation predictive maintenance solution with IoT and machine learning_SE...Presentation predictive maintenance solution with IoT and machine learning_SE...
Presentation predictive maintenance solution with IoT and machine learning_SE...Larbi OUIYZME
 
Case Study: Increasing Produban's Critical Systems Availability and Performance
Case Study: Increasing Produban's Critical Systems Availability and PerformanceCase Study: Increasing Produban's Critical Systems Availability and Performance
Case Study: Increasing Produban's Critical Systems Availability and PerformanceCA Technologies
 
Neev Application Performance Management Services
Neev Application Performance Management ServicesNeev Application Performance Management Services
Neev Application Performance Management ServicesNeev Technologies
 
On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...Jorge Cardoso
 
Digital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingDigital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingBigML, Inc
 
Empowering SmartCloud APM - Predictive Insights and Analysis: A Use Case Scen...
Empowering SmartCloud APM - Predictive Insights and Analysis: A Use Case Scen...Empowering SmartCloud APM - Predictive Insights and Analysis: A Use Case Scen...
Empowering SmartCloud APM - Predictive Insights and Analysis: A Use Case Scen...Prolifics
 
Meeting the challenges to adopt visual production management systems hms-whit...
Meeting the challenges to adopt visual production management systems hms-whit...Meeting the challenges to adopt visual production management systems hms-whit...
Meeting the challenges to adopt visual production management systems hms-whit...Ariel Lerer
 
DataProphet Building with AI/ML - AWS Startup Day Johannesburg.pdf
DataProphet Building with AI/ML - AWS Startup Day Johannesburg.pdfDataProphet Building with AI/ML - AWS Startup Day Johannesburg.pdf
DataProphet Building with AI/ML - AWS Startup Day Johannesburg.pdfAmazon Web Services
 
O.M.S. High Tech CNC parts
O.M.S. High Tech CNC partsO.M.S. High Tech CNC parts
O.M.S. High Tech CNC partsO.M.S. s.r.l.
 
Internet of Things Microservices
Internet of Things MicroservicesInternet of Things Microservices
Internet of Things MicroservicesCapgemini
 
Pmo slides jun2010
Pmo slides jun2010Pmo slides jun2010
Pmo slides jun2010Steve Turner
 
10 good reasons to go for model-based systems engineering in your organization
10 good reasons to go for model-based systems engineering in your organization10 good reasons to go for model-based systems engineering in your organization
10 good reasons to go for model-based systems engineering in your organizationSiemens PLM Software
 
Optimizing connected system performance md&m-anaheim-sandhi bhide 02-07-2017
Optimizing connected system performance md&m-anaheim-sandhi bhide 02-07-2017Optimizing connected system performance md&m-anaheim-sandhi bhide 02-07-2017
Optimizing connected system performance md&m-anaheim-sandhi bhide 02-07-2017sandhibhide
 
Waste Management Overflow System using IoT and Classification using Data Mining
Waste Management Overflow System using IoT and Classification using Data MiningWaste Management Overflow System using IoT and Classification using Data Mining
Waste Management Overflow System using IoT and Classification using Data MiningIRJET Journal
 

Similaire à Efficient IT operations using monitoring systems and standardized tools - Icinga Camp Zurich 2019 (20)

How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
How to Improve Data Labels and Feedback Loops Through High-Frequency Sensor A...
 
Presentation predictive maintenance solution with IoT and machine learning_SE...
Presentation predictive maintenance solution with IoT and machine learning_SE...Presentation predictive maintenance solution with IoT and machine learning_SE...
Presentation predictive maintenance solution with IoT and machine learning_SE...
 
Case Study: Increasing Produban's Critical Systems Availability and Performance
Case Study: Increasing Produban's Critical Systems Availability and PerformanceCase Study: Increasing Produban's Critical Systems Availability and Performance
Case Study: Increasing Produban's Critical Systems Availability and Performance
 
Neev Application Performance Management Services
Neev Application Performance Management ServicesNeev Application Performance Management Services
Neev Application Performance Management Services
 
On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...On the Application of AI for Failure Management: Problems, Solutions and Algo...
On the Application of AI for Failure Management: Problems, Solutions and Algo...
 
Mathworks CAE simulation suite – case in point from automotive and aerospace.
Mathworks CAE simulation suite – case in point from automotive and aerospace.Mathworks CAE simulation suite – case in point from automotive and aerospace.
Mathworks CAE simulation suite – case in point from automotive and aerospace.
 
Digital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in ManufacturingDigital Transformation and Process Optimization in Manufacturing
Digital Transformation and Process Optimization in Manufacturing
 
Empowering SmartCloud APM - Predictive Insights and Analysis: A Use Case Scen...
Empowering SmartCloud APM - Predictive Insights and Analysis: A Use Case Scen...Empowering SmartCloud APM - Predictive Insights and Analysis: A Use Case Scen...
Empowering SmartCloud APM - Predictive Insights and Analysis: A Use Case Scen...
 
Meeting the challenges to adopt visual production management systems hms-whit...
Meeting the challenges to adopt visual production management systems hms-whit...Meeting the challenges to adopt visual production management systems hms-whit...
Meeting the challenges to adopt visual production management systems hms-whit...
 
DataProphet Building with AI/ML - AWS Startup Day Johannesburg.pdf
DataProphet Building with AI/ML - AWS Startup Day Johannesburg.pdfDataProphet Building with AI/ML - AWS Startup Day Johannesburg.pdf
DataProphet Building with AI/ML - AWS Startup Day Johannesburg.pdf
 
Energy Management Solution - iARMS-EMS/PMS
Energy Management Solution - iARMS-EMS/PMSEnergy Management Solution - iARMS-EMS/PMS
Energy Management Solution - iARMS-EMS/PMS
 
O.M.S. High Tech CNC parts
O.M.S. High Tech CNC partsO.M.S. High Tech CNC parts
O.M.S. High Tech CNC parts
 
Innoslate 4.5 and Sopatra
Innoslate 4.5 and SopatraInnoslate 4.5 and Sopatra
Innoslate 4.5 and Sopatra
 
Internet of Things Microservices
Internet of Things MicroservicesInternet of Things Microservices
Internet of Things Microservices
 
Pmo slides jun2010
Pmo slides jun2010Pmo slides jun2010
Pmo slides jun2010
 
10 good reasons to go for model-based systems engineering in your organization
10 good reasons to go for model-based systems engineering in your organization10 good reasons to go for model-based systems engineering in your organization
10 good reasons to go for model-based systems engineering in your organization
 
Optimizing connected system performance md&m-anaheim-sandhi bhide 02-07-2017
Optimizing connected system performance md&m-anaheim-sandhi bhide 02-07-2017Optimizing connected system performance md&m-anaheim-sandhi bhide 02-07-2017
Optimizing connected system performance md&m-anaheim-sandhi bhide 02-07-2017
 
Sap education knoa
Sap education   knoa Sap education   knoa
Sap education knoa
 
The ZDLC Brief
The ZDLC BriefThe ZDLC Brief
The ZDLC Brief
 
Waste Management Overflow System using IoT and Classification using Data Mining
Waste Management Overflow System using IoT and Classification using Data MiningWaste Management Overflow System using IoT and Classification using Data Mining
Waste Management Overflow System using IoT and Classification using Data Mining
 

Plus de Icinga

Upgrading Incident Management with Icinga - Icinga Camp Milan 2023
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023Upgrading Incident Management with Icinga - Icinga Camp Milan 2023
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023Icinga
 
Extending Icinga Web with Modules: powerful, smart and easily created - Icing...
Extending Icinga Web with Modules: powerful, smart and easily created - Icing...Extending Icinga Web with Modules: powerful, smart and easily created - Icing...
Extending Icinga Web with Modules: powerful, smart and easily created - Icing...Icinga
 
Infrastructure Monitoring for Cloud Native Enterprises - Icinga Camp Milan 2023
Infrastructure Monitoring for Cloud Native Enterprises - Icinga Camp Milan 2023Infrastructure Monitoring for Cloud Native Enterprises - Icinga Camp Milan 2023
Infrastructure Monitoring for Cloud Native Enterprises - Icinga Camp Milan 2023Icinga
 
Incident management: Best industry practices your team should know - Icinga C...
Incident management: Best industry practices your team should know - Icinga C...Incident management: Best industry practices your team should know - Icinga C...
Incident management: Best industry practices your team should know - Icinga C...Icinga
 
Monitoring Cooling Units in a pharmaceutical GxP regulated environment - Icin...
Monitoring Cooling Units in a pharmaceutical GxP regulated environment - Icin...Monitoring Cooling Units in a pharmaceutical GxP regulated environment - Icin...
Monitoring Cooling Units in a pharmaceutical GxP regulated environment - Icin...Icinga
 
SNMP Monitoring at scale - Icinga Camp Milan 2023
SNMP Monitoring at scale - Icinga Camp Milan 2023SNMP Monitoring at scale - Icinga Camp Milan 2023
SNMP Monitoring at scale - Icinga Camp Milan 2023Icinga
 
Monitoring Kubernetes with Icinga - Icinga Camp Milan 2023
Monitoring Kubernetes with Icinga - Icinga Camp Milan 2023Monitoring Kubernetes with Icinga - Icinga Camp Milan 2023
Monitoring Kubernetes with Icinga - Icinga Camp Milan 2023Icinga
 
Current State of Icinga - Icinga Camp Milan 2023
Current State of Icinga - Icinga Camp Milan 2023Current State of Icinga - Icinga Camp Milan 2023
Current State of Icinga - Icinga Camp Milan 2023Icinga
 
Signalilo: Visualizing Prometheus alerts in Icinga2 - Icinga Camp Zurich 2019
Signalilo: Visualizing Prometheus alerts in Icinga2 - Icinga Camp Zurich 2019Signalilo: Visualizing Prometheus alerts in Icinga2 - Icinga Camp Zurich 2019
Signalilo: Visualizing Prometheus alerts in Icinga2 - Icinga Camp Zurich 2019Icinga
 
Moving from Icinga 1 to Icinga 2 + Director - Icinga Camp Zurich 2019
Moving from Icinga 1 to Icinga 2 + Director - Icinga Camp Zurich 2019Moving from Icinga 1 to Icinga 2 + Director - Icinga Camp Zurich 2019
Moving from Icinga 1 to Icinga 2 + Director - Icinga Camp Zurich 2019Icinga
 
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019
Icinga Director and vSphereDB - how they play together - Icinga Camp Zurich 2019Icinga
 
Current State of Icinga - Icinga Camp Zurich 2019
Current State of Icinga - Icinga Camp Zurich 2019Current State of Icinga - Icinga Camp Zurich 2019
Current State of Icinga - Icinga Camp Zurich 2019Icinga
 
NetEye 4 based on Icinga 2 - Icinga Camp Milan 2019
NetEye 4 based on Icinga 2 - Icinga Camp Milan 2019NetEye 4 based on Icinga 2 - Icinga Camp Milan 2019
NetEye 4 based on Icinga 2 - Icinga Camp Milan 2019Icinga
 
Integrating Icinga 2 and ntopng - Icinga Camp Milan 2019
Integrating Icinga 2 and ntopng - Icinga Camp Milan 2019Integrating Icinga 2 and ntopng - Icinga Camp Milan 2019
Integrating Icinga 2 and ntopng - Icinga Camp Milan 2019Icinga
 
DevOps monitoring: Best Practices using OpenShift combined with Icinga & Big ...
DevOps monitoring: Best Practices using OpenShift combined with Icinga & Big ...DevOps monitoring: Best Practices using OpenShift combined with Icinga & Big ...
DevOps monitoring: Best Practices using OpenShift combined with Icinga & Big ...Icinga
 
Current State of Icinga - Icinga Camp Milan 2019
Current State of Icinga - Icinga Camp Milan 2019Current State of Icinga - Icinga Camp Milan 2019
Current State of Icinga - Icinga Camp Milan 2019Icinga
 
Best of Icinga Modules - Icinga Camp Milan 2019
Best of Icinga Modules - Icinga Camp Milan 2019Best of Icinga Modules - Icinga Camp Milan 2019
Best of Icinga Modules - Icinga Camp Milan 2019Icinga
 
hallenges of Monitoring Big Infrastructure - Icinga Camp Milan 2019
hallenges of Monitoring Big Infrastructure - Icinga Camp Milan 2019hallenges of Monitoring Big Infrastructure - Icinga Camp Milan 2019
hallenges of Monitoring Big Infrastructure - Icinga Camp Milan 2019Icinga
 
Discover the real user experience with Alyvix - Icinga Camp Milan 2019
Discover the real user experience with Alyvix - Icinga Camp Milan 2019Discover the real user experience with Alyvix - Icinga Camp Milan 2019
Discover the real user experience with Alyvix - Icinga Camp Milan 2019Icinga
 
Current State of Logmanagement with Icinga - Icinga Camp Stockholm 2019
Current State of Logmanagement with Icinga - Icinga Camp Stockholm 2019Current State of Logmanagement with Icinga - Icinga Camp Stockholm 2019
Current State of Logmanagement with Icinga - Icinga Camp Stockholm 2019Icinga
 

Plus de Icinga (20)

Upgrading Incident Management with Icinga - Icinga Camp Milan 2023
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023Upgrading Incident Management with Icinga - Icinga Camp Milan 2023
Upgrading Incident Management with Icinga - Icinga Camp Milan 2023
 
Extending Icinga Web with Modules: powerful, smart and easily created - Icing...
Extending Icinga Web with Modules: powerful, smart and easily created - Icing...Extending Icinga Web with Modules: powerful, smart and easily created - Icing...
Extending Icinga Web with Modules: powerful, smart and easily created - Icing...
 
Infrastructure Monitoring for Cloud Native Enterprises - Icinga Camp Milan 2023
Infrastructure Monitoring for Cloud Native Enterprises - Icinga Camp Milan 2023Infrastructure Monitoring for Cloud Native Enterprises - Icinga Camp Milan 2023
Infrastructure Monitoring for Cloud Native Enterprises - Icinga Camp Milan 2023
 
Incident management: Best industry practices your team should know - Icinga C...
Incident management: Best industry practices your team should know - Icinga C...Incident management: Best industry practices your team should know - Icinga C...
Incident management: Best industry practices your team should know - Icinga C...
 
Monitoring Cooling Units in a pharmaceutical GxP regulated environment - Icin...
Monitoring Cooling Units in a pharmaceutical GxP regulated environment - Icin...Monitoring Cooling Units in a pharmaceutical GxP regulated environment - Icin...
Monitoring Cooling Units in a pharmaceutical GxP regulated environment - Icin...
 
SNMP Monitoring at scale - Icinga Camp Milan 2023
Efficient IT operations using monitoring systems and standardized tools - Icinga Camp Zurich 2019

  • 1. KMG Group GmbH, http://www.kmggroup.ch. Magnus Lübeck, Zürich, 2019-11-12. http://kmg.group. Icinga Day Zürich 2019.
  • 2. This is me: Sysadmin since the '90s. Unix/Oracle at Volvo; pre-sales at a Sun Microsystems reseller; Oracle DBA at CERN; IT Operations manager at Accarda; IT Operations manager at Kanton LU; owner of KMG Group GmbH. Built infrastructure and operations at Swisscom, peaq, and Serafe.
  • 5. Outline: Quick overview; people, tools and processes; the four fielder; telemetry and health; desire lines; OSS and free software in modern operations environments; tool landscape; Icinga's part in the mechano.
  • 6. The stack: a game of Tetris.
  • 8. Metrics (/status, /health): Telemetry is part of good systems design. Measurement points should be a mandatory part of EVERY system. This has been known for many years, across many industries.
  • 9. Metrics (/status, /health): The use of waveforms to diagnose broken things is far from new. The triangular form is particularly useful: it can be used in many ways and is very useful for repetitive patterns.
  • 12. Operational tools: A fool with a tool is still a fool. Get smart people. Use tools. Integrate the tools with your environment. Tools can cost money, but they do not have to.
  • 13. Processes (desire lines): Implement simple processes. Use the right tools, and don't make the processes complicated.
  • 15. Standards: Naming conventions (no servers named after porn stars). Baseline installations (minimal OS install). Automation / infrastructure as code (Ansible, Chef, Puppet). Coding guidelines.
  • 16. Shameless plug, SLA check: Inception on so many levels. Deals with “less than 24/7” SLAs. You can service-check your SLA.
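
A “less than 24/7” SLA can be checked by counting only the time that falls inside the agreed service hours. The following is a minimal sketch of that idea, not the SLA check shown in the talk: the Monday-to-Friday 08:00-18:00 window, the function names, and the outage list format are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical service window (assumption, not from the talk):
# Monday-Friday, 08:00-18:00.
SERVICE_DAYS = {0, 1, 2, 3, 4}   # datetime.weekday(): Monday is 0
SERVICE_START_HOUR = 8
SERVICE_END_HOUR = 18


def in_service_hours(ts: datetime) -> bool:
    """True if the timestamp falls inside the agreed service window."""
    return (ts.weekday() in SERVICE_DAYS
            and SERVICE_START_HOUR <= ts.hour < SERVICE_END_HOUR)


def sla_availability(period_start, period_end, outages,
                     step=timedelta(minutes=1)):
    """Availability in percent, counting only slots inside service hours.

    outages is a list of (start, end) datetime pairs; end is exclusive.
    """
    service_slots = 0
    down_slots = 0
    ts = period_start
    while ts < period_end:
        if in_service_hours(ts):
            service_slots += 1
            if any(start <= ts < end for start, end in outages):
                down_slots += 1
        ts += step
    if service_slots == 0:
        return 100.0  # nothing was promised in this period
    return 100.0 * (service_slots - down_slots) / service_slots


# One Monday with a night-time outage (ignored) and a one-hour
# outage during service hours (counted).
monday = datetime(2019, 11, 11)
outages = [
    (monday.replace(hour=2), monday.replace(hour=4)),
    (monday.replace(hour=9), monday.replace(hour=10)),
]
print(sla_availability(monday, monday + timedelta(days=1), outages))  # 90.0
```

A service check built on such a function would compare the result against the agreed SLA percentage and report the usual plugin states.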
  • 17. The four fielder: Technical Monitoring, Telemetry, Operational tools, Inventories, Configuration Management, Admin GUI, Orchestration, IAM, Ticketing, Dashboarding, Documentation, Remote Access, Code repository, Artifact Repository, Application Specific Tools (×4), SLA Monitoring. Audience, Spectrum.
  • 18. Tool landscape: Remote access; systems monitoring; documentation; identity management; ticketing; inventory (not CMDB); automation/orchestration; telemetry (technical performance monitoring); dashboarding; technical tools (sysadmin toolbox); SLA monitoring. The slide text then repeats the four-fielder labels from slide 17.
  • 19. Monitoring theory: Bad design reduces the value of your monitoring. A customer of mine had 8’500 open critical alerts and 15’300 warnings, a typical “cry wolf” scenario. There are 3 possible/allowed actions: solve the problem; change the threshold (change the metric, template, standard); or remove the alert.
  • 20. Application service check responsibility: devOps or stoneAgeOps? Move the responsibility for delivering telemetry to the application designers and the application owners. Help them learn how to write service checks. A service delivery is not complete unless telemetry and monitoring packages are delivered.
  • 21. The big audit monster: Question from an auditor (ISO 27001 audit): “How do you ensure that all applications work after a patch run?” My answer: “We don’t.”
  • 24. One stop shop Icinga: Service Checks, Application (×7), Scheduled Tasks, Notifications, Signage, Raspberry Pi 4 with 2 screens, Dashboard, Smashing, Telemetry and logging.
  • 26. Icinga client and service registration: Manually edit the config (use this while you learn Icinga). Good ways to do it: automate through an Icinga-centric configuration repository (Director); use the Icinga API and write the integration yourself; automate with Ansible. Metamonitoring: by using your inventory, you know what you are monitoring and what you are not monitoring.
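
The metamonitoring point on this slide boils down to a set comparison between the inventory and the monitoring system. A minimal sketch, assuming both sides can be exported as plain lists of host names; the function name and the example host names are hypothetical, and fetching the lists from a real inventory or from Icinga is left out:

```python
def monitoring_gaps(inventory_hosts, monitored_hosts):
    """Compare inventory against monitoring and report both kinds of gap.

    Host names are compared case-insensitively.
    """
    inventory = {h.lower() for h in inventory_hosts}
    monitored = {h.lower() for h in monitored_hosts}
    return {
        # In the inventory, but nobody is watching it.
        "not_monitored": sorted(inventory - monitored),
        # Monitored, but unknown to the inventory (stale or undocumented).
        "not_in_inventory": sorted(monitored - inventory),
    }


# Hypothetical example data.
gaps = monitoring_gaps(
    inventory_hosts=["db01", "web01", "web02"],
    monitored_hosts=["DB01", "web01", "old-test-vm"],
)
print(gaps["not_monitored"])     # ['web02']
print(gaps["not_in_inventory"])  # ['old-test-vm']
```

Run as a scheduled service check, a non-empty "not_monitored" list is itself something worth alerting on.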
  • 27. The Layered Cake: The layer cake is your monitoring standard, grouped by common denominators. Group service checks in layers (e.g. L0 to L5). L0, OS level (Linux admin): CPU, disk usage, ssh, ping, fs usage {/, /var, /home}. L1, server type, shared OS resources (Linux admin): IOPS on db fs, fs usage on /app/ora. … L5, application checks (Application Managers): application-specific checks.
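
The layer cake can be sketched as a lookup table from layer to owner and checks, from which the full check list for a host is assembled. This is only an illustration under assumptions: the check identifiers, the "oracle-db" and "billing-app" names, and the function name are hypothetical, and layers L2 to L4 are left out because the slide elides them.

```python
# Layers L0, L1 and L5 follow the slide; everything between is elided there.
# Keys and check identifiers are hypothetical naming choices.
LAYERS = {
    "L0": ("Linux admin", [
        "cpu", "disk_usage", "ssh", "ping",
        "fs_usage:/", "fs_usage:/var", "fs_usage:/home",
    ]),
    "L1:oracle-db": ("Linux admin", [
        "iops:db_fs", "fs_usage:/app/ora",
    ]),
    "L5:billing-app": ("Application manager", [
        "billing_app_check",
    ]),
}


def checks_for(host_layers):
    """Return (layer, owner, check) tuples for the layers assigned to a host,
    ordered from the lowest layer upwards."""
    result = []
    for layer in sorted(host_layers):
        owner, checks = LAYERS[layer]
        result.extend((layer, owner, check) for check in checks)
    return result


# A hypothetical database server gets the OS layer plus its server-type layer.
for layer, owner, check in checks_for(["L1:oracle-db", "L0"]):
    print(layer, owner, check)
```

Keeping the owner next to each layer makes it explicit who is responsible when a check in that layer fires.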
  • 28. Ingredient number 2, sawtooth waveform: The human brain is excellent at identifying harmonies and regularities.