OSMC 2012
Jean Gabès
Why monitor ?
When IT goes bad, it can be dangerous for the business
Or even more....
So to save the world business :
Monitoring tools !
Many of them :
For pure IT monitoring, Nagios™® has been the reference for the last 10 years ...
… thanks to several modules
● Mod_gearman : LAN distribution
● LiveStatus : data access
● Thruk/Multisite/NagVis : real-time view
● PNP, Graphite : graphs
Plugins & modularity ARE great !
But maybe now it's not enough ?
IT is getting bigger & bigger
With multiple layers (physical, network,
virtual, …)
With lots of clusters everywhere
And distant sites
Classic IT monitoring difficulties
● Too much load (plugins, notification latency, ...)
● Hard to maintain configuration
● Distant site lost ?
● High availability
Perfect architecture
Yes you can stack Nagios modules & scripts to
nearly solve this ...
… or you can just use Shinken :)
Shinken is a full Nagios™® rewrite in Python
Shinken
Huge community activity
Icinga
Dedicated Linux Mag issue in France (July)
With Shinken, by design :
● RAID-like high availability
● Multi-level load balancing (DMZ, LAN, inter-datacenters)
● Multiplatform (yes, it also means Windows, and even Android)
● Good speed
● Business rules in the core (& | Xof:)
No more problems for setting up the monitoring,
so what if we look at 2012+ admin problems ?
Configuration simplification
● Escalations defined with templates
● Recurring downtimes are just a timeperiod to set for a host
● Easier service dependency definitions
Virtualization is everywhere
If an ESX host crashes, you don't want to receive 30+ « host down » alerts for the VMs on it !
You only want the one for the ESX host. OK, easy : just set up host dependencies :)
But you know that VMware admins are funny guys
They « VMotion » VMs as often as a Perl coder types $_
So forget about flat-file host dependency configuration :)
You can just use the Shinken VMware module. You only need check_esx3.pl for it (thanks to the OP5 guys!)
OK, but what about reducing the « worst thing for an admin » ?
It's not coffee/beer outage...
False alerts !
Example : is a critical on testing really that critical ?
Rule N°1 for the admins :
Never touch production on a Friday
Production is all that matters
Rule N°2  :
Production is all that matters
Is a critical on testing really that critical ? → NO !
Ok easy : notifications_enabled
More complex, but more « real world » : a
production switch breaks a testing app
Do you need to wake the admin at 3 AM for this ?? No !
The key is the root problem analysis + business
impact level on « apps »
And what about time-based importance ? For example : your paid service is only « important » 3 days a month
Business impacts modulations
define businessimpactmodulation{
    business_impact_modulation_name    Paid_IS_Important
    business_impact                    5
    modulation_period                  PaidPeriod    ; 3 days period
}
define service{
    service_description            Paid
    use                            generic-service
    check_command                  bp_rule! paie-srv,bdd & paie-srv,http
    host_name                      Applications
    business_impact                3
    business_impact_modulations    Paid_IS_Important
}
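The effect of such a modulation can be sketched in a few lines of Python. This is only an illustration and not Shinken's own code: the `Modulation` class, the `effective_impact` function, the rule "the first active modulation wins" and the choice of days 1 to 3 of the month for `PaidPeriod` are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass
class Modulation:
    """Hypothetical stand-in for a businessimpactmodulation object."""
    name: str
    business_impact: int
    is_active: Callable[[date], bool]  # stands in for modulation_period


def effective_impact(base: int, modulations: List[Modulation], today: date) -> int:
    """Return the modulated impact if a modulation is active, else the base one."""
    for mod in modulations:
        if mod.is_active(today):
            # Assumption of this sketch: the first active modulation wins.
            return mod.business_impact
    return base


# Assumption: PaidPeriod covers the first three days of each month.
paid_is_important = Modulation("Paid_IS_Important", 5, lambda d: d.day <= 3)

print(effective_impact(3, [paid_is_important], date(2012, 10, 2)))   # 5
print(effective_impact(3, [paid_is_important], date(2012, 10, 17)))  # 3
```

Outside the period the service keeps its base business_impact of 3; inside it, the service is raised to 5.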
Strong differences in Shinken between root
problems & impacts
And between « importance » levels, more than
just warning/critical
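The root problem / impact split can be illustrated with a small Python sketch. It is not how the Shinken core is written: `split_problems`, the state names and the host names are hypothetical, and the rule used here (an element is an impact as soon as one element it depends on is not OK) is a simplifying assumption.

```python
from typing import Dict, List, Set, Tuple


def split_problems(
    states: Dict[str, str], parents: Dict[str, List[str]]
) -> Tuple[Set[str], Set[str]]:
    """Split non-OK elements into root problems and impacts.

    An element is counted as an impact if at least one element it depends
    on is itself not OK; otherwise it is a root problem.
    """
    roots: Set[str] = set()
    impacts: Set[str] = set()
    for name, state in states.items():
        if state == "OK":
            continue
        failing_parents = [
            p for p in parents.get(name, []) if states.get(p, "OK") != "OK"
        ]
        if failing_parents:
            impacts.add(name)
        else:
            roots.add(name)
    return roots, impacts


# Hypothetical situation: a production switch breaks a testing app.
states = {"prod-switch": "DOWN", "testing-app": "CRITICAL", "prod-db": "OK"}
parents = {"testing-app": ["prod-switch"]}

roots, impacts = split_problems(states, parents)
print(roots)    # {'prod-switch'}
print(impacts)  # {'testing-app'}
```

Only the root problem needs to reach the admin; whether it does so at 3 AM then depends on the business impact of what it breaks.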
OK for notifications, but what about UIs ?
The Shinken WebUI has its own philosophy
● Strong separation between problems & impacts
● Focus on (huge) business impacts
● Dependencies are the key, show them all !
● Aggregate all load balanced elements
● HA by design
● Very “visual” (dependencies, alerts, graphs)
● HTML5 everywhere (sorry for IE6...)
● Only useful information is shown, the rest is hidden by default
● Linkable to other UIs (PNP, Graphite) as modules
● Even your boss will understand it
● And so will night shift operators !
Two main (incompatible) user types
● Boss : wants to see the impacts on end-user apps (and why they are down...)
● Admins : want to see which IT elements are the problems
● Root problems VS impacts view
● No one wants to see both
● Everything is sorted by business impact, of course
What your boss will see
What the admin will see
And if the admin wants to show why it's so important
Both will understand the dep graph
And each one can have their own dashboard, with its own widgets
● To test it : demo-shinken.web4all.fr
● Like Shinken, the UI is modular (like PNP or Graphite inclusion)
OK, we see what we need to see, and only this.
Great.
But the heaviest task is still there : we need to add our new hosts to it :p
Fact : templates are GREAT !
● Shinken extends the Nagios configuration logic
● Services on hostgroups were good, but why add a server to the linux hostgroup if you already “link” it with the linux template?
It can be great to have complex expressions like « Linux&Prod » for service linking
We can simply « tag » our hosts instead of multiplying our hostgroups (linux,production tags instead of linux,production,linuxproductions groups)
● O(n) data versus O(n²)
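To make the tag idea concrete, here is a minimal Python sketch of matching an expression like « Linux&Prod » against host tags. It is not Shinken's parser: the `matches` function and the host names are hypothetical, and the sketch assumes only `&` and `|` without parentheses, with `&` binding tighter than `|`.

```python
from typing import Dict, Set


def matches(expression: str, tags: Set[str]) -> bool:
    """Evaluate a tag expression such as 'Linux&Prod' or 'Linux&Prod|Windows'.

    Assumptions of this sketch: only '&' and '|' are supported, there are no
    parentheses, '&' binds tighter than '|', and matching ignores case.
    """
    lowered = {t.lower() for t in tags}
    for alternative in expression.split("|"):
        wanted = [term.strip().lower() for term in alternative.split("&")]
        if all(term in lowered for term in wanted):
            return True
    return False


# Hypothetical hosts: each one carries two tags, no combined group is needed.
hosts: Dict[str, Set[str]] = {
    "srv-lin-1": {"linux", "prod"},
    "srv-lin-2": {"linux", "testing"},
    "srv-win-1": {"windows", "prod"},
}

selected = sorted(h for h, tags in hosts.items() if matches("Linux&Prod", tags))
print(selected)  # ['srv-lin-1']
```

Each host only declares its own tags; the combinations live in the expressions, not in extra hostgroups.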
● Too many service definitions
● You can't avoid host definitions, but you can try to reduce the number of services you define
● Let's move from service-centric data to host-centric data
● Which disk volumes to check is host data, not service data
● Which databases to check is host data, not service data
● Move configuration data back from the services to the hosts
● Fewer services defined, more template usage
● More host custom macros
● Key : duplicate_foreach keyword in Shinken
● Generate a service for each « value » in a custom macro
define host{
    host_name    srv-lin-1
    use          linux
    _disks       /, /var, /data
}
define service {
    host_name              linux
    register               0
    service_description    Disk $KEY$
    check_command          check_disk!$KEY$
    duplicate_foreach      _disks
}
define host{
    host_name    big-switch-stack
    use          switch
    _ports       Unit [1-6] Port [1-48]
}
define service {
    host_name              switch
    register               0
    service_description    Port $KEY$
    check_command          check_port!$KEY$
    duplicate_foreach      _ports
}
You will have : 6*48 = 288 services with one definition!
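The expansion behind duplicate_foreach can be sketched in Python as follows. This is an illustration under assumptions, not the Shinken implementation: `expand_keys` and `duplicate_foreach` are hypothetical helpers, values are assumed to be comma separated, and `[a-b]` is assumed to mean an inclusive integer range.

```python
import itertools
import re
from typing import Dict, List

RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_keys(macro_value: str) -> List[str]:
    """Expand a custom macro value into the list of keys it describes."""
    keys: List[str] = []
    # Assumption: several values or patterns are separated by commas.
    for item in (part.strip() for part in macro_value.split(",")):
        if not item:
            continue
        # Assumption: [a-b] is an inclusive integer range.
        ranges = [range(int(lo), int(hi) + 1) for lo, hi in RANGE.findall(item)]
        template = RANGE.sub("{}", item)
        # With no range at all, product() yields one empty combination,
        # so a plain value such as "/var" is kept as it is.
        for combo in itertools.product(*ranges):
            keys.append(template.format(*combo))
    return keys


def duplicate_foreach(description: str, command: str, macro_value: str) -> List[Dict[str, str]]:
    """Generate one service per key by replacing $KEY$ in both fields."""
    return [
        {
            "service_description": description.replace("$KEY$", key),
            "check_command": command.replace("$KEY$", key),
        }
        for key in expand_keys(macro_value)
    ]


disks = duplicate_foreach("Disk $KEY$", "check_disk!$KEY$", "/, /var, /data")
ports = duplicate_foreach("Port $KEY$", "check_port!$KEY$", "Unit [1-6] Port [1-48]")

print([s["service_description"] for s in disks])  # ['Disk /', 'Disk /var', 'Disk /data']
print(len(ports))                                 # 288
```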
Fact : a good IT guy IS lazy
Fact : admins are good IT guys!
So an admin doesn't want to :
● Write plugins from scratch
● Manually tag their hosts
● Write the .cfg files for a new server flavor either
● (in fact, all they want is for systems to run by themselves so they can go get coffee)
Why manually fill in tags or custom macros for your hosts, when you can write rules for that?
Example : IP range based rule module. If the host is in an IP range, you can automatically add a property to it :
● If in DMZ : will be checked by a DMZ poller
● If in testing LAN : no notifications
● If behind a router : add the router as parent
Example : IP range based rule module.
define module{
    module_name    Ip_VLAN_10
    module_type    ip_tag
    ip_range       10.0.100.0/24
    property       parents
    value          gw_vlan_100
    method         replace
}
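What such a rule does can be sketched with Python's standard `ipaddress` module. This is not the module's source: `apply_ip_tag` and the host dictionaries are hypothetical, and the `append` method shown next to `replace` is an assumption of the sketch.

```python
import ipaddress
from typing import Dict


def apply_ip_tag(
    host: Dict[str, str], ip_range: str, prop: str, value: str, method: str = "replace"
) -> Dict[str, str]:
    """Return a copy of host with prop set when its address is in ip_range."""
    result = dict(host)
    if ipaddress.ip_address(host["address"]) in ipaddress.ip_network(ip_range):
        if method == "replace" or not result.get(prop):
            result[prop] = value
        elif method == "append":
            # Assumption: "append" adds the value to a comma-separated list.
            result[prop] = result[prop] + "," + value
    return result


# Hypothetical hosts, one inside the range and one outside.
in_range = {"host_name": "srv-vlan-1", "address": "10.0.100.12"}
out_of_range = {"host_name": "srv-lan-1", "address": "192.168.1.5"}

print(apply_ip_tag(in_range, "10.0.100.0/24", "parents", "gw_vlan_100"))
print(apply_ip_tag(out_of_range, "10.0.100.0/24", "parents", "gw_vlan_100"))
```

The first host gets `parents` set to `gw_vlan_100`; the second one is returned unchanged.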
But we still need to « add » hosts....
Shinken discovery !
● Runners : scripts that « scan » and output 'data'
● Rules : read the data and generate hosts/services from it
Ex : the nmap runner scans a host and outputs 'data'
$ nmap_discovery_runner.py -t localhost
localhost::isup=1
localhost::os=linux
localhost::osversion=2.6.x
localhost::osvendor=linux
localhost::macvendor=hp
localhost::openports=22,443,3306
localhost::fqdn=localhost
localhost::ip=127.0.0.1
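The `host::key=value` format above is easy to consume. As a minimal sketch (the `parse_runner_output` helper is hypothetical and is not part of Shinken), the lines can be grouped per host like this:

```python
from collections import defaultdict
from typing import Dict


def parse_runner_output(text: str) -> Dict[str, Dict[str, str]]:
    """Group 'host::key=value' lines into {host: {key: value}}."""
    data: Dict[str, Dict[str, str]] = defaultdict(dict)
    for line in text.splitlines():
        host, sep, rest = line.strip().partition("::")
        key, eq, value = rest.partition("=")
        if not sep or not eq:
            continue  # skip anything that is not host::key=value
        data[host][key] = value
    return dict(data)


# Sample taken from the runner output shown in the slide.
sample = """\
localhost::isup=1
localhost::os=linux
localhost::osversion=2.6.x
localhost::osvendor=linux
localhost::macvendor=hp
localhost::openports=22,443,3306
localhost::fqdn=localhost
localhost::ip=127.0.0.1
"""

data = parse_runner_output(sample)
print(data["localhost"]["os"])         # linux
print(data["localhost"]["openports"])  # 22,443,3306
```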
Sample rule for linux tag
define discoveryrule {
    discoveryrule_name    Linux
    creation_type         host
    os                    linux    ; what we match
    +use                  linux    ; what we write in the object: here,
                                   ; append the linux template
}
Sample rule for Https tag
define discoveryrule {
    discoveryrule_name    Https
    creation_type         host
    openports             443      ; if we got the 443 port ...
    +use                  Https    ; … add the Https template
}
localhost : use ssh,mysql,https,linux
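How a set of rules ends up as that `use` line can be sketched in Python. Everything here is illustrative: the tuple layout of `RULES` and the `templates_for` function are hypothetical, the Ssh and Mysql rules are assumed by analogy with the two rules shown, template names are lowercased to match the result line, and matching one item of a comma-separated value is an assumption about the rule semantics.

```python
from typing import Dict, List, Tuple

# Hypothetical rule layout: (rule name, key to look at, expected value, template to append)
RULES: List[Tuple[str, str, str, str]] = [
    ("Ssh", "openports", "22", "ssh"),        # assumed rule, not shown in the slides
    ("Mysql", "openports", "3306", "mysql"),  # assumed rule, not shown in the slides
    ("Https", "openports", "443", "https"),
    ("Linux", "os", "linux", "linux"),
]


def templates_for(data: Dict[str, str], rules: List[Tuple[str, str, str, str]] = RULES) -> List[str]:
    """Return the templates appended by every rule that matches the data."""
    used: List[str] = []
    for _name, key, expected, template in rules:
        # Assumption: a rule matches if its value is one of the comma-separated items.
        values = [v.strip() for v in data.get(key, "").split(",")]
        if expected in values and template not in used:
            used.append(template)
    return used


localhost = {"os": "linux", "openports": "22,443,3306"}
print("use " + ",".join(templates_for(localhost)))  # use ssh,mysql,https,linux
```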
Multi-level discovery
● 1 If you match a data item
● 2 Launch a new runner
● 3 Apply the new rules
● 4 GOTO 1
Ex : Windows shares discovery
define discoveryrun {
    discoveryrun_name       WindowsShares
    discoveryrun_command    discovery_windows_share
    # And scan only windows detected hosts!
    os                      windows
}
Result
define host {
    host_name    win-srv
    use          windows
    _shares      Work,Public,Private
}
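The multi-level loop can be sketched as follows. This is a simplified model and not the shinken-discovery code: the `discover` function, the tuple layout of a run and the two fake runners are hypothetical, and each run is assumed to be launched at most once per host so that the loop terminates.

```python
from typing import Callable, Dict, List, Set, Tuple

Data = Dict[str, str]
Runner = Callable[[str, Data], Data]
# Hypothetical run layout: (run name, key to match, expected value, runner to launch)
Run = Tuple[str, str, str, Runner]


def discover(host: str, first_runner: Runner, runs: List[Run]) -> Data:
    """Run the first scan, then keep launching runs whose condition matches."""
    data: Data = dict(first_runner(host, {}))
    done: Set[str] = set()
    progress = True
    while progress:  # step 4: GOTO 1
        progress = False
        for name, key, expected, runner in runs:
            if name in done or data.get(key) != expected:  # step 1: match a data item
                continue
            data.update(runner(host, data))  # step 2: launch a new runner
            done.add(name)                   # assumption: each run is launched once
            progress = True                  # step 3: new data, look at the runs again
    return data


def fake_nmap(host: str, data: Data) -> Data:
    """Stand-in for the nmap runner."""
    return {"os": "windows"}


def fake_shares(host: str, data: Data) -> Data:
    """Stand-in for the Windows shares runner."""
    return {"shares": "Work,Public,Private"}


result = discover("win-srv", fake_nmap, [("WindowsShares", "os", "windows", fake_shares)])
print(result)  # {'os': 'windows', 'shares': 'Work,Public,Private'}
```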
CLI launch :
shinken-discovery -c etc/discovery.cfg --db Mongodb -m 'NMAPTARGET=localhost'
Or better : sKonf UI !
sKonf :
● UI for easy configuration management
● Can use discovery or a more « classic » way
● Manages Shinken-specific properties
● (good) beta version for now
Let's get back from configuration to more monitoring logic
Sometimes external check plugins can't help you (for example : a server with a Collectd daemon)
Such purely passive data is hard to manage and « check »
Solution : Triggers (yes, like in Zabbix)
● .trig files (in fact Python source)
● A trigger is linked to hosts/services in the configuration
● Will « run » after a check (or when new passive data arrives)
● Can do whatever it wants in the core!
Sample :
# self = number of users collectd service for the host
nb_users = perf(self, 'users')
warn = int(get_custom(self.host, '_users_warn'))
crit = int(get_custom(self.host, '_users_crit'))
return_code = 0
output = 'Check OK'
if nb_users > warn:
    output = 'Warning : users are too high %s' % nb_users
    return_code = 1
if nb_users > crit:
    output = 'Critical : users are too high %s' % nb_users
    return_code = 2
set_value(self, output=output, return_code=return_code)
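The threshold logic of this trigger can be tried outside of Shinken by isolating it in a plain function. The sketch below does not reimplement `perf`, `get_custom` or `set_value`, which the trigger gets from the core; `evaluate_users` is a hypothetical helper that only mirrors the comparison part.

```python
from typing import Tuple


def evaluate_users(nb_users: float, warn: float, crit: float) -> Tuple[int, str]:
    """Mirror the trigger's threshold logic and return (return_code, output)."""
    return_code = 0
    output = 'Check OK'
    if nb_users > warn:
        output = 'Warning : users are too high %s' % nb_users
        return_code = 1
    if nb_users > crit:  # checked last, so critical overrides warning
        output = 'Critical : users are too high %s' % nb_users
        return_code = 2
    return return_code, output


print(evaluate_users(3, warn=5, crit=10))   # (0, 'Check OK')
print(evaluate_users(7, warn=5, crit=10))   # (1, 'Warning : users are too high 7')
print(evaluate_users(12, warn=5, crit=10))  # (2, 'Critical : users are too high 12')
```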
OK, that won't replace NRPE or check_mk, but it can be useful for log parsing with a syslog listener module or an SNMP trap parser one, for example...
… or for more advanced things like KPI
computation, or even advanced correlations
Sample : compute the avg time of N web servers
times = perfs("srv-web-*/Http", 'time')
avg_time = sum(times)/len(times)
set_value(self, output='OK', perfdata='avgtime=%dms' % avg_time, return_code=0)
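One detail worth guarding in such a KPI trigger is the case where no web server matches, since the average would then divide by zero. A minimal sketch, with a hypothetical `average_time_result` helper working on a plain list instead of the core's `perfs` call, and assuming that an UNKNOWN result (return code 3 in the Nagios plugin convention) is the wanted behaviour:

```python
from typing import Sequence, Tuple


def average_time_result(times: Sequence[float]) -> Tuple[int, str, str]:
    """Return (return_code, output, perfdata) for a list of response times."""
    if not times:
        # Assumption: with no matching server, report UNKNOWN instead of failing.
        return 3, 'UNKNOWN : no web server matched', ''
    avg_time = sum(times) / len(times)
    return 0, 'OK', 'avgtime=%dms' % avg_time


print(average_time_result([100, 200, 300]))  # (0, 'OK', 'avgtime=200ms')
print(average_time_result([]))               # (3, 'UNKNOWN : no web server matched', '')
```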
Sample : advanced correlation rule
bd_state = state("srv-bdd","Oracle")
avg_time = perf("srv-web/AvgTime", 'avgtime')
return_code = 0
output = 'Check OK'
if bd_state == 'WARNING' or avg_time > 5:
    output = 'Warning : the application is in degraded mode'
    return_code = 1
if bd_state == 'CRITICAL' or avg_time > 10:
    output = 'Critical : the application is down!'
    return_code = 2
set_value(self, output=output, return_code=return_code)
How to install Shinken?
Quite easy :
# curl -L http://install.shinken-monitoring.org | /bin/bash
Conclusion ?
● Be lazy and take coffee
● The Shinken architecture is done and powerful
● Lots of improvements in the monitoring logic compared to Nagios™®
● WebUI is great, sKonf will be great soon
● Professional support coming soon from a « Shinken Enterprise »
THANKS :)
Questions ?

%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 

OSMC 2012 | Shinken by Jean Gabès

  • 3. When IT goes bad, it can be dangerous for business
  • 7. So, to save the world's business: monitoring tools!
  • 9. For pure IT monitoring, Nagios™® has been the reference for the last 10 years ...
  • 10. … thanks to several modules
  • 11.
      ● Mod_gearman: LAN distribution
      ● LiveStatus: data access
      ● Thruk/Multisite/NagVis: real-time view
      ● PNP, Graphite: graphs
  • 12. Plugins & modularity ARE great !
  • 13. But maybe now it's not enough ?
  • 14. IT is getting bigger & bigger
  • 15. With multiple layers (physical, network, virtual, …)
  • 16. With lots of clusters everywhere
  • 18. Classic IT monitoring difficulties
      ● Too much load (plugins, notification latency, ...)
      ● Configuration is hard to maintain
      ● Distant site lost?
      ● High availability
  • 20. Yes, you can stack Nagios modules & scripts to nearly solve this ...
  • 21. … or you can just use Shinken :)
  • 22. Shinken is a full Nagios™® rewrite in Python
  • 24. Dedicated Linux Mag issue in France (July)
  • 25. With Shinken, by design:
      ● RAID-like high availability
      ● Multi-level load balancing (DMZ, LAN, inter-datacenters)
      ● Multiplatform (yes, that also means Windows, and even Android)
      ● Good speed
      ● Business rules in the core (& | Xof:)
  • 26. No more problems setting up the monitoring, so what about the admin problems of 2012 and beyond?
  • 27. Configuration simplification
      ● Escalations defined with templates
      ● Recurring downtimes are just a timeperiod to set for a host
      ● Easier service dependency definitions
  • 29. If an ESX host crashes, you don't want to receive 30+ host-down alerts for the VMs on it!
  • 30. You only want the one for the ESX host. OK, easy: just set up host dependencies :)
  • 31. But you know that VMware admins are funny guys
  • 32. They « VMotion » VMs as often as a Perl coder types $_
  • 33. So forget about flat-file host dependency configuration :)
  • 34. You can just use the Shinken VMware module. You only need check_esx3.pl for it (thanks to the OP5 guys!)
  • 36. OK, but what about reducing the « worst thing for an admin »?
  • 37. It's not a coffee/beer outage...
  • 39. Example: is a critical alert on testing so critical?
  • 40. Rule N°1 for the admins: never touch production on a Friday
  • 41. Rule N°2: production is all that matters
  • 42. Is a critical alert on testing so critical? → NO! OK, easy: notifications_enabled
  • 43. More complex, but more « real world »: a production switch breaks a testing app
  • 44. Do you need to wake the admin at 3 AM for this? No!
  • 45. The key is root problem analysis + the business impact level of the « apps »
  • 48. And what about time-based importance? For example: your payroll service (« Paid » in the configuration) is only « important » 3 days a month
  • 49. Business impact modulations
        define businessimpactmodulation{
            business_impact_modulation_name  Paid_IS_Important
            business_impact                  5
            modulation_period                PaidPeriod   ; 3 days period
        }
        define service{
            service_description          Paid
            use                          generic-service
            check_command                bp_rule!paie-srv,bdd & paie-srv,http
            host_name                    Applications
            business_impact              3
            business_impact_modulations  Paid_IS_Important
        }
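The effect of such a modulation can be sketched in a few lines of Python. This is only an illustration, not Shinken's code: the `Modulation` class, the `effective_impact` helper and the choice of days 28 to 30 for `PaidPeriod` are assumptions made for the example, and the sketch assumes that the first active modulation wins.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Modulation:
    """Hypothetical stand-in for a businessimpactmodulation object."""
    name: str
    business_impact: int
    active_days: frozenset  # days of the month on which the modulation applies


def effective_impact(base_impact, modulations, today):
    """Return the impact of the first active modulation, else the base impact."""
    for mod in modulations:
        if today.day in mod.active_days:
            return mod.business_impact
    return base_impact


# Assumption: the PaidPeriod timeperiod covers days 28 to 30 of each month.
paid_is_important = Modulation("Paid_IS_Important", 5, frozenset({28, 29, 30}))

print(effective_impact(3, [paid_is_important], date(2012, 10, 29)))
print(effective_impact(3, [paid_is_important], date(2012, 10, 10)))
```

Inside the period the service is treated with business impact 5, and it falls back to 3 for the rest of the month.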
  • 50. Strong differences in Shinken between root problems & impacts
  • 51. And between « importance » levels, more than just warning/critical
  • 52. OK for notifications, but what about UIs?
  • 53. The Shinken WebUI has its own philosophy
  • 54.
      ● Strong separation between problems & impacts
      ● Focus on (huge) business impacts
      ● Dependencies are the key, show them all!
      ● Aggregate all load-balanced elements
      ● HA by design
  • 55.
      ● Very “visual” (dependencies, alerts, graphs)
      ● HTML5 everywhere (sorry for IE6...)
      ● Only useful info is shown, the rest is hidden by default
      ● Linkable to other UIs (PNP, Graphite) as modules
      ● Even your boss will understand it
      ● And so will night shift operators!
  • 56. Two main (incompatible) user types
      ● Boss: wants to see the impacts on end-user apps (and why they are down...)
      ● Admins: want to see which IT elements are the problems
  • 57.
      ● Root problems vs. impacts view
      ● No one wants to see both
      ● Everything is sorted by business impact, of course
  • 59. What your boss will see
  • 60. What the admin will see
  • 61. And if the admin wants to show why it's so important
  • 62. Both will understand the dependency graph
  • 63. And each one can have their own dashboard, with its own widgets
  • 64.
      ● To test it: demo-shinken.web4all.fr
      ● Like Shinken, the UI is modular (e.g. PNP or Graphite inclusion)
  • 65. OK, we see what we need to see, and only that. Great.
  • 66. But the heaviest task is still there: we need to add our new hosts to it :p
  • 67. Fact : templates are GREAT !
  • 68.
      ● Shinken extends the Nagios configuration logic
      ● Services on hostgroups were good, but why add a server to the linux hostgroup if you already “link” it with the linux template?
  • 69. It can be great to have complex expressions like « Linux&Prod » for service linking
  • 70. We can simply « tag » our hosts instead of multiplying our hostgroups (linux, production tags instead of linux, production, linuxproductions groups)
      ● O(n) data versus O(n²)
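To make the tag idea concrete, here is a minimal Python sketch of how an expression such as « linux&production » could select hosts from their tags alone, with no combined hostgroup. The `matches` function and its simplified grammar ('&' and '|' only, no parentheses or negation) are assumptions for illustration, not Shinken's actual expression parser.

```python
def matches(expression, tags):
    """Evaluate a tag expression such as 'linux&production' against host tags.

    Simplified grammar (an assumption for this sketch): '|' separates
    alternatives, '&' joins tags that must all be present. Matching is
    case-insensitive.
    """
    host_tags = {t.strip().lower() for t in tags}
    for alternative in expression.split("|"):
        required = {t.strip().lower() for t in alternative.split("&")}
        if required <= host_tags:
            return True
    return False


# Hypothetical hosts, each carrying only its individual tags.
hosts = {
    "srv-lin-1": ["linux", "production"],
    "srv-lin-2": ["linux", "testing"],
    "srv-win-1": ["windows", "production"],
}

selected = sorted(name for name, tags in hosts.items()
                  if matches("Linux&Production", tags))
print(selected)
```

Each host keeps n independent tags, and any combination is expressed at service-linking time rather than as one more hostgroup per combination.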
  • 71.
      ● Too many service definitions
      ● You can't avoid host definitions, but you can try to reduce the number of services
      ● Let's move from service-centric data to host-centric data
  • 72.
      ● Which disk volumes to check is host data, not service data
      ● Which databases to check is host data, not service data
  • 73.
      ● Move configuration data back from the services to the hosts
      ● Fewer services defined, more template usage
      ● More host custom macros
  • 74.
      ● Key: the duplicate_foreach keyword in Shinken
      ● Generates a service for each « value » in a custom macro
  • 75.
        define host{
            host_name   srv-lin-1
            use         linux
            _disks      /, /var, /data
        }
        define service{
            host_name            linux
            register             0
            service_description  Disk $KEY$
            check_command        check_disk!$KEY$
            duplicate_foreach    _disks
        }
  • 76.
        define host{
            host_name   big-switch-stack
            use         switch
            _ports      Unit [1-6] Port [1-48]
        }
        define service{
            host_name            switch
            register             0
            service_description  Port $KEY$
            check_command        check_port!$KEY$
            duplicate_foreach    _ports
        }
      You will have 6*48 services with one definition!
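The expansion behind duplicate_foreach can be illustrated with a small Python sketch. It is not Shinken's implementation: `expand_keys` and `expand_macro` are hypothetical helpers that only handle comma-separated values and `[a-b]` ranges, which is enough to reproduce the two examples above.

```python
import itertools
import re

RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_keys(pattern):
    """Expand every [a-b] range in the pattern into the full list of keys."""
    ranges = [range(int(lo), int(hi) + 1) for lo, hi in RANGE.findall(pattern)]
    template = RANGE.sub("{}", pattern)
    return [template.format(*combo) for combo in itertools.product(*ranges)]


def expand_macro(value):
    """Split a comma-separated custom macro and expand the ranges in each item."""
    keys = []
    for item in value.split(","):
        keys.extend(expand_keys(item.strip()))
    return keys


disk_keys = expand_macro("/, /var, /data")
port_keys = expand_macro("Unit [1-6] Port [1-48]")

print(disk_keys)
print(len(port_keys), port_keys[0], port_keys[-1])
```

Each resulting key would replace `$KEY$` in the service description and in the check command, which is how one service definition turns into 288 services for the switch stack.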
  • 77. Fact : a good IT guy IS lazy
  • 78. Fact : admins are good IT guys!
  • 79. So an admin doesn't want to:
      ● Write plugins from scratch
      ● Manually tag their hosts
      ● Write the .cfg files for each new server flavor
      ● (in fact, all they want is for systems to run by themselves so they can go and get coffee)
  • 80. Why manually fill in tags or custom macros for your hosts, when you can write rules for it?
  • 81. Example: IP range based rule module. If the host is in an IP range, you can automatically add a property to it:
      ● If in the DMZ: it will be checked by a DMZ poller
      ● If in the testing LAN: no notifications
      ● If behind a router: add the router as parent
  • 82.
        define module{
            module_name   Ip_VLAN_10
            module_type   ip_tag
            ip_range      10.0.100.0/24
            property      parents
            value         gw_vlan_100
            method        replace
        }
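What such a rule does to a host can be sketched with Python's standard `ipaddress` module. This is an illustration only: the host dictionaries and the `apply_ip_rule` helper are hypothetical, and the behaviour of the `append` method is an assumption (only `replace` appears on the slide).

```python
import ipaddress


def apply_ip_rule(host, rule):
    """Apply one IP-range rule to a host dict and return the resulting dict."""
    address = ipaddress.ip_address(host["address"])
    if address not in ipaddress.ip_network(rule["ip_range"]):
        return host  # outside the range: nothing to do
    updated = dict(host)
    prop, value = rule["property"], rule["value"]
    if rule["method"] == "replace" or not updated.get(prop):
        updated[prop] = value
    elif rule["method"] == "append":  # assumed semantics: comma-separated list
        updated[prop] = updated[prop] + "," + value
    return updated


# Same values as the Ip_VLAN_10 module definition above.
rule = {
    "ip_range": "10.0.100.0/24",
    "property": "parents",
    "value": "gw_vlan_100",
    "method": "replace",
}

inside = apply_ip_rule({"host_name": "srv-1", "address": "10.0.100.12"}, rule)
outside = apply_ip_rule({"host_name": "srv-2", "address": "10.0.200.5"}, rule)
print(inside)
print(outside)
```

Only the host inside 10.0.100.0/24 gets `gw_vlan_100` as its parent, so nobody has to write the parent relation by hand for every new host in that VLAN.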
  • 83. But we still need to « add » hosts... Shinken discovery!
  • 84.
      ● Runners: scripts that « scan » and output 'data'
      ● Rules: read the data and generate hosts/services from it
  • 85. Ex: the nmap runner scans a host and outputs 'data'
        $ nmap_discovery_runner.py -t localhost
        localhost::isup=1
        localhost::os=linux
        localhost::osversion=2.6.x
        localhost::osvendor=linux
        localhost::macvendor=hp
        localhost::openports=22,443,3306
        localhost::fqdn=localhost
        localhost::ip=127.0.0.1
  • 86. Sample rule for the linux tag
        define discoveryrule {
            discoveryrule_name  Linux
            creation_type       host
            os                  linux   ; what we match
            +use                linux   ; what we write in the object: here, append the linux template
        }
  • 87. Sample rule for the Https tag
        define discoveryrule {
            discoveryrule_name  Https
            creation_type       host
            openports           443     ; if we got the 443 port ...
            +use                Https   ; … add the Https template
        }
  • 88. localhost : use ssh,mysql,https,linux
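How runner output and rules combine into that `use` line can be sketched as follows. This is a simplified model, not the discovery engine itself: the rule tuples, `parse_runner_output` and `apply_rules` are hypothetical, and the ssh and mysql rules are assumed by analogy with the Linux and Https rules shown above.

```python
from collections import defaultdict

# A subset of the runner output shown on the nmap slide.
RUNNER_OUTPUT = """\
localhost::isup=1
localhost::os=linux
localhost::openports=22,443,3306
"""

# Each rule: (data key, value to match, template to append).
# The ssh and mysql rules are assumptions made for this sketch.
RULES = [
    ("openports", "22", "ssh"),
    ("openports", "3306", "mysql"),
    ("openports", "443", "https"),
    ("os", "linux", "linux"),
]


def parse_runner_output(text):
    """Turn 'host::key=v1,v2' lines into {host: {key: [v1, v2]}}."""
    hosts = defaultdict(dict)
    for line in text.splitlines():
        if "::" not in line:
            continue
        name, _, pair = line.partition("::")
        key, _, value = pair.partition("=")
        hosts[name][key] = value.split(",")
    return hosts


def apply_rules(data, rules):
    """Return the templates of every rule whose value appears in the data."""
    return [template for key, wanted, template in rules
            if wanted in data.get(key, [])]


discovered = parse_runner_output(RUNNER_OUTPUT)
use_line = ",".join(apply_rules(discovered["localhost"], RULES))
print("localhost : use " + use_line)
```

Every matching rule appends its template, so the host definition is assembled from what the scan found rather than typed by hand.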
  • 89. Multi-level discovery
      ● 1 If you match some data
      ● 2 Launch a new runner
      ● 3 Apply new rules
      ● 4 GOTO 1
  • 90. Ex: Windows shares discovery
        define discoveryrun {
            discoveryrun_name     WindowsShares
            discoveryrun_command  discovery_windows_share
            # And scan only the hosts detected as windows!
            os                    windows
        }
  • 91. Result
        define host {
            host_name  win-srv
            use        windows
            _shares    Work,Public,Private
        }
  • 92. CLI launch:
        shinken-discovery -c etc/discovery.cfg --db Mongodb -m 'NMAPTARGET=localhost'
  • 93. Or better: the sKonf UI!
  • 94. sKonf:
      ● UI for easy configuration management
      ● Can use discovery or a more « classic » way
      ● Manages Shinken-specific properties
      ● (good) beta version as of now
  • 99. Let's get back from configuration to monitoring logic. Sometimes external check plugins can't help you (for example: a server with a collectd daemon)
  • 101. Such pure passive data is hard to manage and « check »
  • 102. Solution: triggers (yes, like in Zabbix)
      ● .trig files (in fact, Python source)
      ● A trigger is linked to hosts/services in the configuration
      ● It « runs » after a check (or when new passive data arrives)
      ● It can do whatever it wants in the core!
  • 103. Sample:
        # self = number of users collectd service for the host
        nb_users = perf(self, 'users')
        warn = int(get_custom(self.host, '_users_warn'))
        crit = int(get_custom(self.host, '_users_crit'))
        return_code = 0
        output = 'Check OK'
        if nb_users > warn:
            output = 'Warning : users are too high %s' % nb_users
            return_code = 1
        if nb_users > crit:
            output = 'Critical : users are too high %s' % nb_users
            return_code = 2
        set_value(self, output=output, return_code=return_code)
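The threshold logic of this trigger can be tried outside Shinken by isolating it in a plain function. The sketch below is only that: `evaluate_users` is a hypothetical helper, and the trigger functions `perf`, `get_custom` and `set_value` are deliberately left out, since they only exist inside the Shinken core.

```python
def evaluate_users(nb_users, warn, crit):
    """Return (return_code, output) using the same thresholds as the trigger."""
    if nb_users > crit:
        return 2, 'Critical : users are too high %s' % nb_users
    if nb_users > warn:
        return 1, 'Warning : users are too high %s' % nb_users
    return 0, 'Check OK'


# Example thresholds, standing in for the _users_warn / _users_crit macros.
for nb_users in (3, 7, 12):
    print(nb_users, evaluate_users(nb_users, warn=5, crit=10))
```

In a real trigger the returned pair would be passed on with `set_value(self, output=..., return_code=...)`, as in the sample above.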
  • 104. OK, that won't replace NRPE or check_mk, but it can be useful for log parsing with a syslog listener module or an SNMP trap parser, for example...
  • 105. … or for more advanced things like KPI computation, or even advanced correlations
  • 106. Sample: compute the average time of N web servers
        times = perfs("srv-web-*/Http", 'time')
        avg_time = sum(times)/len(times)
        set_value(self, output='OK', perfdata='avgtime=%dms' % avg_time, return_code=0)
  • 107. Sample: advanced correlation rule
        bd_state = state("srv-bdd", "Oracle")
        avg_time = perf("srv-web/AvgTime", 'avgtime')
        return_code = 0
        output = 'Check OK'
        if bd_state == 'WARNING' or avg_time > 5:
            output = 'Warning : the application is in degraded mode'
            return_code = 1
        if bd_state == 'CRITICAL' or avg_time > 10:
            output = 'Critical : the application is down!'
            return_code = 2
        set_value(self, output=output, return_code=return_code)
  • 108. How to install Shinken? Quite easy:
        # curl -L http://install.shinken-monitoring.org | /bin/bash
  • 109. Conclusion?
      ● Be lazy and take coffee
      ● The Shinken architecture is done and powerful
      ● Lots of improvements in the monitoring logic compared to Nagios™®
      ● The WebUI is great, sKonf will be great soon
      ● Professional support coming soon from a « Shinken Enterprise »