Sean will be discussing several approaches to notification types for real-world Nagios deployments. This will include a few methods for handling on-call rosters, sending SMS from fully virtualized data centers, and making notifications resilient by integrating with phone systems for voice notifications.
3. What is this about?
• Why resilient notifications?
– Notifications are IMPORTANT
• They can also be ANNOYING
– Monitoring a system which you rely on for
communications is good
• Relying on that system to notify you when you
have a problem is probably not best practice
8. My favorite ways to notify
• Email
• SMS
• Voice
• Service desk tickets
9. Pros for SMS
• Simple to send
• Web/email gateways
• SMS Modem / connected phone
10. Cons for SMS
• No way to know if it was received
• No way to tell if it was read
• Cannot be automatically forwarded by the
phone network
11. So what's the answer?
• There are commercial products out there
– Some can be costly
– Free products like Pushbullet and Telegram are
good, but may not suit because they depend on
internet connectivity and, as free services, have no SLA
– Apps like aNag are good but have downsides:
they need internet / VPN connectivity and the app
sometimes has issues
12. I love voice
• You can forward a mobile phone / IP
phone
• You can listen to a voice message in a car
(well in Australia you can)
• You can acknowledge with voice or
keypad
20. Use multiple methods
• Combine multiple methods to achieve a
solution that you KNOW will deliver the
notification when you need it
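One way to combine methods is a fallback chain that keeps trying until a channel confirms delivery. The following is only a sketch: the three sender functions are hypothetical stand-ins for a real email relay, SMS gateway and phone system dial-out, and "confirmed" is assumed to mean the on-call person acknowledged the message.

```python
from typing import Callable, Iterable, Optional, Tuple

# A notifier takes the message text and returns True only when delivery
# was confirmed (for example, the callee acknowledged on the keypad).
Notifier = Callable[[str], bool]


def notify_with_fallback(
    message: str, methods: Iterable[Tuple[str, Notifier]]
) -> Optional[str]:
    """Try each method in order; return the name of the first that confirms."""
    for name, send in methods:
        try:
            if send(message):
                return name
        except Exception:
            # A broken channel must not stop the remaining channels.
            continue
    return None


# Hypothetical stand-ins for real senders.
def send_email(message: str) -> bool:
    raise ConnectionError("mail relay unreachable")


def send_sms(message: str) -> bool:
    return False  # SMS gives no delivery confirmation, so treat as unconfirmed


def send_voice(message: str) -> bool:
    return True  # pretend the on-call person acknowledged the call
```

Calling notify_with_fallback("CRITICAL: db01 is down", [("email", send_email), ("sms", send_sms), ("voice", send_voice)]) works through the list in order and only stops at a channel that reports a confirmed delivery.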
21. Problem: on call rosters
• Created by people for people, so not
predictable – if your people never swap
shifts, your employees are robots
22. Escalate notifications
• Nagios has built-in support for notification
escalations
• Avoid using the same communications
method for the escalation
• Can be used to escalate from one method
to another for the same contact
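The idea of escalating the same contact from one method to another can be sketched as a small model. It is loosely based on the first_notification and last_notification ranges used in Nagios escalation definitions, where 0 as the last notification means no upper limit. The contact name, method labels and ranges below are illustrative assumptions, not values from the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Escalation:
    first_notification: int
    last_notification: int  # 0 means "no upper limit", as in Nagios
    contact: str            # hypothetical contact name
    method: str             # hypothetical label for the notification command

    def covers(self, number: int) -> bool:
        """True if this escalation applies to the given notification number."""
        if number < self.first_notification:
            return False
        return self.last_notification == 0 or number <= self.last_notification


def targets_for(
    number: int, default: Escalation, rules: List[Escalation]
) -> List[Escalation]:
    """Return the escalations covering this notification, else the default."""
    matching = [rule for rule in rules if rule.covers(number)]
    return matching or [default]


# Same contact throughout, but a different method as the problem drags on.
DEFAULT = Escalation(1, 0, "oncall", "email")
RULES = [
    Escalation(3, 5, "oncall", "sms"),
    Escalation(6, 0, "oncall", "voice"),
]
```

With these assumed ranges, the first two notifications go out by email, notifications three to five by SMS, and everything from the sixth onward by voice.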
23. Problem: Excessive Notifications
• Operators ignore more notifications than
they action
• Important notifications are missed
• Do not notify on unreachable
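Suppressing UNREACHABLE is normally done in the Nagios object configuration by leaving `u` out of the host notification options. Where that is not possible, a notification script can apply the same filter itself. This sketch assumes environment macros are enabled, so the host state is available as NAGIOS_HOSTSTATE; the function name is hypothetical.

```python
from typing import Mapping


def should_notify(env: Mapping[str, str]) -> bool:
    """Suppress notifications for hosts Nagios reports as UNREACHABLE."""
    return env.get("NAGIOS_HOSTSTATE", "").upper() != "UNREACHABLE"
```

A notification script would call should_notify(os.environ) before sending anything and exit quietly when it returns False.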
24. Solutions?
• Most successful solutions require a third
party web interface to select who is on call
– Works well
– Requires access to the web interface
25. My preferred way?
• Use the corporate phone system
– System speed calls allow you to divert
numbers
– Remote hot desk allows changing diversions
securely without a computer
– Phone system has redundancy / resiliency
built in from its original design
26. Right tool for the job
• Pick the right notification type
– Do we want to be interrupted with a loud
notification type for issues that can happen
frequently but only matter if they are sustained?
– NO, probably not
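The "only matters if sustained" rule can be sketched as a choice of notification type based on how long the problem has lasted. The 15 minute threshold and the "loud" / "quiet" labels are assumptions for illustration; in Nagios itself a similar effect is usually achieved with check retries or a delayed first notification.

```python
from datetime import timedelta


def pick_notification_type(
    problem_age: timedelta,
    sustained_after: timedelta = timedelta(minutes=15),  # assumed threshold
) -> str:
    """Return "quiet" for short-lived problems and "loud" once they persist."""
    if problem_age >= sustained_after:
        return "loud"   # e.g. a voice call
    return "quiet"      # e.g. email or a service desk ticket
```

A brief blip therefore lands in the inbox or the ticket queue, and the phone only rings once the problem has outlasted the threshold.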
27. Don’t rely just on “smart” phones
• If using SMS, voice, email or apps, it is
easy to forget that the person who is on
call is probably going to access all four on
the same device.
28. Resources
Here’s what you do:
Read about notification escalations here http://tinyurl.com/qx68m65
Read about status and reachability here http://tinyurl.com/qfoyrue
Want scripts? Or want to share yours?
http://exchange.nagios.org/