OwnIT Through Proactive Zenoss Monitoring

© 2016 All Rights Reserved CONFIDENTIAL #GALAXZ16
OwnIT Through Proactive Monitoring
Quis custodiet ipsos custodes?
Who will monitor the monitors themselves?
@jstanley232
Jason Stanley
Enterprise Monitoring Engineer @Secure_24
jstanley734@gmail.com
Github.com/jstanley23
Zenoss Community Forums/IRC: jstanley

Secure-24 has 15 years of experience delivering managed IT operations, application
hosting and cloud services to enterprises worldwide. We manage SAP, Oracle, Hyperion,
JD Edwards, and other mission-critical applications across all industries and for
businesses of every size. Our industry-leading client satisfaction rates result from
lowering IT operational costs and our relentless focus on superior service and support.

Zenoss is the primary monitoring tool
for infrastructure, client devices and
applications.

Replaced other
monitoring platforms
with Zenoss
• Oracle Enterprise Manager
• Solarwinds
• Nimsoft
• Nagios
• Tidal

Primary Zenoss environment
• Zenoss 4.2.5 RPS 538
• 100+ ZenPacks
• 9k+ devices
• 1.7m+ data points
• Dedicated servers
• 3 dedicated Hubs
• 16 dedicated multi-tenant collectors
• 9 customer dedicated collectors

Monitoring from within
Zenoss provides substantial built-in self-monitoring, plus additional ZenPacks.
 Zenoss Daemons
› Processes
› Heartbeats
 Zenoss Toolbox Scans
 Tracebacks and exceptions
 ZenPacks
› ZenPacks.zenoss.MySqlMonitor
› ZenPacks.zenoss.RabbitMQ
› ZenPacks.zenoss.Memcached

Daemon monitoring
Built-in Methods
 Process
› Most daemon processes are already added
› Polls every 3 minutes
› Monitors CPU, memory, and count
 /Status/Heartbeat
› Takes longer to raise an event than process monitoring does
› Can signify issues with the daemon or hub
 Note:
› Verify new daemons are added to processes
› Heartbeats only cover the same Zenoss instance

Zenoss ZenPacks
 ZenPacks.zenoss.MySqlMonitor *
› Critical to monitor up/down
› Primary internal use is graphs and trending
 ZenPacks.zenoss.RabbitMQ *
› Critical to monitor up/down
› Primary internal use is graphs and trending
 ZenPacks.zenoss.Memcached
› Can be monitored internally for up/down
› Can cause a negative user experience if down
* Should be monitored externally

Zenoss Toolbox Scans and Exception Events
https://github.com/zenoss/zenoss.toolbox
 Set up scans in crontab to set and forget
 All toolbox scans now create events!
 Warning:
› Do not run zencatalogscan -f until
zenrelationscan and findposkeyerror have both
come back clean.
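The ordering rule in this warning can be enforced by a small wrapper rather than left to memory. The following is a minimal sketch and not part of zenoss.toolbox: it assumes each scan exits non-zero when it finds problems (worth verifying against your toolbox version), and the names `safe_catalog_fix` and `run_command` are hypothetical.

```python
import subprocess

# Scans that must come back clean before a catalog fix is attempted.
PREREQUISITES = [["zenrelationscan"], ["findposkeyerror"]]
CATALOG_FIX = ["zencatalogscan", "-f"]


def run_command(argv):
    """Run a command and return its exit code."""
    return subprocess.call(argv)


def safe_catalog_fix(run=run_command):
    """Run zencatalogscan -f only if every prerequisite scan is clean.

    Assumption: a non-zero exit code means the scan found problems.
    Returns True if the catalog fix was started, False if it was skipped.
    """
    for argv in PREREQUISITES:
        if run(argv) != 0:
            print("skipping catalog fix: %s was not clean" % argv[0])
            return False
    run(CATALOG_FIX)
    return True
```

A cron job could call `safe_catalog_fix()` instead of invoking the three scans separately, so the fix never runs against a database with unresolved relation or POSKey errors.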
Exceptions and tracebacks
 Modelers, datasources and templates can
error out
 Check your events for sneaky errors:
› Message: traceback
› Message: exception
 TALES exceptions come in under the
hub's full name as a single event.
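The keyword search for sneaky errors can also be applied to events that have already been exported or fetched. This is a minimal sketch; it assumes events are available as a list of dicts with a "message" key, and the function name is hypothetical.

```python
import re

# Case-insensitive match on the two keywords suggested above.
PATTERN = re.compile(r"traceback|exception", re.IGNORECASE)


def find_sneaky_errors(events):
    """Return events whose message mentions a traceback or an exception.

    `events` is assumed to be a list of dicts with a "message" key;
    events without a message are ignored.
    """
    return [e for e in events if PATTERN.search(e.get("message") or "")]
```

The same two patterns work directly as a message filter in the event console; the code is only useful when the events are being post-processed outside Zenoss.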

Event Monitoring
Event flow in Zenoss is one of the more important
aspects of the tool. Without events, you will not be
alerted to any issues in your environments.
For this reason, we place special emphasis on monitoring
it.

Monitoring from afar
We focus on monitoring Zenoss event flow from a remote
location, so that we still get alerted if Zenoss itself goes down.
 Zenoss Webserver
 RabbitMQ
› rawevents
› zenevents
› signal
 Zeneventserver
 Synthetic Event Checks
› zeneventd
 Event processing and transforms
› Zeneventserver
 Changing event state

Web (HTTP) checks
Both zenwebserver and zeneventserver can be
monitored with a simple HTTP check.
 zenwebserver
› HTTP check on port 8080 against the Dashboard URL, matching the response with a regex
 /zport/dmd/Dashboard
 zeneventserver
› HTTP check on port 8084 against the zeneventserver API
 /zeneventserver/api/1.0/events
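The two HTTP checks above can be sketched in a few lines of Python 3. This is an illustration under stated assumptions, not the script referenced in this deck: the host name `zenoss.example.com` and the regex `Zenoss` are placeholders, and authentication is left out (the Dashboard URL may redirect to a login page, which is one reason to match on content rather than status alone).

```python
import re
from urllib.request import urlopen

HOST = "zenoss.example.com"  # hypothetical host name

# (name, URL, optional regex the response body must match)
CHECKS = [
    ("zenwebserver", "http://%s:8080/zport/dmd/Dashboard" % HOST, r"Zenoss"),
    ("zeneventserver", "http://%s:8084/zeneventserver/api/1.0/events" % HOST, None),
]


def _fetch(url, timeout):
    """Fetch a URL and return (status code, decoded body)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", "replace")


def http_check(url, pattern=None, timeout=10, fetch=None):
    """Return True if the URL answers 200 and the body matches `pattern`.

    Connection errors, timeouts, and HTTP error statuses all count as a
    failed check. `fetch` can be replaced for testing.
    """
    fetch = fetch or _fetch
    try:
        status, body = fetch(url, timeout)
    except OSError:  # URLError, HTTPError, and socket timeouts
        return False
    if status != 200:
        return False
    return pattern is None or re.search(pattern, body) is not None


def run_checks(fetch=None):
    """Run every check and return {name: passed}."""
    return {name: http_check(url, pat, fetch=fetch) for name, url, pat in CHECKS}
```

Running `run_checks()` from a host outside the Zenoss environment keeps the check useful even when Zenoss itself is down.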

RabbitMQ
It is very important to monitor RabbitMQ
queues. If something happens to
RabbitMQ, event processing in
Zenoss is compromised.
For this reason, we monitor the
queues remotely, alerting on anything
above a certain threshold.*
* This threshold should be set depending on your environment.
 We consider three queues the most important:
› rawevents
 Where raw events from the collectors are sent
› zenevents
 After events are processed by zeneventd, they are sent here for
zeneventserver to pick up
› signal
 Events that match a trigger and require a notification
are sent here for zenactiond to process
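The threshold comparison for these three queues can be sketched as follows. The limits are hypothetical and must be tuned per environment. The input shape (a list of dicts with "name" and "messages") is what the RabbitMQ management plugin returns from its `/api/queues` endpoint; fetching that list is left out here. Matching by suffix is an assumption that the real queue names end with the short names used above.

```python
# Hypothetical per-queue limits; set these for your own environment.
THRESHOLDS = {"rawevents": 1000, "zenevents": 1000, "signal": 500}


def queues_over_threshold(queues, thresholds=THRESHOLDS):
    """Return {queue_name: depth} for every watched queue above its limit.

    `queues` is a list of dicts with "name" and "messages" keys.
    Queues that match none of the watched names are ignored.
    """
    alerts = {}
    for queue in queues:
        depth = queue.get("messages", 0)
        for short_name, limit in thresholds.items():
            if queue["name"].endswith(short_name) and depth > limit:
                alerts[queue["name"]] = depth
    return alerts
```

An empty result means all three queues are draining normally; any entry in the result is a candidate for an alert from the remote monitoring host.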

Synthetic Checks
 Pre-existing event check
› Checks the functionality of zeneventserver
by
 Acknowledging a pre-existing event *
 Un-acknowledging a pre-existing event *
› Verifies the following is up and running:
 ZenDS
 zeneventserver
 zenwebserver
› Uses only a single event; if that event is
closed, a new one must be created
• The script can create the event for you and provide the event
ID to use
 New event check
› Checks the Zenoss event process by:
 Opening a new event
 Finding new event
 Verifying event was modified by transform
 Closing event
 Verifying event was closed
› Verifies the following is up and running:
 ZenDS
 zenwebserver
 zeneventd
 zeneventserver
› Creates a new event each and every time
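The five steps of the new event check can be sketched as a single function. This is not the script referenced in this deck. The `client` object is a hypothetical wrapper around whatever API access you use, and the "transformed" and "state" fields are placeholders for however your transform marks the event and however closure is reported.

```python
import time
import uuid


def new_event_check(client, retries=5, delay=1.0, sleep=time.sleep):
    """Walk one synthetic event through the pipeline.

    Returns (True, None) on success or (False, failed_step) on failure.

    `client` is a hypothetical object exposing:
      send_event(summary)  -> None
      find_event(summary)  -> dict with "evid", "transformed", "state", or None
      close_event(evid)    -> None
    """
    summary = "synthetic-check-%s" % uuid.uuid4().hex  # unique per run

    # 1. Open a new event.
    client.send_event(summary)

    # 2. Find the new event, retrying while it moves through the queues.
    event = None
    for _ in range(retries):
        event = client.find_event(summary)
        if event is not None:
            break
        sleep(delay)
    if event is None:
        return False, "find"

    # 3. Verify the event was modified by the transform.
    if not event.get("transformed"):
        return False, "transform"

    # 4. Close the event.
    client.close_event(event["evid"])

    # 5. Verify the event was closed.
    closed = client.find_event(summary)
    if closed is None or closed.get("state") != "closed":
        return False, "verify-closed"
    return True, None
```

Returning the name of the failed step helps narrow down which component is at fault: a failure at "find" points at event intake and processing, while "transform" points at zeneventd specifically.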

Takeaways
The script we use for monitoring, along with documentation on
how to use it, can be found on the community wiki or on GitHub:
http://wiki.zenoss.org/Monitoring_Zenoss
https://github.com/jstanley23/MonitoringZenoss

Question me this
