Monitoring Netezza database 
with Nagios 
Frank Pantaleo 
fpantaleo@brightlightconsulting.com
Introduction & Agenda 
• A couple of W’s 
• State of monitoring Netezza 
• Monitoring Netezza with Nagios 
• Future direction
A couple of W’s - Why 
Why are we monitoring Netezza ? 
• How much $ does your business lose when IT is down ? 
• 7 million each year from IT downtime 
• Gartner (2005) pegs the hourly cost of downtime for computer networks at 
$42,000 
• A data center outage by itself can cost an average of $5,600 per minute 
• Outages damage a company's reputation 
• Now take this and bring it to a Cloud level - For every hour it is not up and 
running, Amazon.com takes a hit of almost $5 million 
• Allows you to be more proactive 
• Allows upper management to plan for DB growth (including 
secondary effects, e.g. DR, tape, disk for backup)
A Couple of W’s - What 
What are we looking for in a monitor ? 
• Universal monitoring 
• Efficient Alert Notifications (also allows your IT staff to tell 
each other when something is being worked on) 
• Web Dashboard (one stop shopping!) 
• Issue Escalation (separate lists for warning, high) 
• Distributed Monitoring and Scalability (high availability)
A couple of W’s - What 
What are we looking for in a monitor ? (cont) 
• Reporting (how many times was this service down ?) 
• External Application Integration (Can I enable my current 
applications to allow for early issue notification) 
• Open source solution
State of Netezza monitoring 
Monitoring systems available for Netezza 
• Netezza event monitor – comes stock with tool 
• Netezza portal – comes stock with tool 
• Commercial offerings – Brightlight Consulting Observation Deck
State of Netezza monitoring 
Netezza comes with 34 alerts 
Alert actions have limited responses 
• Email 
• Script execution 
• In version 7.1, can automatically create a support ticket 
• Configuration can be done through NPS client or command line interface on 
Netezza server
State of Netezza monitoring 
Examples of Netezza 7.1 stock sample alerts 
• Disk Full 
• SPU Full 
• Hardware Failed 
• Hardware needs attention 
• Hardware restarted 
• Hardware service requested 
• Heat threshold exceeded 
• History capture event 
• History load event 
• HwvoltageFaultAuto 
• NPSNoLongerOnline 
• RegenFault 
• RunAwayQuery 
• No custom events allowed
State of Netezza monitoring 
Netezza Portal 
• Face on glass monitoring 
• Custom queries can be added to the monitor 
• All queries can be seen as numeric or graphic 
• No alerting 
• Tool can also be used for maintaining database objects, 
users, events, and sessions 
• If you are using LDAP, the portal can't take advantage of it. 
You log in to the portal with your DB username/password 
instead
Netezza monitoring using Nagios 
What are we monitoring in Netezza ? 
• Table Locks by non-EDW statements during EDW batch 
cycle 
• User queries exceeding 1 hour (90% of the time these are 
poorly formed queries) 
• User queries during EDW batch cycle (depends on SLA) 
• Age of backup older than SLA 
• LDAP server available for SSO
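A check such as "age of backup older than SLA" reduces to comparing a timestamp against thresholds. The sketch below is in Python, one of the alternative sensor languages these slides mention. The function name, the 24 and 48 hour thresholds, and the way the last backup time is obtained are all assumptions made for illustration; none of them come from the slides.

```python
from datetime import datetime, timedelta

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios status codes

def backup_age_status(last_backup, now, warn_hours=24, crit_hours=48):
    """Return (nagios_status, text) for the age of the last backup.

    warn_hours and crit_hours are hypothetical SLA thresholds.
    """
    age_hours = (now - last_backup).total_seconds() / 3600.0
    if age_hours >= crit_hours:
        return CRITICAL, "CRITICAL - last backup is %.1f hours old" % age_hours
    if age_hours >= warn_hours:
        return WARNING, "WARNING - last backup is %.1f hours old" % age_hours
    return OK, "OK - last backup is %.1f hours old" % age_hours
```

Passing `now` in explicitly keeps the logic easy to test without touching the clock.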
Netezza monitoring using Nagios 
What are we monitoring in Netezza ? (cont) 
• SPU space unbalanced (generally a side effect of poor 
distribution) 
• State of EDW e.g. loading files, file processing complete 
• Late arrival of files preventing the EDW from meeting SLAs
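One possible way to turn "SPU space unbalanced" into a check is to compare the fullest SPU with the average across all SPUs. This is a Python sketch under stated assumptions: the ratio metric, the 1.2 and 1.5 thresholds, and the input format (a mapping of SPU id to space used) are hypothetical, and the query that would produce the per-SPU numbers is not shown.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios status codes

def spu_skew_status(used_by_spu, warn_ratio=1.2, crit_ratio=1.5):
    """Compare the fullest SPU with the average across all SPUs.

    used_by_spu maps a SPU id to space used (any consistent unit).
    The ratio thresholds are hypothetical and should be tuned per site.
    """
    if not used_by_spu:
        return UNKNOWN, "UNKNOWN - no SPU usage data"
    average = sum(used_by_spu.values()) / float(len(used_by_spu))
    if average == 0:
        return OK, "OK - SPUs are empty"
    worst_id = max(used_by_spu, key=used_by_spu.get)
    ratio = used_by_spu[worst_id] / average
    text = "SPU %s holds %.2fx the average" % (worst_id, ratio)
    if ratio >= crit_ratio:
        return CRITICAL, "CRITICAL - " + text
    if ratio >= warn_ratio:
        return WARNING, "WARNING - " + text
    return OK, "OK - " + text
```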
Netezza monitoring using Nagios 
Architecture options with Nagios 
• Sensors live on Nagios monitoring server 
• Sensors live on Database server and are controlled by 
NRPE. This is what we went with, based on customer 
security rules. 
• Scripting language is Perl, but it could be any language 
that can query the database and handle the responses, 
such as Bash, Java, Python, or C.
Netezza monitoring using Nagios 
Architecture options with Nagios (cont) 
• Active – NRPE is an intermediary for running scripts and 
bringing results back to Nagios. 
• Passive – SNMP is an option, but the currently provided alerts 
need to be tied into an SNMP agent that reports status. Netezza 
doesn't raise SNMP alerts out of the box.
Netezza monitoring using Nagios 
Passive alerts require SNMP trap software 
• Nagios server must be enabled to receive alerts 
– http://hyper-choi.blogspot.com/2012/12/nagios-snmp-trap-part-1-snmptt.html 
– http://hyper-choi.blogspot.com/2013/01/nagios-snmp-trap-part-2-configuration.html 
• Once Nagios is enabled, Netezza events must be changed 
to make Nagios aware there is an issue 
– http://netezzaadmin.wordpress.com/2011/10/07/using-netezzas-event-manager-to-generate-snmp-traps
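Whatever trap handler is used, a passive result reaches Nagios as an external command written to its command file. The sketch below formats such a line in Python. The `PROCESS_SERVICE_CHECK_RESULT` layout is the standard Nagios one. The function name, the service name in the usage example, and the choice to build the line in Python rather than in the trap handler's own script are assumptions for illustration.

```python
def passive_result_line(timestamp, host, service, status, text):
    """Format a Nagios PROCESS_SERVICE_CHECK_RESULT external command.

    Layout: [time] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    if status not in (0, 1, 2, 3):
        raise ValueError("status must be 0, 1, 2 or 3")
    # Keep the output on one line; semicolons are replaced defensively
    # because they are the field separator of the command.
    clean = text.strip().replace(";", ",").replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s" % (
        timestamp, host, service, status, clean)
```

For example, `passive_result_line(1413374400, "proddb", "NZ Hardware", 2, "Hardware failed")` yields a line that could be appended to the Nagios command file. Here "NZ Hardware" is a made-up service description that would have to match a service defined on the Nagios server.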
Netezza monitoring using Nagios 
Passive alerts architecture
Netezza monitoring using Nagios 
Active alerts require NRPE to be installed 
• Checking is done using shell script and Perl 
• Perl DBI ODBC 
– Downside is you have to have an exposed user/password. In 
this case it was against IT policy, so I stopped using this option. 
– If we use this, though, all agents could live on the Nagios server 
• Perl package supplied by Netezza 
– Downside is this is the equivalent of admin, so you can do anything 
– Upside is no username/password configuration 
– Agents must live on the database server
Netezza monitoring using Nagios 
Active Alert architecture
Netezza monitoring using Nagios 
Active Alert agent writing (interface requirements) 
• MUST set a return code, e.g. 
• # 0 OK 
• # 1 WARNING 
• # 2 CRITICAL 
• # 3 UNKNOWN 
• Nagios dashboard displays associated text 
if (some logic here) { 
    print "Ok\n"; 
    exit 0;   # OK 
} else { 
    print "Error please look at tablexyz\n"; 
    exit 2;   # CRITICAL 
}
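The interface contract above (one line of text on stdout, plus an exit code from 0 to 3) can be sketched as a small Python sensor. This is a minimal illustration, not the author's script: the function names and the `warn` and `crit` thresholds are hypothetical, and the count is read from the command line only so the sketch stays self-contained. A real sensor would get it from a database query.

```python
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios status codes

def long_query_status(long_running, warn=1, crit=5):
    """Map a count of long-running queries to (status, text).

    warn and crit are hypothetical thresholds.
    """
    if long_running is None or long_running < 0:
        return UNKNOWN, "UNKNOWN - could not count long-running queries"
    if long_running >= crit:
        return CRITICAL, "CRITICAL - %d queries running over 1 hour" % long_running
    if long_running >= warn:
        return WARNING, "WARNING - %d queries running over 1 hour" % long_running
    return OK, "OK - no queries running over 1 hour"

def main(argv):
    # Stand-in for the database query: take the count from argv[1].
    try:
        count = int(argv[1])
    except (IndexError, ValueError):
        count = None
    status, text = long_query_status(count)
    print(text)      # the text Nagios shows on the dashboard
    return status    # the caller turns this into the exit code
```

A real sensor would end with `sys.exit(main(sys.argv))`, so that the returned status becomes the process exit code that NRPE passes back to Nagios.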
Netezza monitoring using Nagios 
Active alerts - NRPE configuration on Netezza server 
• If using the Perl package, commands must run as the nz 
user, so /etc/nagios/nrpe.cfg must use the following 
– nrpe_user=nz 
– nrpe_group=nz 
• Once a sensor (Perl script) is written and tested, it 
must be added to the nrpe.cfg file. 
– command[check_nz_longqry]=/export/home/nz/scripts/check_nz_longqry.pl 
• Best practice - Request /etc/nagios/nrpe.cfg be open 
to read/write from nz user
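Before pointing Nagios at a new sensor, it can help to confirm that the `command[...]` entry is really in nrpe.cfg. The Python sketch below extracts those entries from the file's text. It is an illustrative helper, not part of NRPE: the function name and the regular expression are assumptions, and it only handles the simple `command[name]=path` form shown on this slide.

```python
import re

# Matches lines such as: command[check_nz_longqry]=/path/to/script.pl
COMMAND_RE = re.compile(r"^\s*command\[([^\]]+)\]\s*=\s*(.+?)\s*$")

def nrpe_commands(cfg_text):
    """Return {command_name: command_line} for command[...] entries."""
    commands = {}
    for line in cfg_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out entries
        match = COMMAND_RE.match(line)
        if match:
            commands[match.group(1)] = match.group(2)
    return commands
```

Checking `"check_nz_longqry" in nrpe_commands(text)` before enabling the service avoids chasing a check that fails only because the command was never registered.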
Netezza monitoring using Nagios 
Active alerts - How does NRPE work on Nagios 
server ? 
define command{ 
    command_name    check_nrpe 
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t 300 
} 
define service{ 
    use                     generic-service 
    host_name               proddb 
    service_description     NZSQL Long query 
    check_command           check_nrpe!check_nz_longqry! 
    notifications_enabled   0 
}
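To see how the service's `check_command` feeds the command definition, the Python sketch below mimics the substitution in a simplified form. The text after each `!` becomes `$ARG1$`, `$ARG2$`, and so on, and the host and plugin-directory macros are filled in. It is only an illustration of the idea, not Nagios's actual macro engine, and the host address and plugin path in the usage note are made-up values.

```python
def expand_check_command(check_command, command_line, host_address, user1):
    """Simplified illustration of Nagios macro substitution.

    check_command: e.g. "check_nrpe!check_nz_longqry!"
    command_line:  the command_line from the matching 'define command'
    """
    parts = check_command.split("!")
    command_name, args = parts[0], parts[1:]
    expanded = command_line.replace("$USER1$", user1)
    expanded = expanded.replace("$HOSTADDRESS$", host_address)
    # $ARG1$, $ARG2$, ... come from the "!"-separated fields, in order.
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace("$ARG%d$" % index, value)
    return command_name, expanded
```

With the definitions above, a hypothetical address of 10.0.0.5 and a plugin directory of /usr/local/nagios/libexec, the expanded line is `/usr/local/nagios/libexec/check_nrpe -H 10.0.0.5 -c check_nz_longqry -t 300`. The `-c` value is the command name that NRPE looks up in nrpe.cfg on the Netezza host.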
Netezza monitoring using Nagios 
Active Alerts - Perl programming using SQL.pm package 
• Invocation 
    use lib "/nz/kit/share/perl"; 
    use nz::SQL; 
• Package can only be used by the nz owner 
• NO username & password 
    my ($KITDIR, $DATADIR); 
    $DATADIR = "/nz/data.1.0"; 
    $KITDIR = "/nz/kit"; 
    nz::SQL::config(KITDIR => $KITDIR, DATADIR => $DATADIR); 
• Best practice - use alarm timers around SQL statements 
• Handy variables after each SQL execution: $qresp->{nrows}, ncols, 
colid, qtype
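The "alarm timers around SQL statements" advice keeps a hung query from hanging the sensor. The Python sketch below shows the same idea with the standard `signal` module. It is an analogue of the Perl `alarm` pattern, not the author's code: the names are hypothetical, and it assumes a Unix host and a call from the main thread, since `SIGALRM` is not available otherwise.

```python
import signal

UNKNOWN = 3  # Nagios status code

class CheckTimeout(Exception):
    """Raised by the SIGALRM handler when the timer fires."""

def _on_alarm(signum, frame):
    raise CheckTimeout()

def run_with_alarm(func, seconds):
    """Run func() under a timer; func must return (status, text)."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)    # arm the timer
    try:
        return func()
    except CheckTimeout:
        return UNKNOWN, "UNKNOWN - query timed out after %ss" % seconds
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)      # cancel the timer
        signal.signal(signal.SIGALRM, previous)      # restore old handler
```

The timeout should stay below the `-t 300` given to check_nrpe on the Nagios server, so that the sensor reports UNKNOWN itself rather than being cut off by NRPE.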
Netezza monitoring using Nagios 
Perl programming using SQL.pm package (continued) 
• Interface example: nz::SQL::query($dbname, $sql). Unlike DBI, the database 
must be named every time you query. 
• Result sets are not active in the database (unlike DBI); they are held in Perl memory 
• Result set traversal is done using Perl foreach, e.g. 
    foreach my $row (@{$qresp->{data}}) { 
        ($blocker_username,$blocker_sql,$blockee_username,$blockee_sql) = 
            @$row; 
    } 
• Best practice: if you can, avoid dealing with the result set and deal only with counts 
(e.g. nrows). This is the most efficient use, especially for a Nagios alert check 
that is going to run several times a day.
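The "counts first" advice can be sketched in Python with a dictionary shaped like the `$qresp` response described above (`nrows` plus a list of rows under `data`). This mirrors the structure only for illustration. The real object comes from the Perl nz::SQL package, and the function name and message wording here are assumptions.

```python
OK, CRITICAL = 0, 2  # Nagios status codes

def lock_status(qresp):
    """qresp mimics the slide's response: {'nrows': int, 'data': [rows]}."""
    if qresp["nrows"] == 0:
        return OK, "OK - no blocking locks"   # decided on the count alone
    # Touch the rows only when something has to be reported.
    blocker_username, blocker_sql, blockee_username, blockee_sql = qresp["data"][0]
    return CRITICAL, "CRITICAL - %d lock(s), e.g. %s blocks %s" % (
        qresp["nrows"], blocker_username, blockee_username)
```

In the common healthy case the check returns after a single integer comparison, which matters for a check that runs many times a day.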
Future direction 
• Data graphing 
• Expand areas that we are monitoring for in Netezza 
• Integrate into a product offering (Observation Deck) from 
Brightlight that collects NZHIST for customer 
• Predict when we are going to outgrow our current 
processing and database needs
Conclusion 
• Key takeaways 
– Using Nagios can give your company an extensible 
event monitor. Understanding the Nagios architecture is 
important for a stable, working monitoring setup. Once 
you understand the architecture, writing an agent is 
trivial. If you can write SQL to detect an event, then you can 
write an agent. 
• Other reading materials 
– The URLs provided in this document have the recipe for how to 
set up Nagios, SNMP traps, and Netezza. Please visit those 
sites to get that info.
Questions? 
Any questions? 
Thanks!
Reference 
http://www.thegeekstuff.com/2010/08/monitoring-software-criteria/ 
http://exchange.nagios.org/directory/Tutorials/Install-and-Configure-NRPE-in-CentOS-and-Red-Hat/details 
http://www-01.ibm.com/support/knowledgecenter/SSULQD_7.1.0/com.ibm.nz.portal.doc/c_portal_welcome.html 
http://www.networkworld.com/article/2329877/infrastructure-management/how-to-quantify-downtime.html
The End 
Frank Pantaleo 
fpantaleo@brightlightconsulting.com

Nagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza Databases

  • 1. Monitoring Netezza database with Nagios Frank Pantaleo fpantaleo@brightlightconsulting.com
  • 2. Introduction & Agenda • A couple of W’s • State of monitoring Netezza • Monitoring Netezza with Nagios • Future direction
  • 3. A couple of W’s - Why Why are we monitoring Netezza ? • How much $ does your business lose when IT is down ? • 7 million each year from IT downtime • Gartner (2005) pegs the hourly cost of downtime for computer networks at $42,000 • A data center outage by itself can cost an average of $5,600 per minute • Outages damage their reputation • Now take this and bring it to a Cloud level - For every hour it is not up and running, Amazon.com takes a hit of almost $5 million • Allows you to be more proactive • Allow upper management to plan for DB growth (includes secondary effects e.g. DR, tape, disk for backup)
  • 4. A Couple of W’s - What What are we looking for in a monitor ? • Universal monitoring • Efficient Alert Notifications (also allows your IT staff to tell each other when something is being worked on) • Web Dashboard (one stop shopping!) • Issue Escalation (separate lists for warning, high) • Distributed Monitoring and Scalability (high availability)
  • 5. A couple of W’s - What What are we looking for in a monitor ? (cont) • Reporting (how many times was this service down ?) • External Application Integration (Can I enable my current applications to allow for early issue notification) • Open source solution
  • 6. State of Netezza monitoring Monitoring systems available for Netezza • Netezza event monitor – comes stock with tool • Netezza portal – comes stock with tool • Commercial offerings – Brightlight Consulting Observation Deck
  • 7. State of Netezza monitoring Netezza comes with 34 alerts Alerts actions have limited responses • Email • Script execution • In Version 7.1 can auto create support ticket • Configuration can be done through NPS client or command line interface on Netezza server
  • 8. State of Netezza monitoring Examples of Netezza 7.1 stock sample alerts • Disk Full • SPU Full • Hardware Failed • Hardware needs attention • Hardware restarted • Hardware service requested • Heat threshold exceeded • History capture event • History load event • HwvoltageFaultAuto • NPSNoLongerOnline • RegenFault • RunAwayQuery • No custom events allowed
  • 9. State of Netezza monitoring Netezza Portal • Face on glass monitoring • Custom queries can be added to the monitor • All queries can be seen as numeric or graphic • No alerting • Tool can also be used for maintaining database objects, users, events, and sessions • If you are using LDAP, portal can’t take advantage of it. Once you login to portal though you will be using your DB username/password
  • 10. Netezza monitoring using Nagios What are we monitoring in Netezza ? • Table Locks by non-EDW statements during EDW batch cycle • User queries exceeding 1 hour (90% time poorly formed queries) • User queries during EDW batch cycle (depends on SLA) • Age of backup older than SLA • LDAP server available for SSO
  • 11. Netezza monitoring using Nagios What are we monitoring in Netezza ? (cont) • SPU space unbalanced (generally a side effect of poor distribution) • State of EDW e.g. loading files, file processing complete • Late arrival of files preventing the EDW from meeting SLA’s
  • 12. Netezza monitoring using Nagios Architecture options with Nagios • Sensors live on Nagios monitoring server • Sensors live on Database server and are controlled by NRPE. This is what we went with based on customer security rules. • Scripting language is Perl. Really could be any language that allows ability to query the database and deal with responses. There are other options such as Bash, Java, Python, and C.
  • 13. Netezza monitoring using Nagios Architecture options with Nagios (cont) • Active – NRPE is a intermediary for running scripts and bringing results back to Nagios. • Passive – SNMP is an option but current provided alerts need to be tied into a SNMP agent that reports status. Netezza doesn’t raise SNMP alerts OOB.
  • 14. Netezza monitoring using Nagios Passive alerts require snmp trap software  Nagios server must be enabled to receive alerts – http://hyper-choi.blogspot.com/2012/12/nagios-snmp-trap-part-1- snmptt.html – http://hyper-choi.blogspot.com/2013/01/nagios-snmp-trap-part-2- configuration.html  Once Nagios is enabled Netezza events must be changed to make Nagios aware there is a issue – http://netezzaadmin.wordpress.com/2011/10/07/using-netezzas-event- manager-to-generate-snmp-traps
  • 15. Netezza monitoring using Nagios Passive alerts architecture
  • 16. Netezza monitoring using Nagios Active alerts require NRPE to be installed • Checking is done using shell script and Perl • Option 1: Perl DBI with ODBC – Downside: you have to have an exposed user/password. In this case that was against IT policy, so I stopped using this option. – Upside: all agents could live on the Nagios server • Option 2: Perl package supplied by Netezza – Downside: it is the equivalent of admin, so it can do anything – Upside: no username/password configuration – Agents must live on the database server
  • 17. Netezza monitoring using Nagios Active Alert architecture
  • 18. Netezza monitoring using Nagios Active Alert agent writing (interface requirements) • MUST set a return code, e.g. 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN • Nagios dashboard displays the associated text, e.g. if (some logic here) { print "Ok\n"; } else { print "Error please look at tablexyz\n"; }
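The interface above (one line of text plus a return code) can be sketched in shell, one of the scripting options the deck mentions. This is only an illustration: the function name `check_count`, the thresholds, and the "long-running queries" wording are assumptions, and the count would come from whatever query the sensor runs.

```shell
# Nagios plugin return codes, as listed on the slide.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# check_count COUNT WARN CRIT
# Hypothetical helper: prints the text Nagios will display and returns the
# matching code. COUNT stands in for whatever the sensor measured.
check_count() {
    count=$1
    warn=$2
    crit=$3
    # Anything that is not a non-negative integer is reported as UNKNOWN.
    case $count in
        ''|*[!0-9]*)
            echo "UNKNOWN - could not read a count"
            return $STATE_UNKNOWN
            ;;
    esac
    if [ "$count" -ge "$crit" ]; then
        echo "CRITICAL - $count long-running queries"
        return $STATE_CRITICAL
    elif [ "$count" -ge "$warn" ]; then
        echo "WARNING - $count long-running queries"
        return $STATE_WARNING
    fi
    echo "OK - $count long-running queries"
    return $STATE_OK
}
```

A sensor script would typically finish with something like `check_count "$n" 1 5; exit $?`, so that the return code reaches NRPE as the process exit status.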
  • 19. Netezza monitoring using Nagios Active alerts - NRPE configuration on Netezza server • If using the Perl package, commands must run as the nz user, so /etc/nagios/nrpe.cfg must use the following – nrpe_user=nz – nrpe_group=nz • Once a sensor (Perl script) is written and tested, it must be added to the nrpe.cfg file, e.g. command[check_nz_longqry]=/export/home/nz/scripts/check_nz_longqry.pl • Best practice - Request that /etc/nagios/nrpe.cfg be readable and writable by the nz user
  • 20. Netezza monitoring using Nagios Active alerts - How does NRPE work on Nagios server ?
      define command{
          command_name check_nrpe
          command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t 300
      }
      define service{
          use generic-service
          host_name proddb
          service_description NZSQL Long query
          check_command check_nrpe!check_nz_longqry!
          notifications_enabled 0
      }
  • 21. Netezza monitoring using Nagios Active Alerts - Perl programming using SQL.pm package • Invocation: use lib "/nz/kit/share/perl"; use nz::SQL; • Package can only be used by the nz owner • NO username & password: my ($KITDIR, $DATADIR); $DATADIR = "/nz/data.1.0"; $KITDIR = "/nz/kit"; nz::SQL::config(KITDIR => $KITDIR, DATADIR => $DATADIR); • Best practice - use alarm timers around SQL statements • Handy variables after each SQL execution: $qresp->{nrows}, ncols, colid, qtype
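The slide's timer advice refers to Perl's alarm inside the sensor. A comparable guard can be put around a whole check from a shell wrapper. The sketch below swaps in the GNU coreutils `timeout` command (assumed to be installed on the host) and reports UNKNOWN when the check runs too long. The function name `run_with_timeout` is hypothetical.

```shell
# run_with_timeout SECONDS COMMAND [ARGS...]
# Hypothetical wrapper: runs the check under coreutils `timeout`.
# `timeout` exits with 124 when it had to stop the command; that case is
# reported to Nagios as UNKNOWN (3). Any other exit status passes through.
run_with_timeout() {
    limit=$1
    shift
    cmd_rc=0
    timeout "$limit" "$@" || cmd_rc=$?
    if [ "$cmd_rc" -eq 124 ]; then
        echo "UNKNOWN - check timed out after ${limit}s"
        return 3
    fi
    return "$cmd_rc"
}
```

The limit should stay below the timeout that the Nagios side passes to check_nrpe, so that the sensor reports its own timeout before NRPE gives up on it.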
  • 22. Netezza monitoring using Nagios Perl programming using SQL.pm package (continued) • Interface example … nz::SQL::query($dbname, $sql). Unlike DBI, the database must be named on every query. • Resultsets are held in Perl memory, not kept active in the database (unlike DBI) • Resultset traversal is done using a Perl foreach, e.g. foreach my $row (@{$qresp->{data}}) { ($blocker_username, $blocker_sql, $blockee_username, $blockee_sql) = @$row; … } • Best practice: if you can, avoid walking the resultset and work only with counts, e.g. nrows. This is the most efficient approach, especially for a Nagios alert check that runs several times a day.
  • 23. Future direction • Data graphing • Expand areas that we are monitoring for in Netezza • Integrate into a product offering (Observation Deck) from Brightlight that collects NZHIST for customer • Predict when we are going to outgrow our current processing and database needs
  • 24. Conclusion • Key takeaways: using Nagios can give your company an extensible event monitor. Understanding the Nagios architecture is important to a stable, working monitoring setup. Once you understand the architecture, writing an agent is trivial: if you can write SQL to detect an event, you can write an agent. • Other reading materials: the URLs provided in this document have the recipes for setting up Nagios, SNMP traps, and Netezza. Please visit those sites for details.
  • 26. Reference http://www.thegeekstuff.com/2010/08/monitoring-software-criteria/ http://exchange.nagios.org/directory/Tutorials/Install-and-Configure-NRPE-in-CentOS-and-Red-Hat/details http://www-01.ibm.com/support/knowledgecenter/SSULQD_7.1.0/com.ibm.nz.portal.doc/c_portal_welcome.html http://www.networkworld.com/article/2329877/infrastructure-management/how-to-quantify-downtime.html
  • 27. The End Frank Pantaleo fpantaleo@brightlightconsulting.com