Michael Medin's presentation on NSClient++. The presentation was given during the Nagios World Conference North America held Sept 27-29th, 2011 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna
2. These slides represent the work and opinions of the author and do not constitute official positions of any organization sponsoring the author's work. This material has not been peer reviewed and is presented here as-is with the permission of the author. The author assumes no liability for any content or opinion expressed in this presentation and/or use of content herein. Disclaimer! It is not their fault! It is not my fault! It is your fault!
3. Developer (not manager) Not working with Nagios Accidentally ended up in our NOC Hated BB so we migrated to Nagios 2003: The birth of NSClient++ NSClient sucked (Broke Exchange) NRPE_NT was too much work 2004: The open source of NSClient++ "just for fun" 2007: The rebirth of NSClient++ Got a lot of emails and hits on the webpage 2011: The Present 0.3.9 out last May 0.4.0 out as alpha My Background
4. Windows Monitoring and NSClient++ Quick Introduction What's new in 0.3.9 Disk/File/* Scheduled Tasks Aliases Crash Handling What's new in 0.4.0 New core Unix support New settings subsystem New protocol Python Scripting The end of NSClient++! Q/A Agenda
6. What is NSClient? A (pretty old) program pNSClient A (pretty limited) protocol check_nt A (pretty incorrect) concept "Windows monitoring" What is it not? NSClient++! NSClient++ was written as a replacement for pNSClient But it has evolved much since then NSClient: Terminology
7. NSClient++ Freedom! Custom scripts Decentralized or centralized Active or Passive Can monitor "anything" (including your application) Can perform "tasks" (fix your problems) Other options: SNMP Generally complex to use and limited on "standard" hardware pNSClient/NRPE_NT/OpMonAgent/* Old, outdated and usually limited functionality "Agentless" WMI Limited functionality Enforces centralized and active monitoring But... I am biased, so you might not want to take my word for it... Why should you use NSClient++
9. Internals: C++ Around 75,000 lines of code Actively developed (unfortunately only by me) Modularized design (use what you need) Runs on: Windows: NT4, w2k, XP, w2k3, Vista, w2k8, X64, X86 ... Unix: Linux/Debian (probably many/most others as well) Current Version: 0.3.9 with 0.4.0 in beta Most features require NRPE or NSCA (or NSCP) Documentation online (WIKI) http://nsclient.org About NSClient++
10. Not supported by a commercial entity Donations welcome Sponsoring available (contact me for details) Used by a lot of people (I think) Impossible to estimate any figures Please, Help out! Add documentation Report problems Come with ideas, thoughts, etc... About NSClient++ (cont.)
13. NSClient++ is a command line program! nsclient++ -start (net start nsclientpp) nsclient++ -stop (net stop nsclientpp) nsclient++ -test Configuration: notepad nsc.ini Testing: Local (nsclient++ -test) From CLI (check_nrpe ...) From Nagios (add command) Works with "anything" Including many non Nagios based systems Using NSClient++ (0.3.9) nsclient++ -test Is your friend!
14. New command line syntax! nscp --service --start nscp --service --stop nscp --help Testing nscp --test Configuration: nscp --settings-help nscp --settings --migrate-to ini nscp --settings --set ... ... Run scripts: nscp --client --module PythonScript --command execute-and-load-python --script test.py --install Using NSClient++ (0.4.0) nscp --test Is your friend!
16. Major simplification to the disk/file checker CheckFile (removed) CheckFile2 Deprecated CheckFiles (replaces above) Volume support (for real this time) Aliases NSCA/NRPE enhancements Scheduled task checks Crash Handling A bunch of new commands Bug fixes and many more things... 0.3.9 What's new: Overview
17. We have recruited a new member to the team! A girl actually... ...Still a bit wet behind the ears... New team member!
20. The good: Powerful interface! Simple to use! Out-of-the-box solution! (on which you can expand) The bad: Nothing! Really, I mean it! ...and then... yesterday... ...in the bar... ...all hopes shattered... ...apparently it is still too complicated... Overview
21. Same as was introduced for eventlog last year Based on SQL WHERE clauses generated > -2d AND severity = 'error' size > 5k size > 5k OR size < 1k size > 5k AND written > -2d (size > 5k OR size < 1k) AND written > -2d ... The new Filters
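The filter expressions on this slide read like SQL WHERE clauses over per-file attributes. As a rough illustration of what the last expression means, here is a minimal Python sketch. It is not NSClient++'s parser: the helper names are hypothetical, and it assumes that "5k" means 5 x 1024 bytes and that "-2d" means "two days before now".

```python
from datetime import datetime, timedelta

# Assumed unit multipliers; NSClient++ may interpret suffixes differently.
SIZE_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
TIME_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_size(text):
    """Turn a size literal such as '5k' into a byte count."""
    text = text.strip().lower()
    if text[-1] in SIZE_UNITS:
        return int(text[:-1]) * SIZE_UNITS[text[-1]]
    return int(text)


def parse_relative_time(text, now):
    """Turn a relative literal such as '-2d' into an absolute timestamp."""
    text = text.strip().lower()
    amount, unit = int(text[:-1]), TIME_UNITS[text[-1]]
    return now + timedelta(**{unit: amount})


def matches(file_info, now):
    """Hand-written equivalent of: (size > 5k OR size < 1k) AND written > -2d"""
    size = file_info["size"]
    size_ok = size > parse_size("5k") or size < parse_size("1k")
    recent = file_info["written"] > parse_relative_time("-2d", now)
    return size_ok and recent
```

A file of 6000 bytes written an hour ago matches; the same file written three days ago does not, because "written > -2d" only accepts timestamps newer than two days ago.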
26. CheckDriveSize... CheckAll=volumes ... Other new features Added a new option to ignore drives which are not readable (like the Office 2010 Q: drive) ignore-unreadable Added magic modifiers (from check_mk) magic=0.7 Volume support (for real this time)
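The slide only says that the magic modifier comes from check_mk. As a hedged sketch of the idea, the function below follows the check_mk-style magic factor as I understand it: thresholds are relaxed for volumes larger than a reference size and tightened for smaller ones. The function name, the 20 GB reference size, and the assumption that NSClient++ computes it the same way are all mine, not confirmed by the slides.

```python
def magic_adjusted_level(level_percent, size_gb, magic, norm_size_gb=20.0):
    """Scale a used-space threshold (in percent) according to volume size.

    norm_size_gb is the size at which the threshold is left unchanged
    (20 GB is an assumption borrowed from check_mk's default).
    """
    relative_size = size_gb / norm_size_gb
    felt_size = relative_size ** magic      # magic=1.0 disables the scaling
    scale = felt_size / relative_size
    return 100.0 - (100.0 - level_percent) * scale


# An 80% warning level on a 2 TB volume with magic=0.7 moves up to roughly 95%,
# since 20% free on a huge volume is still a lot of space.
large_volume_warn = magic_adjusted_level(80, 2000, 0.7)
```

The point of the modifier is that one threshold definition can be reused across small and large drives without alerting on large drives that still have plenty of absolute free space.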
28. Works the "same" as CheckEventLog "filter=exit_code ne 0" Two modules: CheckTaskSched.dll Works on Windows NT4 and beyond But cannot check "new" tasks (from Vista and beyond) CheckTaskSched2.dll Works on Windows Vista and beyond Has fewer filter keywords Scheduled Tasks
32. System alias_cpu CPU Load past 5 minutes, 80/90% bounds alias_cpu_ex CPU Load past 5 minutes, custom bounds alias_mem Memory utilization (all) 80/90% bounds. alias_mem_ex Memory utilization (all), custom bounds alias_up System uptime Out of the box aliases
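The "80/90% bounds" on these aliases are a warning and a critical threshold. A minimal sketch of how such bounds map onto the standard Nagios plugin states (exit codes 0 to 3) and a performance-data string follows. The function name is hypothetical, and treating the bounds as inclusive is an assumption rather than documented NSClient++ behaviour.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_percent(label, value, warn=80.0, crit=90.0):
    """Compare a percentage against warn/crit bounds, Nagios plugin style."""
    if value >= crit:
        status = CRITICAL
    elif value >= warn:
        status = WARNING
    else:
        status = OK
    message = f"{STATUS_NAMES[status]}: {label} is {value:.0f}%"
    # Performance data layout: 'label'=value[unit];warn;crit
    perfdata = f"'{label}'={value:.0f}%;{warn:.0f};{crit:.0f}"
    return status, f"{message}|{perfdata}"
```

The "_ex" variants on the slide correspond to calling the same check with custom values for `warn` and `crit`.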
33. Disk/Drive alias_disk All fixed drives alias_disk_loose All fixed drives, ignore any problematic drives alias_volumes All volumes alias_volumes_loose All volumes, ignore any problematic drives alias_file_size Check the size of a given file (filename, size) alias_file_age Check the age of a given file Out of the box aliases (continued)
34. Eventlog alias_event_log Check for errors in the event log Scheduled Tasks alias_sched_all No scheduled jobs have failed alias_sched_long No task has been running for longer than a given time. alias_sched_task Check if a given task succeeded Misc alias_updates Check that updates are applied Out of the box aliases (continued)
35. Processes alias_service All services in "sensible state" alias_service_ex All services in "sensible state" (exclude various services) alias_process A process must be running alias_process_stopped A process must not be running alias_process_count A process must not have more than X instances alias_process_hung A process must not be hung Out of the box aliases (continued)
37. Using Google Breakpad, same as Google Chrome, Mozilla Firefox, etc. Three options (not mutually exclusive) Send crash dumps to crash.nsclient.org Server can be changed if you want to have an internal server or proxy server. Store crash dumps for analysis Will also be checked with check_nscp Restart service Crash Handling
40. NSCA Fixed problems with sending "many" results back NRPE Added support for large payloads Checks Added "check_nscp" to check health of NSClient++ Added new check for running other checks "with a timeout" Added new negate check (to negate the result of another check) All filters (read CheckEventLog et al) Many fixes and additions (regular expressions) Process checks Added support for checking if processes have "hung" Performance data Added it to many places where it was intermittently missing before Other stuff (The highlights)
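The two wrapper checks mentioned here (run another check "with a timeout", and negate another check's result) can be pictured as decorators around a check function. The sketch below is only an illustration in Python. The function names are hypothetical, and both the OK/CRITICAL swap and the UNKNOWN result on timeout are assumptions, not the documented behaviour of the NSClient++ commands.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def negate(check):
    """Return a check that swaps OK and CRITICAL and leaves other states alone."""
    swap = {OK: CRITICAL, CRITICAL: OK}

    def wrapped():
        status, message = check()
        return swap.get(status, status), message

    return wrapped


def with_timeout(check, seconds, on_timeout=UNKNOWN):
    """Return a check that gives up waiting after the given number of seconds."""

    def wrapped():
        pool = ThreadPoolExecutor(max_workers=1)
        future = pool.submit(check)
        try:
            return future.result(timeout=seconds)
        except FutureTimeout:
            return on_timeout, f"check timed out after {seconds}s"
        finally:
            # Do not block on the worker; a timed-out check keeps running
            # in the background until it finishes on its own.
            pool.shutdown(wait=False)

    return wrapped
```

The wrappers compose, so `negate(with_timeout(some_check, 30))` expresses "this check must not succeed, and must answer within 30 seconds".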
44. Brand new core based upon libraries Things should *work* not just "work" More modular and extensible Unix support Both as a client and server New settings subsystem Registry, improved ini support, http, etc New protocol NSCP (HTTP(s), MQ, Native) Distributed monitoring Many new things in this area (including MQ) Python scripting Primary goal (for me) is to create "unit-test" Updated installer Wix 3.5, more customizable What's new 0.4.0
45. "Monitoring Kits" Monitoring solutions for "standard things" New windows check-subsystem More modern and less arcane (no NT4 support) Remote checking .Net plugin support Possibly internal VBA scripting support Metrics cache and aggregation Lightweight version of CEP "crit=cpu > 80% AND transactions_per_sec < 10" What's coming 0.4.2
46. Filter-like API (in addition to options) "warn=any drive > 90% OR c: > 80%" Remote updates/upgrades Allow NSCP to upgrade itself "port" of the "standard plugins"? Run your favorite check_xxx from inside NSClient++ Unix plugins? Run CheckCPU on unix machines? Client/web Interface? A nice little program (systray) Let me know what you would like to see! What might be coming?
49. This is why it was so long in the making Merging each new version took forever! New internal protocol Removed all internal "limits" (think buffer sizes) Allows many new features Allows much more advanced internal scripts Allows for "non NRPE based checks" A lot of new bugs? This is the scary part (for me) but my testing has shown it seems very stable A completely new core
51. Good question... Since no one seems to like to program on Windows I brought NSClient++ to "unix" Because I can With the new core comes portability So, perhaps the better question was: Why not? Will NOT be supported for some time though Unless someone wants to help out Why?!?!
53. Hierarchical settings subsystem [/settings/NRPE/server] allow arguments=false Instead of [NRPE Server] allow_arguments=false Why did I do this? Because it was fun Number of options has started to explode Simpler to use the registry (as well as xml?) Settings
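The new section names are paths, so a settings file can be treated as a tree. As a small sketch of that idea, the snippet below reads the new-style section from this slide using Python's standard `configparser` (not NSClient++'s own settings code). The helper names `load_settings` and `children` are hypothetical.

```python
import configparser

NEW_STYLE = """
[/settings/NRPE/server]
allow arguments=false
"""


def load_settings(text):
    """Parse ini text whose section names are hierarchical paths."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def children(parser, prefix):
    """List the section paths that live under a given path prefix."""
    prefix = prefix.rstrip("/") + "/"
    return [name for name in parser.sections() if name.startswith(prefix)]


settings = load_settings(NEW_STYLE)
# Keys may contain spaces; getboolean understands 'false'.
allow_args = settings.getboolean("/settings/NRPE/server", "allow arguments")
```

Because every section carries its full path, the same keys can be stored in any backend that supports nested keys, which is presumably what makes the registry option on the slide straightforward.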
54. Since settings have "url:s" old://${exe-path}/nsc.ini ini://${base-path}/nsclient.ini registry://HKEY_LOCAL_MACHINE/software/NSClient++ http://my.central.server/config/${hostname}.ini Allows extensions (not via plugins though) Maybe in the future: lua://${base-path}/config.lua python://${base-path}/config.py You can mix and match: ini://${base-path}/nsclient.ini Can "include": registry://HKEY_LOCAL_MACHINE/software/NSClient++ Which in turn includes http://conf.server/${hostname}.conf What's in it for you?
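Each settings location above is a scheme plus a location, with `${...}` placeholders filled in by the agent. A minimal sketch of how such a URL could be expanded and split follows. The function names and the example values for `base-path` and `hostname` are hypothetical, and this is not NSClient++'s actual resolver.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand(url, variables):
    """Replace ${name} placeholders; unknown names are left untouched."""
    return PLACEHOLDER.sub(lambda m: variables.get(m.group(1), m.group(0)), url)


def split_settings_url(url):
    """Split 'scheme://location' into (scheme, location)."""
    scheme, separator, location = url.partition("://")
    if not separator:
        raise ValueError(f"not a settings URL: {url!r}")
    return scheme, location


# Example values only; the real ones are supplied by the agent.
variables = {"base-path": "C:/Program Files/NSClient++", "hostname": "web01"}
scheme, location = split_settings_url(expand("ini://${base-path}/nsclient.ini", variables))
```

The scheme (`ini`, `registry`, `http`, ...) then selects which backend reads the location, which is how one store can include another as the slide describes.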
55. Ability to load the same plugin twice. Normal (default alias is python) [/modules] PythonScript= [/settings/python/scripts] test.py Multiple modules (define two aliases foo and bar) [/modules] foo=PythonScript bar=PythonScript [/settings/foo/scripts] test1.py [/settings/bar/scripts] test2.py Multiple modules and alias
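To make the alias mechanism concrete, the sketch below reads the two-alias configuration from this slide with Python's standard `configparser` and lists which scripts belong to which module instance. The function name is hypothetical, and the only default alias it knows is the one the slide states (PythonScript loads as "python"); how NSClient++ derives default aliases in general is not covered here.

```python
import configparser

CONFIG = """
[/modules]
foo=PythonScript
bar=PythonScript

[/settings/foo/scripts]
test1.py

[/settings/bar/scripts]
test2.py
"""

# Taken from the slide; other modules fall back to their own name (an assumption).
DEFAULT_ALIASES = {"PythonScript": "python"}


def load_instances(text):
    """Map each alias to the module it loads and the scripts configured for it."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.optionxform = str  # keep the case of module and script names
    parser.read_string(text)
    instances = {}
    for key, value in parser["/modules"].items():
        if value:
            alias, module = key, value          # 'alias=Module'
        else:
            module = key                        # 'Module=' uses the default alias
            alias = DEFAULT_ALIASES.get(module, module)
        section = f"/settings/{alias}/scripts"
        scripts = list(parser[section]) if parser.has_section(section) else []
        instances[alias] = {"module": module, "scripts": scripts}
    return instances
```

With the configuration above, `foo` and `bar` are two independent instances of the same module, each with its own settings subtree.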
56. It depends... If you are "still" using check_nt: Probably not If you are using NSCA: Maybe not If you want to use all new features Yes How do I change? It is pretty simple... nscp --settings --migrate-to ini (or) nscp --settings --migrate-to registry Do I need to change?
60. Allows more than one command to be sent Used internally for plugins Supports both passive and active checks Supports configuration, management, etc... Extensible But will also support: Multiple locales (based on utf) Unlimited payloads (soft configurable) Support real performance data (not strings) New protocol
64. An extension of the passive checks "Something" can send notification events "Something" can receive notification events Agents can forward notification events Replaces NSCAListener module Supports routing Not a one-to-one mapping. Multiple consumers multiple producers Allows Passive plugins (other than the built-in NSCA) Script and rule based routing Submissions and handlers
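The model on this slide (any producer can submit, any number of consumers can handle, agents can forward) is essentially publish/subscribe. A toy sketch of it in Python follows; the class, the channel names, and the event layout are all hypothetical and say nothing about how NSClient++ implements routing internally.

```python
from collections import defaultdict


class Router:
    """Route notification events from any producer to every subscribed consumer."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, channel, handler):
        """Register a consumer for a channel; several per channel are allowed."""
        self._handlers[channel].append(handler)

    def submit(self, channel, event):
        """Deliver an event to all handlers on a channel; return how many got it."""
        handlers = self._handlers.get(channel, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


router = Router()
received_a, received_b = [], []
router.subscribe("alerts", received_a.append)
router.subscribe("alerts", received_b.append)
# An agent that forwards: its handler re-submits the event on another channel.
router.subscribe("local", lambda event: router.submit("alerts", event))
router.submit("local", {"host": "web01", "status": 2, "message": "disk full"})
```

After the last call both consumers on "alerts" have received the event, even though the producer only knew about "local".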
66. Built-in python scripting Has full API support Can build "modules" in python Can access settings Can do "anything" Primarily used by me for unit-testing Requires a working python install Python Scripting
67. The end of NSClient++! Le Roi est mort, vive le Roi!
68. 0.4.x (ish) will be the last "Windows" monitoring agent The idea is to make it more: A platform/client/server for distributed monitoring Regardless of os/system Regardless of Monitoring solutions Don't worry... It will still work just fine as a "Windows Monitoring Agent" But in addition to this you will be able to do more. So what's this all about?
70. Michael Medin michael@medin.name http://www.linkedin.com/in/mickem Information about NSClient++ http://nsclient.org Facebook: facebook.com/nsclient Slides, and examples http://nsclient.org/nscp/conferances/2011/nwcna/ Thank You!
Editor's Notes
Hello, my name is Michael Medin. I am from Stockholm, Sweden. This is my second time here in Bolzano but this time I had fewer problems with my flights. This year I will speak a bit about what has happened in the last year. And hopefully for the last time I am speaking about "Windows Monitoring"! If there are any questions or such just chime in.
Standard Disclaimer - My views (not anyone else's) - Not peer reviewed so I could be lying to you. - If your 2 billion dollar servers crash: life sucks. Let's simplify this a bit...
Sorry, this slide just keeps getting longer and longer... But I have actually removed some information this time... I am a developer, and developers monitor software whereas the NOC monitors hardware. The "unix" guy quit and since I know "unix" I was apparently a good choice to administrate routers, firewalls and what not. Disliked BB so I devised a plan to migrate to Nagios. Best thing with Nagios was management loved SLA reporting! Once, after some 30 or so installs of nsclient, I went to the exchange server and: BANG! This was the birth of NSClient++. Management did not like crashing exchange servers! So we started looking at options and NRPE_NT was too hard to use for "simple" checks. Initially we went with SNMP but soon started on NSClient++ instead.
Briefly, the agenda covers a short introduction to NSClient++. Then we move on to 0.3.9 and what's new in the release. Following that is the 0.4.x version tree. And finally we will have a Q&A session.
A quick note on the terminology. The word NSClient can mean many things depending on what you are talking about.
A quick summary of the options for monitoring Windows
If anyone has a Visual Studio 2005 "Team Edition" (with Itanium support) I'm very very interested. Wiki means YOU write the documentation. If the docs suck, you are to blame (not me).
I actually paid money to come here to speak with you. But I have always been strange that way. It might seem strange that there are twice as many downloads as unique visitors, but downloads are aggregated from other sites.
Thank you to my sponsors
NSClient++ is your friend! Testing: do them in that order. I know people who start in Nagios and spend the next 3 days debugging, and think NSClient++ sucks. Had they started in NSClient++ /test it would have taken 5 minutes and things would not have sucked! I don't like when things suck so... ...like net eye
This is really really cool! (And the reason we are 3 months behind schedule, it was amazingly hard to do.)
As I said, NSCP is around 40k lines of code. This is around 4k, so 10% of the code, and it is new!
There are two severities. I generally use the one called severity (based upon eventID).
What might be interesting is the safe operators
An important note is how neg works with dates
The filter: There can be only one! Don't forget NRPE and NSCA have payload limits, so exceeding them will cause errors.
Parsing is pretty fancy. It will try to "do things for you". But what happened to neg?
Boost means things (hopefully) work better. 0.4.x does not necessarily mean 0.4.0. CEP CEP CEP! If your load is high and you have transactions this is a good thing.