Monitoring the User Experience
for Availability and Performance

             Nathan Vonnahme

        Nathan.vonnahme@bannerhealth.com
Hi from Fairbanks


 ~100,000 people




                    2012
Not very big


 Fairbanks Memorial Hospital
 • ~150 beds
 • ~1500 employees; ~40 IT dept.
 • around 300 production servers (~75% VMware)...
   roughly 400 “apps”
 • 24x7x365.25 as only healthcare can be
 • Nagios monitors 113 hosts (includes some devices
   like switches and UPSes), but 442 services.
 • Nagios in the gaps

Context

 • Last year: “Monitoring and Test Driven Development”




Last year – central idea



 Monitoring tools are to the
 sysadmin what testing
 tools are to the developer.



Last year




Last year: “Not a CI [Continuous Integration] tool”


    • Not as coupled to build processes
    • We don’t build most of our software!
    • Strengths in flexibility, alerting, production monitoring
    • Potentially weak in being IT-focused instead of
      customer-focused.




QA and the Manufacturing Metaphor

 Testing the end product




The main idea


 Can users actually use it?
 Does it suck?




Availability: Can users actually use it?


 If the database server won’t
    ping, that’s obviously
    important.
 If the network is down, duh.
 But is your monitoring
   coverage good enough to
   prove that your end
   product is delivering?




Performance: Does it suck?


 Let’s be objective.


 The user doesn’t
  care why.




It matters

 “I started to learn that the days when I was happiest
    were the days with lots of small successes and few
    small frustrations.”
       –Joel Spolsky

 [Image text: THIS IS NOT AN IPHONE 5!]




Two examples


 1. A mission-critical legacy Windows app with a
    complicated architecture
 2. A cooler modern web app




LEGACY WINDOWS APP
Cerner Electronic Medical Record


 Note: Only fake patients in this presentation.
Complex architecture


 • SAN
 • AIX
 • Oracle
 • Middleware
 • Another SAN
 • VMWare
 • Windows
 • Citrix
 • Network
 • Desktop
 • Everything x 2 for High Availability
A quick definition of Citrix




 The users are remote-controlling the application
  running on the Citrix server.
 The “client” runs on the Citrix server. The user’s
  actual workstation runs a “thin client”.
 90% or more of Cerner customers use Citrix.
 For comparison: old school “fat client”



Complex support team




Complex monitoring

 • Not just Nagios for monitoring.
 • Most things are well-monitored by Somebody
 • Certain things are “lost and found” and Nobody
   wants to touch them.




Can the users actually use it?


 AutoIt lets us drive it like a user.
 1. Log in
 2. Search for a (fake) patient
 3. Open the patient’s chart.
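
 The three steps above are what gets timed. Here is a minimal sketch of that idea in JavaScript rather than AutoIt; `timeSteps` is a hypothetical helper, and the step actions are empty placeholders where a real check would drive the application.

```javascript
// Run named steps in order and record how long each took, in seconds.
// `now` is injectable so the sketch can be exercised without a real app.
function timeSteps(steps, now = () => Date.now()) {
  const timings = {};
  for (const [name, action] of steps) {
    const started = now();
    action(); // e.g. drive the GUI for this step
    timings[name] = (now() - started) / 1000;
  }
  return timings;
}

// Placeholder actions: a real check would automate the application here.
const timings = timeSteps([
  ["login", () => {}],
  ["search", () => {}],
  ["chart", () => {}],
]);
console.log(timings);
```

 Keeping one timing per step is what lets a single check report login, search, and chart separately.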




Do it live.




The Live Demo


 Interactive demo with a test patient in production.




Runify


 "c:\program files\nsclient++\scripts\check_powerchart.exe" fat -w5,1,1 -c40,5,5

 • Compiled .exe
 • Use fat or citrix client
 • WARNING if more than 5,1,1 seconds for
   login, search, chart
 • CRITICAL over 40,5,5
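
 The threshold arguments above could be interpreted along these lines. This is a minimal sketch in JavaScript, not the AutoIt the real check is written in; `parseTriple` and `worstState` are hypothetical names, and it assumes each comma-separated value is a limit in seconds for login, search, and chart, in that order.

```javascript
// Hypothetical helpers: not taken from check_powerchart itself.
const STEPS = ["login", "search", "chart"];

// Turn "5,1,1" into { login: 5, search: 1, chart: 1 } (seconds).
function parseTriple(text) {
  const parts = text.split(",").map(Number);
  if (parts.length !== STEPS.length || parts.some(Number.isNaN)) {
    throw new Error("expected three comma-separated numbers, got: " + text);
  }
  const limits = {};
  STEPS.forEach((step, i) => { limits[step] = parts[i]; });
  return limits;
}

// Return the worst Nagios state across all steps.
function worstState(timings, warn, crit) {
  let state = "OK";
  for (const step of STEPS) {
    if (timings[step] > crit[step]) return "CRITICAL";
    if (timings[step] > warn[step]) state = "WARNING";
  }
  return state;
}

const warn = parseTriple("5,1,1");
const crit = parseTriple("40,5,5");
// Chart took 1.7s: over the 1s warning limit, under the 5s critical limit.
console.log(worstState({ login: 3.2, search: 0.4, chart: 1.7 }, warn, crit)); // WARNING
```

 One slow step is enough to change the overall state, which matches the "user doesn't care why" point of view.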

The AutoIt Script


 Open in SciTE and glance through.
 Note, Nagios.com has a couple of good articles by
  Sam Lansing about using AutoIt.
 Also see the presentation he did just before this
   one!




NSCP config


 [/settings/external scripts/scripts]
 check_ctx = scripts\check_powerchart.exe citrix -w20,5,5 -c40,5,5
 check_fat = scripts\check_powerchart.exe fat -w20,5,5 -c40,5,5




Gotchas: too many running at once


 The script runs a KillHungProcesses() routine at the
  beginning and end to clean them up.


 Also set max_check_commands=1 in config for this
   host.




Gotchas: Occasional popups


 ; Poll in the background for the occasional popup and dismiss it.
 AdlibRegister("HandleRelationship")
 Func HandleRelationship()
     If WinActive("Assign a Relationship") Then
         Send("cc{ENTER}")
     EndIf
 EndFunc


 ...


 ; Stop polling for the popup.
 AdlibUnRegister("HandleRelationship")

Gotchas: Stupid typer


 I repeatedly added lines to the wrong section of the
    nsclient.ini file by mistake.




Gotchas: Service vs. Interactive GUI




Gotchas: AutoIt output to STDOUT


 You have to compile to .exe with the “Console” checkbox, or with
   the /console switch on the command-line aut2exe compiler:

 "%programfiles%\Autoit3\aut2exe\aut2exe.exe" /in check_powerchart.au3 /console /nopack


Gotchas: Annoying phantom icons in the tray




 Slow too!
 Solution: LP#TrayIconBuster




The results: Citrix


 1 week of 15
   minute
   samples




The results: Fat client


 Note the much different Y axis scale




Citrix vs Fat: icon to login (zoomed to 1.5 days)




Citrix vs Fat: open one patient’s chart




The advantages


 • Continuous monitoring of the end product
 • Nagios alerting, escalating, whatever.
 • Control/experiment
   • Just the two (Citrix vs fat) tell us a lot
   • We have a historical benchmark to compare anomalies
     with
   • We’ve already identified a symptomatic Citrix problem
     we can treat with a Nagios event handler.



Still TODO:


 • Thresholds don’t yet affect OK/WARNING/CRITICAL
 • Thresholds in perfdata
 • Make it less noisy for less ignorable notifications
 • Make it run more often with tighter thresholds now
   that we have a benchmark.
 • Move it to a dedicated VM/user accounts
 • More experiment groups
     • No antivirus
     • No Citrix printers
     • No VMWare (fat, no Citrix)
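
 For the "thresholds in perfdata" item, the standard Nagios perfdata layout is label=value[UOM];warn;crit;min;max after a pipe character. A minimal JavaScript sketch of that output follows; the `perfdata` helper, the label names, and the POWERCHART prefix are assumptions for illustration, not the real plugin's output.

```javascript
// Hypothetical helper; the label names are assumptions for illustration.
// Emits label=value[UOM];warn;crit;min for each step, space-separated.
function perfdata(timings, warn, crit) {
  return Object.keys(timings)
    .map((step) => `${step}=${timings[step]}s;${warn[step]};${crit[step]};0`)
    .join(" ");
}

const timings = { login: 3.2, search: 0.4, chart: 0.7 };
const warn = { login: 5, search: 1, chart: 1 };
const crit = { login: 40, search: 5, chart: 5 };

// Human-readable status first, then a pipe, then the perfdata.
console.log("POWERCHART OK | " + perfdata(timings, warn, crit));
// POWERCHART OK | login=3.2s;5;40;0 search=0.4s;1;5;0 chart=0.7s;1;5;0
```

 With the thresholds in the perfdata, graphing tools can draw the warning and critical lines on the same chart as the timings.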


Questions/comments about this example?




MODERN WEB APP



Demo


fairbanks.bannerhealth.com/cpoe_ordersets

It’s not a fair comparison, just an example.
Modern == Ajax


 Notice there’s not a full page
  load.


 How do we tell if users can
  actually use this?
 How can we tell if it sucks?




Options: Perl/PHP


 I’ve done this… Symfony functional scripts, for example.
 They’ll never do JavaScript or AJAX.
 I worry it won’t match the top level, what the user sees.



Options: Selenium


 • Runs a real browser or browsers, kind of like
   AutoIt
 • Other people are using it quite successfully (e.g.
   Nathan Broderick 2011; Sam Lansing 2012)
 • I have not gotten into it… images of too many
   browser windows?
 • Not very mobile-like




PhantomJS


 Headless webkit


   As of July 2012 [webkit] has the most market
    share of any layout engine at over 40% of
    the browser market share according
    to StatCounter.

                               -- http://en.wikipedia.org/wiki/WebKit




PhantomJS


 • Allows you to write client scripts in JavaScript,
   executed by WebKit’s own JavaScript engine (JavaScriptCore, not V8)
   • “Always bet on JS” – Brendan Eich, original author of
     JavaScript
   • JavaScript is funny in that most people don’t actually
     learn it before they start using it. – Douglas Crockford
     (paraphrase), author of JavaScript: The Good Parts




CasperJS


 Nicer JS API on top of PhantomJS




SHOW ME THE CODEZ


 Quick Walkthrough in Aquamacs or Gist
 Demo failure + screenshot




Works on Nagios XI host


 • A little bit heavy (it is firing up a WebKit)
 • Zombie.js looks promising as a lighter (pure JS in
   Node) JS-capable headless browser
 • Quick CasperJS install instructions on my blog,
      n8v.enteuxis.org




The results – 10 minute interval


 Boring… note total amplitude.




The benefits


 • Ongoing assurance that it is basically usable
 • If it’s usable, all its layers must be working.
 • Benchmarked performance so we can evaluate
   the end result of changes




Still TODO


 • Check timestamp at bottom of page
 • Run the same checks on an internal development
   version
 • Upgrade jQuery UI and see if performance
   changes
 • Check from an Internet host




Main idea again


 Can the users actually use it?
 Does it suck?




Concluding Questions/comments


 JS Sample Code at
  gist.github.com/n8v

 Email me if you want the AutoIt code.




Sneak peek


 ./check_facebook_friends.js -u nathan.vonnahme -w @202 -c @203


 Come to my other talk, “Writing Custom Nagios
   Plugins” Friday morning to see how it works.
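
 A note on the `@` in those thresholds: in the standard Nagios plugin range syntax, a leading `@` means "alert when the value is inside the range" instead of outside it. Here is a minimal sketch of how such ranges are usually evaluated; `parseRange` and `alerts` are hypothetical names, and this is not the code from check_facebook_friends.js.

```javascript
// Parse a Nagios-style range such as "10", "10:", "~:10", "10:20" or "@10:20".
function parseRange(text) {
  const inside = text.startsWith("@");
  const body = inside ? text.slice(1) : text;
  let start = 0;
  let end = Infinity;
  if (body.includes(":")) {
    const [lo, hi] = body.split(":");
    start = lo === "~" ? -Infinity : Number(lo);
    end = hi === "" ? Infinity : Number(hi);
  } else {
    end = Number(body); // a bare number means the range 0..N
  }
  return { start, end, inside };
}

// True when the value should raise an alert for this range.
function alerts(value, text) {
  const { start, end, inside } = parseRange(text);
  const within = value >= start && value <= end;
  return inside ? within : !within;
}

console.log(alerts(202, "@202")); // true: 202 is inside 0..202
console.log(alerts(204, "@203")); // false: 204 is outside 0..203
```

 So `-w @202 -c @203` alerts when the count is at or below the given number, not above it.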





Nagios Conference 2012 - Nathan Vonnahme - Monitoring the User Experience

  • 1. Monitoring the User Experience for Availability and Performance Nathan Vonnahme Nathan.vonnahme@bannerhealth.com
  • 2. Hi from Fairbanks ~100,000 people 2012
  • 3. Not very big Fairbanks Memorial Hospital • ~150 beds • ~1500 employees; ~40 IT dept. • around 300 production servers (~75% VMware)... roughly 400 “apps” • 24x7x365.25 as only healthcare can be • Nagios monitors 113 hosts (includes some devices like switches and UPSes), but 442 services. • Nagios in the gaps 2012
  • 4. Context • Last year: “Monitoring and Test Driven Development” 2012
  • 5. Last year – central idea Monitoring tools are to the sysadmin what testing tools are to the developer. 2012
  • 6. Last year 2012
  • 7. Last year: “Not a CI [Continuous Integration] tool” • Not as coupled to build processes • We don’t build most of our software! • Strengths in flexibility, alerting, production monitoring • Potentially weak in being IT-focused instead of customer-focused. 2012
  • 8. QA and the Manufacturing Metaphor Testing the end product 2012
  • 9. The main idea Can users actually use it? Does it suck? 2012
  • 10. Availability: Can users actually use it? If the database server won’t ping, that’s obviously important. If the network is down, duh. But is your monitoring coverage good enough to prove that your end product is delivering? 2012
  • 11. Performance: Does it suck? Let’s be objective. The user doesn’t care why. 2012
  • 12. It matters “I started to learn that the days when I was happiest were the days with lots of small successes and few small frustrations.” –Joel Spolsky (Overlay text: THIS IS NOT AN IPHONE 5!) 2012
  • 13. Two examples 1. A mission-critical legacy Windows app with a complicated architecture 2. A cooler modern web app 2012
  • 15. Cerner Electronic Medical Record Note: Only fake patients in this presentation.
  • 16. Complex architecture • SAN • AIX • Oracle • Middleware • Another SAN • VMWare • Windows • Citrix • Network • Desktop • Everything x 2 for High Availability
  • 17. A quick definition of Citrix The users are remote-controlling the application running on the Citrix server. The “client” runs on the Citrix server. The user’s actual workstation runs a “thin client”. 90% or more of Cerner customers use Citrix. For comparison: old school “fat client” 2012
  • 19. Complex monitoring • Not just Nagios for monitoring. • Most things are well-monitored by Somebody • Certain things are “lost and found” and Nobody wants to touch them. 2012
  • 20. Can the users actually use it? AutoIt lets us drive it like a user. 1. Log in 2. Search for a (fake) patient 3. Open the patient’s chart. 2012
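
The three steps on this slide boil down to "do each thing a user does, and time it". Here is a minimal sketch of that pattern in plain Node.js. It is not the AutoIt script from the talk: `timeSteps` and `pretendStep` are hypothetical names, and the step bodies are placeholders standing in for real UI automation.

```javascript
// Hypothetical helper: run named async steps in order and time each one.
// If a step fails, stop there and report which step it was, since the
// later steps (search, chart) make no sense without the earlier ones.
async function timeSteps(steps) {
  const timings = {};
  for (const [name, action] of steps) {
    const start = process.hrtime.bigint();
    try {
      await action();
    } catch (err) {
      return { timings, failedStep: name, error: err.message };
    }
    // Convert nanoseconds to seconds.
    timings[name] = Number(process.hrtime.bigint() - start) / 1e9;
  }
  return { timings, failedStep: null, error: null };
}

// Placeholder for real automation: just waits the given milliseconds.
const pretendStep = (ms) => () =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Usage: the step names mirror the slide (log in, search, open chart).
timeSteps([
  ["login", pretendStep(30)],
  ["search", pretendStep(10)],
  ["chart", pretendStep(20)],
]).then((result) => console.log(result));
```

Keeping one timing per step, rather than one total, is what makes it possible to set separate thresholds for login, search, and chart later on.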
  • 21. Do it live. 2012
  • 22. The Live Demo Interactive demo with a test patient in production. 2012
  • 23. Runify "c:\program files\nsclient++\scripts\check_powerchart.exe" fat -w5,1,1 -c40,5,5 • Compiled .exe • Use fat or citrix client • WARNING if more than 5,1,1 seconds for login, search, chart • CRITICAL over 40,5,5 2012
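
To make the triple thresholds concrete, here is a rough sketch of how `-w5,1,1 -c40,5,5` could map three timings to one Nagios state. It assumes "more than" means strictly greater, and that the worst of the three steps decides the result. `parseTriple` and `evaluate` are hypothetical names, not functions from the actual plugin.

```javascript
// Nagios plugin exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL.
const STATES = ["OK", "WARNING", "CRITICAL"];

// Parse "5,1,1" into [5, 1, 1] (seconds for login, search, chart).
function parseTriple(text) {
  const parts = text
    .split(",")
    .map((p) => (p.trim() === "" ? NaN : Number(p)));
  if (parts.length !== 3 || parts.some((n) => !Number.isFinite(n))) {
    throw new Error(`expected three comma-separated numbers, got "${text}"`);
  }
  return parts;
}

// Compare each timing with its own thresholds; the worst step wins.
function evaluate(timings, warn, crit) {
  let worst = 0;
  timings.forEach((seconds, i) => {
    if (seconds > crit[i]) worst = Math.max(worst, 2);
    else if (seconds > warn[i]) worst = Math.max(worst, 1);
  });
  return worst;
}

// Usage: login took 6.2 s, which is over the 5 s warning threshold.
const warn = parseTriple("5,1,1");
const crit = parseTriple("40,5,5");
console.log(STATES[evaluate([6.2, 0.4, 0.9], warn, crit)]); // prints "WARNING"
```

A real plugin would also exit with the numeric state so Nagios can pick it up.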
  • 24. The AutoIt Script Open in SciTE and glance through. Note, Nagios.com has a couple of good articles by Sam Lansing about using AutoIt. Also see the presentation he did just before this one! 2012
  • 25. NSCP config [/settings/external scripts/scripts] check_ctx = scripts\check_powerchart.exe citrix -w20,5,5 -c40,5,5 check_fat = scripts\check_powerchart.exe fat -w20,5,5 -c40,5,5 2012
  • 26. Gotchas: too many running at once Runs a KillHungProcesses() routine at beginning and end to clean them up. Also set max_check_commands=1 in config for this host. 2012
  • 27. Gotchas: Occasional popups AdlibRegister("HandleRelationship") Func HandleRelationship() If WinActive("Assign a Relationship") Then Send("cc{ENTER}") EndIf EndFunc ... AdlibUnRegister("HandleRelationship") 2012
  • 28. Gotchas: Stupid typer I repeatedly mistakenly added lines to the wrong section of the nsclient.ini file. 2012
  • 29. Gotchas: Service vs. Interactive GUI 2012
  • 30. Gotchas: AutoIt output to STDOUT You have to compile to .exe with the “Console” checkbox, or with the /console switch on the command-line aut2exe compiler: "%programfiles%\Autoit3\aut2exe\aut2exe.exe" /in check_powerchart.au3 /console /nopack 2012
  • 31. Gotchas: Annoying phantom icons in the tray Slow too! Solution: LP#TrayIconBuster 2012
  • 32. The results: Citrix 1 week of 15 minute samples 2012
  • 33. The results: Fat client Note the much different Y axis scale 2012
  • 34. Citrix vs Fat: icon to login (zoomed to 1.5 days) 2012
  • 35. Citrix vs Fat: open one patient’s chart 2012
  • 36. The advantages • Continuous monitoring of the end product • Nagios alerting, escalating, whatever. • Control/experiment • Just the two (Citrix vs fat) tell us a lot • We have a historical benchmark to compare anomalies with • We’ve already identified a symptomatic Citrix problem we can treat with a Nagios event handler. 2012
  • 37. Still TODO: • Thresholds don’t yet affect OK/WARNING/CRITICAL • Thresholds in perfdata • Make it less noisy for less ignorable notifications • Make it run more often with tighter thresholds now that we have a benchmark. • Move it to a dedicated VM/user accounts • More experiment groups • No antivirus • No Citrix printers • No VMWare (fat, no Citrix) 2012
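
For the "thresholds in perfdata" item, the standard Nagios perfdata format is `label=value[UOM];warn;crit` after a `|` in the plugin output. Below is a small sketch of what that could look like for the three timings. The labels and the `POWERCHART` prefix are made up for illustration, and `formatPerfdata` is a hypothetical helper.

```javascript
// Build Nagios perfdata: label=value[UOM];warn;crit, space-separated.
// "s" is the unit of measure (seconds).
function formatPerfdata(labels, timings, warn, crit) {
  return labels
    .map((label, i) => `${label}=${timings[i].toFixed(2)}s;${warn[i]};${crit[i]}`)
    .join(" ");
}

// Usage: human-readable text first, then "|", then perfdata.
const labels = ["login", "search", "chart"];
const perfdata = formatPerfdata(labels, [3.2, 0.8, 0.9], [5, 1, 1], [40, 5, 5]);
console.log(`POWERCHART OK | ${perfdata}`);
```

With the thresholds in the perfdata, graphing tools that understand the format can draw the warning and critical lines next to the measured values.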
  • 40. Demo fairbanks.bannerhealth.com/cpoe_ordersets It’s not a fair comparison, just an example.
  • 41. Modern == Ajax Notice there’s not a full page load. How do we tell if users can actually use this? How can we tell if it sucks? 2012
  • 42. Options: Perl/PHP I’ve done this… Symfony functional scripts, for example. They’ll never do JavaScript or AJAX. I worry it won’t match the top level, what the user sees. 2012
  • 43. Options: Selenium • Runs a real browser or browsers, kind of like AutoIt • Other people are using it quite successfully (e.g. Nathan Broderick 2011; Sam Lansing 2012) • I have not gotten into it… images of too many browser windows? • Not very mobile-like 2012
  • 44. PhantomJS Headless webkit As of July 2012 [webkit] has the most market share of any layout engine at over 40% of the browser market share according to StatCounter. -- http://en.wikipedia.org/wiki/WebKit 2012
  • 45. PhantomJS • Allows you to write client scripts in JavaScript, executed by its own fast V8 engine • “Always bet on JS” – Brendan Eich, original author of JavaScript • JavaScript is funny in that most people don’t actually learn it before they start using it. – Douglas Crockford (paraphrase), author of JavaScript: The Good Parts 2012
  • 46. CasperJS Nicer JS API on top of PhantomJS 2012
  • 47. SHOW ME THE CODEZ Quick Walkthrough in Aquamacs or Gist Demo failure + screenshot 2012
  • 48. Works on Nagios XI host • A little bit heavy (it is firing up a WebKit) • Zombiejs looks promising as a lighter (pure JS in Node) JS-capable headless browser • Quick CasperJS install instructions on my blog, n8v.enteuxis.org 2012
  • 49. The results – 10 minute interval Boring… note total amplitude. 2012
  • 50. The benefits • Ongoing assurance that it is basically usable • If it’s usable, all its layers must be working. • Benchmarked performance so we can evaluate the end result of changes 2012
  • 51. Still TODO • Check timestamp at bottom of page • Run the same checks on an internal development version • Upgrade jQuery UI and see if performance changes • Check from an Internet host 2012
  • 52. Main idea again Can the users actually use it? Does it suck? 2012
  • 53. Concluding Questions/comments JS Sample Code at gist.github.com/n8v Email me if you want the AutoIt code. 2012
  • 54. Sneak peek ./check_facebook_friends.js -u nathan.vonnahme -w @202 -c @203 Come to my other talk, “Writing Custom Nagios Plugins” Friday morning to see how it works. 2012
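
The `@` in `-w @202 -c @203` is part of the standard Nagios plugin range syntax: a plain `N` alerts when the value is outside 0..N, and a leading `@` flips that to alert when the value is inside the range. Here is a rough sketch of that rule, assuming the plugin follows the standard syntax. `parseRange` and `alerts` are hypothetical names, not the plugin's own code.

```javascript
// Parse a Nagios threshold range: [@]start:end
//   "10"    alert if value < 0 or > 10
//   "10:"   alert if value < 10
//   "~:10"  alert if value > 10
//   "@10"   alert if 0 <= value <= 10 (inside the range)
function parseRange(text) {
  const inside = text.startsWith("@");
  const spec = inside ? text.slice(1) : text;
  let start = 0;
  let end = Infinity;
  if (spec.includes(":")) {
    const [lo, hi] = spec.split(":");
    start = lo === "~" ? -Infinity : Number(lo);
    end = hi === "" ? Infinity : Number(hi);
  } else {
    end = Number(spec);
  }
  if (spec === "" || Number.isNaN(start) || Number.isNaN(end) || start > end) {
    throw new Error(`bad range "${text}"`);
  }
  return { start, end, inside };
}

// True when the value should raise an alert for this range.
function alerts(range, value) {
  const within = value >= range.start && value <= range.end;
  return range.inside ? within : !within;
}

// Usage: with "@203", 203 or fewer alerts and 204 does not.
console.log(alerts(parseRange("@203"), 203)); // prints true
console.log(alerts(parseRange("@203"), 204)); // prints false
```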

Editor's notes

  1. Photos: http://www.newsminer.com/pages/features_our_town/
  2. Photo: http://www.uaf.edu/files/news/featured/03/tvc/index.html
  3. Image: http://availabilityadvisor.com/2012/01/10/how-much-availability-is-enough/
  4. Les Claypool: Primus performance sucks! Image: http://www.thehazardreport.com/2012/09/primus-sucks-at-club-nokia-part-ii-2010.html
  5. Quote from “Controlling Your Environment Makes You Happy” http://www.joelonsoftware.com/uibook/chapters/fog0000000057.html ----- Meeting Notes (9/24/12 15:05) ----- THIS IS NOT AN IPHONE 5!
  6. Image: http://www.couriermail.com.au/spike/columnists/blame-game-in-full-swing-amid-election-2010-fallout/story-e6frerex-1225911543729
  7. Img http://axtonlennox111.deviantart.com/art/1-2-3-NOT-IT-dam-242775890
  8. AdlibRegister basically runs every quarter second until unregistered.
  9. Img http://livelovelaugh-lace1013.blogspot.com/2008_10_01_archive.html
  10. This is why the nagios.com article uses notepad to write output to a file… I guess it works!
  11. This Citrix performance is what we actually deliver to our users. Note how much more erratic it is too.
  12. Much more similar Y scale: the performance doesn’t change that much. This makes the spikes much more interesting. What was going on with the backend on Thursday?
  13. ----- Meeting Notes (9/24/12 15:05) ----- need darker background