Automatic Remediation &
Superfluous Ticket Elimination
Presentation Outline
Who’s That Guy?
Eliminating Superfluous Tickets
Automatic Remediation Building
Blocks & Example Cases
Actionable Intelligence
Additional Resources &
Brief Q&A
Who’s That Guy?
• Brian Dagan from
• Based in Chantilly, VA
• Approximately 5,400 endpoints
Superfluous Tickets
/so͞oˈpərflo͞oəs/
• More than is sufficient or required
• Unnecessary or needless
Source: http://goo.gl/SYoDK
Superfluous Tickets Eliminated
2011-06-10 through 2013-04-01
$45,000/year
(Source: http://www.indeed.com)
50 workweeks/year
40 hours/week
2,000 hours/year
$22.50/hour
75% efficiency (25% Reddit)
$30/hour
5 minutes/ticket
x 189 tickets/day
15.75 hours/day
= 2+ Engineers (~5,000 Agents)
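The arithmetic on this slide can be checked with a short sketch. The variable names are illustrative, and dividing by six productive hours per day (8 hours at 75% efficiency) is one assumed reading of how the slide arrives at "2+ Engineers".

```python
# Figures taken from the slide; names are illustrative only.
salary_per_year = 45_000           # dollars
hours_per_year = 50 * 40           # 50 workweeks x 40 hours = 2,000 hours
efficiency = 0.75                  # share of paid time spent on real work

nominal_rate = salary_per_year / hours_per_year    # dollars per paid hour
effective_rate = nominal_rate / efficiency         # dollars per productive hour

minutes_per_ticket = 5
tickets_per_day = 189
ticket_hours_per_day = minutes_per_ticket * tickets_per_day / 60

# Assumption: an engineer delivers 8 * 0.75 = 6 productive hours per day.
engineers_needed = ticket_hours_per_day / (8 * efficiency)

print(f"${nominal_rate:.2f}/hour nominal, ${effective_rate:.2f}/hour effective")
print(f"{ticket_hours_per_day:.2f} ticket hours/day, {engineers_needed:.3f} engineers")
```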
Eliminating Superfluous Tickets
• Rebuilt Event Sets, Monitor Sets
• Implemented Policy Management
• Wrote Automatic Remediation Agent
Procedures for:
– Hard disk errors
– Low drive space
– Service stoppages
– Anti-virus removal & installation
– Inheritance-based Policy application
Eliminating Superfluous Tickets
Rebuilding Event Sets – Best Practices
• Name your Event Sets for the severity
– Systems Management Pack does this:
Eliminating Superfluous Tickets
Rebuilding Event Sets – Best Practices
• Use EID, Source & Description if possible
– Systems Management Pack does this:
Eliminating Superfluous Tickets
Rebuilding Event Sets – Best Practices
• Use EID, Source & Description if possible
– Systems Management Pack also… doesn’t:
– Be aware of why this is a problem!
Eliminating Superfluous Tickets
Rebuilding Event Sets – Best Practices
• What’s worth waking up an Engineer?
– “Critical” priority
• What’s going to ruin someone’s day?
– “High” priority
• What needs to be addressed in a day or so?
– “Monitoring” priority
• What’s going to get you sued?
– “Auditing” priority
Eliminating Superfluous Tickets
Rebuilding Event Sets – Best Practices
• Removing superfluous Events:
– Which ticket queue was the Alert in?
• Locate the Event Set…
– What was the Event ID?
• Locate the exact Event in the Event Set…
– What’s *unique* about the Alert?
• Modify the Event (or add another Event with
specific info to match the superfluous alert
ticket, and set it to Ignore):
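As a rough sketch of the idea (not how the VSA stores or evaluates Event Sets), an "Ignore" entry can be modelled as a more specific rule that overrides a broader alerting rule. The `EventRule` class, the `should_alert` helper, the rule list and the "ignore wins" ordering are all hypothetical illustrations.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EventRule:
    """Hypothetical stand-in for one line of an Event Set."""
    ignore: bool
    event_id: Optional[int] = None              # None matches any Event ID
    source: Optional[str] = None                # None matches any source
    description_contains: Optional[str] = None  # None matches any description

    def matches(self, event_id: int, source: str, description: str) -> bool:
        if self.event_id is not None and self.event_id != event_id:
            return False
        if self.source is not None and self.source.lower() != source.lower():
            return False
        if (self.description_contains is not None
                and self.description_contains.lower() not in description.lower()):
            return False
        return True


def should_alert(rules: List[EventRule], event_id: int,
                 source: str, description: str) -> bool:
    """Alert only if some rule matches and no matching rule is an Ignore."""
    hits = [r for r in rules if r.matches(event_id, source, description)]
    if any(r.ignore for r in hits):
        return False
    return bool(hits)


# Hypothetical rules: alert on all "disk" events, but ignore Event ID 51
# when the description names one specific (made-up) device.
RULES = [
    EventRule(ignore=False, source="disk"),
    EventRule(ignore=True, event_id=51, source="disk",
              description_contains=r"\Device\Harddisk9"),
]
```

The point of matching on EID, Source and Description together is that the Ignore entry stays narrow: other disk events from the same machine still raise tickets.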
Eliminating Superfluous Tickets
Are you pondering what I’m pondering, Pinky?
• Before you add an Event… think!
Can I do something
more intelligent
with this alert?
Automatic Remediation Case #1
Are hard disk errors actually legitimate problems?
Use the “Run Script” option first:
Note: For automatic remediation to take place, the machine must be online to
use “Run Script,” so do not use this for “Agent Offline” alerts.
Don’t get SMART with me!
I will pull this car over right now!
S.M.A.R.T. (Self-Monitoring, Analysis and Reporting
Technology), often written as SMART, is a monitoring system
for computer hard disk drives that detects and reports on
various indicators of reliability, in the hope of anticipating
failures. Reference: http://en.wikipedia.org/wiki/S.M.A.R.T.
Automatic Remediation Case #1
Are hard disk errors actually legitimate problems?
Is the disk fixed or removable?
• We don’t care about Jim Bob’s iPod…
Is SMART enabled on the disk?
• If not, can we enable it?
How’s the disk SMART health?
• If it’s “PASSED,” ignore the Event Log!
Use The Variables, Luke!
A true Jedi will always RTFM…
Event Log Alerts populate variables when a
particular Event is encountered
http://help.kaseya.com/WebHelp/EN/VSA/6030000/index.htm#4853.htm
Monitor Alarms populate variables when
the Counter/Service/Process goes “out of
acceptable operating range”
http://help.kaseya.com/WebHelp/EN/VSA/6030000/index.htm#1936.htm
Using Event Log Alert Variables
The data is there… let’s use it!
Within an Email    Within a Procedure    Description
<at>               #at#                  alert time
<cg>               #cg#                  event category
<cn>               #cn#                  computer name
<db-view.column>   not available         Include a view.column from the database. For example, to include the computer name of the machine generating the alert in an email, use <db-vMachine.ComputerName>
<ed>               #ed#                  event description
<ei>               #ei#                  event id
<es>               #es#                  event source
<et>               #et#                  event time
<eu>               #eu#                  event user
<ev>               #ev#                  event set name
<gr>               #gr#                  group ID
<id>               #id#                  machine ID
<lt>               #lt#                  log type (Application, Security, System)
<tp>               #tp#                  event type (Error, Warning, Informational, Success Audit, or Failure Audit)
Note: #subject# and #body# are also available but wouldn’t fit on this slide
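To make the two placeholder styles concrete, here is a minimal sketch that mimics the substitution outside the VSA. The VSA performs this replacement itself; the helper names, the sample values and the "leave unknown placeholders alone" behaviour are assumptions made for illustration.

```python
import re
from typing import Dict


def render_email(template: str, values: Dict[str, str]) -> str:
    """Fill <xx> placeholders (email style); unknown ones stay untouched."""
    return re.sub(r"<([a-z]{2})>",
                  lambda m: values.get(m.group(1), m.group(0)),
                  template)


def render_procedure(template: str, values: Dict[str, str]) -> str:
    """Fill #xx# placeholders (Agent Procedure style); unknown ones stay untouched."""
    return re.sub(r"#([a-z]{2})#",
                  lambda m: values.get(m.group(1), m.group(0)),
                  template)


# Made-up sample values keyed by the two-letter codes in the table.
sample = {"cn": "FILESRV01", "ei": "51", "es": "disk", "lt": "System"}

print(render_email("Event <ei> from <es> on <cn> (<lt> log)", sample))
print(render_procedure("Event #ei# from #es# on #cn#", sample))
```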
Using Monitor Alarm Variables
The data is there… let’s use it!
Within an Email    Within a Procedure    Description
<ad>               #ad#                  alarm duration
<ao>               #ao#                  alarm operator
<at>               #at#                  alert time
<av>               #av#                  alarm threshold
<cg>               #cg#                  event category
<db-view.column>   not available         Include a view.column from the database. For example, to include the computer name of the machine generating the alert in an email, use <db-vMachine.ComputerName>
<dv>               #dv#                  SNMP device name
<gr>               #gr#                  group ID
<id>               #id#                  machine ID
<ln>               #ln#                  monitoring log object name
<lo>               #lo#                  monitoring log object type: counter, process, object
<lv>               #lv#                  monitoring log value
<mn>               #mn#                  monitor set name
not available      #subject#             subject text of the email message, if an email was sent in response to an alert
not available      #body#                body text of the email message, if an email was sent in response to an alert
Automatic Remediation Case #1
Are hard disk errors actually legitimate problems?
• Make an Event Set that catches disk
events only (excluding tape errors)
• Set the Re-Arm time to an hour (recommended)
Phase 1: Rebuild Event Sets
Phase 2
Phase 3: Profit
Automatic Remediation Case #1
Are hard disk errors actually legitimate problems?
Let’s make some Agent Procedures!
• Use a “parent” procedure that calls
“child” procedures to better pinpoint
exceptions & iterate through all drives
• Use the Agent Procedure Log to log key
data points for later troubleshooting
• Perform output validation to confirm all
commands and third-party utilities being
employed produce expected output
Automatic Remediation Case #1
Assembling your utility belt
• SMART Health Checker:
– Checks SMART health of a fixed disk
– http://smartmontools.sourceforge.net
• Head.exe & Tail.exe
– Text & file manipulation
– http://unxutils.sourceforge.net/
• Built-in Windows utilities like DiskPart:
Automatic Remediation Case #1
Best Practices - Re-using “generic” procedures
Initialization script:
Automatic Remediation Case #1
Are hard disk errors actually legitimate problems?
What can we “key off of” in these events?
Automatic Remediation Case #1
The ciiiiircle of liiiife…
Errors inconsistently identify the drive, so
we’ll check all fixed disks in a loop:
1) Query disk #__ for type
2) Interpret results of query
– Unexpected return value? Fire a ticket!
– Removable disk? Skip to the next one…
3) Fixed disk? Check SMART
4) Interpret SMART results
– Can’t enable SMART? Drive failure imminent? Fire a ticket!
– Unexpected return value? Fire a ticket!
5) Increment disk counter
– More than 10 disks? Fire a ticket!
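The loop can be sketched in a few lines of Python. This is an illustration of the control flow only, not the Agent Procedures themselves: `check_disks`, the status strings and the two callables that stand in for the DiskPart and smartctl child procedures are all hypothetical.

```python
from typing import Callable, List

MAX_DISKS = 10  # the slides check disks 0-9 and alert on anything beyond


def check_disks(present: List[int],
                disk_type: Callable[[int], str],
                smart_status: Callable[[int], str]) -> List[str]:
    """Return the tickets to fire; an empty list means nothing actionable."""
    tickets: List[str] = []
    if any(n >= MAX_DISKS for n in present):
        tickets.append("more than 10 disks present")
    for n in range(MAX_DISKS):          # the disk counter
        if n not in present:
            continue
        kind = disk_type(n)             # query disk for type
        if kind == "removable":
            continue                    # skip to the next one
        if kind != "fixed":
            tickets.append(f"disk {n}: unexpected disk type {kind!r}")
            continue
        status = smart_status(n)        # check SMART, then interpret
        if status == "passed":
            continue                    # healthy: superfluous ticket avoided
        if status in ("failing", "cannot_enable"):
            tickets.append(f"disk {n}: SMART status {status}")
        else:
            tickets.append(f"disk {n}: unexpected SMART result {status!r}")
    return tickets


# Example with canned answers instead of real utilities.
types = {0: "fixed", 1: "removable", 2: "fixed"}
health = {0: "passed", 2: "failing"}
print(check_disks([0, 1, 2], types.get, health.get))
```

Passing the queries in as callables keeps each step testable on its own, which mirrors the parent/child procedure split recommended earlier.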
Automatic Remediation Case #1
Oppan Agent Procedure Style!
Automatic Remediation Case #1
1) Turn local Event Log Alert variables into
Global variables:
2) Fire the “Initialization” Agent Procedure
3) Fire the “Copy Disk Utilities To Agent”
Agent Procedure
Automatic Remediation Case #1
Best Practices – Re-Using Common Agent Procedures
The “Initialization” Agent Procedure:
• Defines alert e-mail addresses:
• Copies over a common suite of utilities:
• Initializes the command output files:
Automatic Remediation Case #1
1) Defines abbreviated paths to the
DiskPart script files:
2) Copies the DiskPart script files,
smartctl.exe (x86 or x64) and
diskpart.exe (if Win2K) to the machine
Automatic Remediation Case #1
1) Uses DiskPart.exe to request a list of
physical disks on the machine:
2) Starts the Loop that checks disks 0-9 to
see if they’re present in above output:
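A minimal sketch of the "is disk N present?" check, assuming the DiskPart output was captured to text by running a script that contains `list disk`. The sample listing below is typical `list disk` output rather than output taken from the presentation, and `present_disks` is a hypothetical helper.

```python
import re
from typing import List

# Typical "list disk" output; real column widths and sizes will vary.
SAMPLE_LISTING = """\
  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online          465 GB      0 B
  Disk 1    Online         7639 MB      0 B
"""


def present_disks(listing: str) -> List[int]:
    """Return the disk numbers that appear as rows in the listing."""
    rows = re.finditer(r"^[ \t]*Disk[ \t]+(\d+)[ \t]", listing, re.MULTILINE)
    return sorted(int(m.group(1)) for m in rows)


print(present_disks(SAMPLE_LISTING))
```

Requiring digits after "Disk" is a small piece of output validation: the "Disk ###" header row and any error text are ignored instead of being mistaken for a disk.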
Automatic Remediation Case #1
1) If there are 10+ physical
disks, procedure will alert the
global:monitoringAlertEMailAddress:
2) Fires the next Agent Procedure which
configures the query variables:
Automatic Remediation Case #1
1) Sets the disk count,
smartctl.exe query letter (uses letters
instead of disk numbers, no idea why), and
path to the DiskPart script for that disk:
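On the letters: as far as I know, smartmontools on Windows accepts Linux-style device names, where `/dev/sda` is physical drive 0, `/dev/sdb` is drive 1, and so on. Under that assumption, the disk-number-to-letter step looks like this; `smartctl_device` is a hypothetical helper name.

```python
import string


def smartctl_device(disk_number: int) -> str:
    """Map a DiskPart disk number to an assumed smartctl device name."""
    if not 0 <= disk_number < len(string.ascii_lowercase):
        raise ValueError(f"disk number out of range: {disk_number}")
    return "/dev/sd" + string.ascii_lowercase[disk_number]


print([smartctl_device(n) for n in range(3)])
```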
Automatic Remediation Case #1
1) Checks the disk type
and proceeds with the next Agent
Procedure only if it’s not a USB disk
2) If it is a USB disk, we brag about having
prevented a Superfluous Ticket:
Automatic Remediation Case #1
1) Checks if SMART is enabled
a. Attempts to enable it if not (next slide)
2) Formulates the SMART health check:
3) Runs the SMART health check command
and calls the next Agent Procedure to
interpret the output:
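A sketch of how the two smartctl invocations might be assembled and run. `-s on` (enable SMART) and `-H` (health check) are standard smartctl options; the executable path, the helper names and the idea of capturing output through Python rather than an Agent Procedure output file are assumptions.

```python
import subprocess
from typing import List

SMARTCTL = "smartctl.exe"  # assumed to be on PATH or in the working directory


def enable_command(device: str) -> List[str]:
    """Command line that asks the drive to turn SMART on."""
    return [SMARTCTL, "-s", "on", device]


def health_command(device: str) -> List[str]:
    """Command line that requests the overall SMART health assessment."""
    return [SMARTCTL, "-H", device]


def run(command: List[str]) -> str:
    """Run a command and return its text output for later interpretation."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout


print(" ".join(health_command("/dev/sda")))
```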
Automatic Remediation Case #1
1) Attempts to enable SMART using
smartctl.exe:
2) Failure to enable SMART is accounted for in
the next Agent Procedure
Automatic Remediation Case #1
1) Interprets the output of
smartctl.exe and:
a. …alerts if SMART can’t be enabled on a disk
with issues (next slide) – or –
b. …boasts if SMART reports that the drive has
a “healthy” SMART status (following slide)
2) Accounts for potential failure of
smartctl.exe by sending failures to
global:monitoringAlertEMailAddress
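Interpreting the output can be sketched as a small classifier. The two result lines matched below are the ones smartctl typically prints for ATA and SCSI drives; the return values and the `interpret_health` name are hypothetical, and anything that does not match is treated as an unexpected result so that it raises an alert rather than passing silently.

```python
import re


def interpret_health(output: str) -> str:
    """Classify smartctl -H output as 'passed', 'failing' or 'unexpected'."""
    ata = re.search(
        r"SMART overall-health self-assessment test result:\s*(\S+)", output)
    if ata:
        return "passed" if ata.group(1) == "PASSED" else "failing"
    scsi = re.search(r"SMART Health Status:\s*(\S+)", output)
    if scsi:
        return "passed" if scsi.group(1) == "OK" else "failing"
    return "unexpected"


print(interpret_health(
    "SMART overall-health self-assessment test result: PASSED"))
```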
Automatic Remediation Case #1
Are hard disk errors actually legitimate problems?
• If SMART can be enabled, is the disk
healthy? Or is failure imminent?
Are We There Yet?
Who’s That Guy?
Eliminating Superfluous Tickets
Automatic Remediation Building
Blocks & Example Cases
Actionable Intelligence
Additional Resources &
Brief Q&A
Actionable Intelligence
WWYAFLSDEDWTT?
What Would Your Average Front-Line
Service Desk Engineer Do With This Ticket?
Eliminating Superfluous Tickets
Rebuilding Event Sets – Important Questions
Are all hard disk
errors actually
legitimate
problems?
Does that stopped
Service need actual
Human intervention
or just a swift kick?
Is the disk that’s low on drive space an
iPod, iPad or pagefile-only volume?
Do I need to run an
“Update Lists By Scan”
on this server to catch
recently removed or
deactivated Services?
What is the answer to the Ultimate Question
of Life, The Universe and Everything?
Has the client told
us to not monitor
something?
Do it now!
Seriously… Are We There Yet?
Who’s That Guy?
Eliminating Superfluous Tickets
Automatic Remediation Building
Blocks & Example Cases
Actionable Intelligence
Additional Resources &
Brief Q&A
Using ALARM/Event Variables
The data is there… let’s use it!
Reference:
http://help.kaseya.com/WebHelp/EN/VSA/6030000/index.htm#1936.htm
http://help.kaseya.com/WebHelp/EN/VSA/6030000/index.htm#4853.htm
All presentation materials are available at:
There is a theory which states that
if ever anyone discovers exactly
what the Universe is for and why it
is here, it will instantly disappear
and be replaced by something even
more bizarre and inexplicable.
There is another theory, which
states that this has already
happened.

Kaseya Connect 2013: Automatic Remediation & Superfluous Ticket Elimination

  • 2.
  • 3. Presentation Outline Who’s That Guy? Eliminating Superfluous Tickets Automatic Remediation Building Blocks & Example Cases Actionable Intelligence Additional Resources & Brief Q&A
  • 4. Who’s That Guy? • Brian Dagan from • Based in Chantilly, VA • Approximately 5,400 endpoints
  • 5.
  • 6.
  • 7. Presentation Outline Who’s That Guy? Eliminating Superfluous Tickets Automatic Remediation Building Blocks & Example Cases Actionable Intelligence Additional Resources & Brief Q&A
  • 8. Superfluous Tickets /so͞oˈpər-fləəs/ • More than is sufficient or required • Unnecessary or needless
  • 11. $45,000/year (Source: http://www.indeed.com) 50 workweeks/year 40 hours/week 2,000 hours/year $22.50/hour 75% efficiency (25% Reddit ) $30/hour 5 minutes/ticket x 189 tickets/day 15.75 hours/day = 2+ Engineers (~5,000 Agents)
  • 12. Eliminating Superfluous Tickets • Rebuilt Event Sets, Monitor Sets • Implemented Policy Management • Wrote Automatic Remediation Agent Procedures for: – Hard disk errors – Low drive space – Service stoppages – Anti-virus removal & installation – Inheritance-based Policy application
  • 13. Eliminating Superfluous Tickets Rebuilding Event Sets – Best Practices • Name your Event Sets for the severity – Systems Management Pack does this:
  • 14. Eliminating Superfluous Tickets Rebuilding Event Sets – Best Practices • Use EID, Source & Description if possible – Systems Management Pack does this:
  • 15. Eliminating Superfluous Tickets Rebuilding Event Sets – Best Practices • Use EID, Source & Description if possible – Systems Management Pack also… doesn’t: – Be aware of why this is a problem!
  • 16. Eliminating Superfluous Tickets Rebuilding Event Sets – Best Practices • What’s worth waking up an Engineer? – “Critical” priority • What’s going to ruin someone’s day? – “High” priority • What needs addressed in a day or so? – “Monitoring” priority • What’s going to get you sued? – “Auditing” priority
  • 17. Eliminating Superfluous Tickets Rebuilding Event Sets – Best Practices • Removing superfluous Events: – Which ticket queue was the Alert in? • Locate the Event Set… – What was the Event ID? • Locate the exact Event in the Event Set… – What’s *unique* about the Alert • Modify the Event (or add another Event with specific info to match the superfluous alert ticket, and set it to Ignore):
  • 18. Eliminating Superfluous Tickets Are you pondering what I’m pondering, Pinky? • Before you add an Event… think! Can I do something more intelligent with this alert?
  • 19. Presentation Outline Who’s That Guy? Eliminating Superfluous Tickets Automatic Remediation Building Blocks & Example Cases Actionable Intelligence Additional Resources & Brief Q&A
  • 20. Automatic Remediation Case #1 Are hard disk errors actually legitimate problems? Use the “Run Script” option first: Note: For automatic remediation to take place, the machine must be online to use “Run Script,” so do not use this for “Agent Offline” alerts 
  • 21.
  • 22. Don’t get SMART with me! I will pull this car over right now! S.M.A.R.T. = Self-Monitoring, Analysis and Reporting Technology; often written as SMART, is a monitoring system for computer hard disk drives to detect and report on various indicators of reliability, in the hope of anticipating failures. Reference: http://en.wikipedia.org/wiki/S.M.A.R.T.
  • 23. Automatic Remediation Case #1 Are hard disk errors actually legitimate problems? Is the disk fixed or removable? • We don’t care about Jim Bob’s iPod… Is SMART enabled on the disk? • If not, can we enable it? How’s the disk SMART health? • If it’s “PASSED,” ignore the Event Log!
  • 24. Use The Variables, Luke! A true Jedi will always RTFM… Event Log Alerts populate variables when a particular Event is encountered http://help.kaseya.com/WebHelp/EN/VSA/6030000/index.htm#4853.htm Monitor Alarms populate variables when the Counter/Service/Process goes “out of acceptable operating range” http://help.kaseya.com/WebHelp/EN/VSA/6030000/index.htm#1936.htm
  • 25. Using Event Log Alert Variables The data is there… let’s use it! Within an Email Within a Procedure Description <at> #at# alert time <cg> #cg# event category <cn> #cn# computer name <db-view.column> not available Include a view.column from the database. For example, to include the computer name of the machine generating the alert in an email, use <db- vMachine.ComputerName> <ed> #ed# event description <ei> #ei# event id <es> #es# event source <et> #et# event time <eu> #eu# event user <ev> #ev# event set name <gr> #gr# group ID <id> #id# machine ID <lt> #lt# log type (Application, Security, System) <tp> #tp# event type - (Error, Warning, Informational, Success Audit, or Failure Audit) Note: #subject# and #body# are also available but wouldn’t fit on this slide
  • 26. Using Monitor Alarm Variables The data is there… let’s use it! Within an Email Within a Procedure Description <ad> #ad# alarm duration <ao> #ao# alarm operator <at> #at# alert time <av> #av# alarm threshold <cg> #cg# event category <db-view.column> not available Include a view.column from the database. For example, to include the computer name of the machine generating the alert in an email, use <db- vMachine.ComputerName> <dv> #dv# SNMP device name <gr> #gr# group ID <id> #id# machine ID <ln> #ln# monitoring log object name <lo> #lo# monitoring log object type: counter, process, object <lv> #lv# monitoring log value <mn> #mn# monitor set name not available #subject# subject text of the email message, if an email was sent in response to an alert not available #body# body text of the email message, if an email was sent in response to an alert
  • 27. Automatic Remediation Case #1 Are hard disk errors actually legitimate problems? • Make an Event Set that catches disk events only (excluding tape errors) • Set the Re-Arm time to an hour (recommended)
  • 28. Phase Phase Phase1 2 3 Rebuild Event Sets Profit
  • 29. Automatic Remediation Case #1 Are hard disk errors actually legitimate problems?
  • 30. Automatic Remediation Case #1 Are hard disk errors actually legitimate problems? Let’s make some Agent Procedures! • Use a “parent” procedure that calls “child” procedures to better pinpoint exceptions & iterate through all drives • Use the Agent Procedure Log to log key data points for later troubleshooting • Perform output validation to confirm all commands and third-party utilities being employed produce expected output
  • 31. Automatic Remediation Case #1 Assembling your utility belt • SMART Health Checker: – Checks SMART health of a fixed disk – http://smartmontools.sourceforge.net • Head.exe & Tail.exe – Text & file manipulation – http://unxutils.sourceforge.net/ • Built-in Windows utilities like DiskPart:
  • 32. Automatic Remediation Case #1 Best Practices - Re-using “generic” procedures Initialization script:
  • 33. Automatic Remediation Case #1 Are hard disk errors actually legitimate problems? What can we “key off of” in these events?
  • 34. Automatic Remediation Case #1 The ciiiiircle of liiiife… Errors inconsistently identify the drive, so we’ll check all fixed disks in a loop: Can’t enable SMART? Drive failure imminent? Fire a ticket! Unexpected return value? Fire a ticket! Unexpected return value? Fire a ticket! More than 10 disks? Fire a ticket! Removable disk? Skip to the next one… Query disk #__ for type Interpret results of query Fixed disk? Check SMART Interpret SMART results Increment disk counter
  • 35. Automatic Remediation Case #1 Oppan Agent Procedure Style!
  • 36. Automatic Remediation Case #1 1) Turn local Event Log Alert variables into Global variables: 2) Fire the “Initialization” Agent Procedure 3) Fire the “Copy Disk Utilities To Agent” Agent Procedure
  • 37. Automatic Remediation Case #1 Best Practices – Re-Using Common Agent Procedures The “Initialization” Agent Procedure: • Defines alert e-mail addresses: • Copies over a common suite of utilities: • Initializes the command output files:
  • 38. Automatic Remediation Case #1 1) Defines abbreviated paths to the DiskPart script files: 2) Copies the DiskPart script files, smartctl.exe (x86 or x64) and diskpart.exe (if Win2K) to the machine
  • 39. Automatic Remediation Case #1 1) Uses DiskPart.exe to request a list of physical disks on the machine: 2) Starts the Loop that checks disks 0-9 to see if they’re present in above output:
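To illustrate the "is disk N present in the output?" check described above, here is a small Python sketch that pulls disk numbers out of DiskPart `list disk` text. The sample listing is illustrative only (real output varies by Windows version and language), and `present_disks` is a name invented for this example.

```python
import re

# Illustrative sample of "list disk" output; real output may differ.
SAMPLE_LIST_DISK = """\
  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online          465 GB      0 B
  Disk 1    Online          931 GB      0 B        *
"""


def present_disks(listing, limit=10):
    """Return the disk numbers below `limit` that appear in the listing."""
    found = {
        int(m.group(1))
        for m in re.finditer(r"^\s*Disk\s+(\d+)\s", listing, re.MULTILINE)
    }
    # Check 0..limit-1 in order, as the loop in the slides does for disks 0-9.
    return [n for n in range(limit) if n in found]
```

The header row `Disk ###` is ignored because the pattern requires digits after the word `Disk`.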
  • 40. Automatic Remediation Case #1 1) If there are 10+ physical disks, procedure will alert the global:monitoringAlertEMailAddress: 2) Fires the next Agent Procedure which configures the query variables:
  • 41. Automatic Remediation Case #1 1) Sets the disk count, smartctl.exe query letter (uses letters instead of disk numbers, no idea why), and path to the DiskPart script for that disk:
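On the letters-versus-numbers point: smartmontools on Windows follows the Linux-style `/dev/sdX` naming, where `/dev/sda` refers to the first physical drive, `/dev/sdb` to the second, and so on. A minimal sketch of that mapping, with a function name invented for this example:

```python
import string


def smartctl_device(disk_number):
    """Map a Windows physical disk number to a smartctl /dev/sdX name."""
    if not 0 <= disk_number < 26:
        raise ValueError("only disks 0-25 map to a single letter")
    return "/dev/sd" + string.ascii_lowercase[disk_number]
```

For the disks 0-9 covered by the loop in the slides, this yields `/dev/sda` through `/dev/sdj`.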
  • 42. Automatic Remediation Case #1 1) Checks the disk type and proceeds with the next Agent Procedure only if it’s not a USB disk 2) If it is a USB disk, we brag about having prevented a Superfluous Ticket:
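One way to picture the USB check is to read the `Type` line from DiskPart `detail disk` output. The Python sketch below assumes the output contains a line such as `Type   : USB`; the sample text and both function names are illustrative assumptions, not taken from the presentation.

```python
import re

# Illustrative "detail disk" excerpt; the exact layout is an assumption.
SAMPLE_DETAIL_DISK = """\
Generic Flash Disk USB Device
Disk ID: 00000000
Type   : USB
Status : Online
"""


def disk_bus_type(detail_output):
    """Return the bus type from the 'Type :' line in upper case, or None."""
    m = re.search(r"^\s*Type\s*:\s*(\S+)", detail_output, re.MULTILINE)
    return m.group(1).upper() if m else None


def should_check_smart(detail_output):
    """Return True only when the bus type was found and is not USB."""
    bus = disk_bus_type(detail_output)
    return bus is not None and bus != "USB"
```

A missing `Type` line returns `None`, which corresponds to the "unexpected return value" branch in the loop rather than to a silent skip.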
  • 44. Automatic Remediation Case #1 1) Checks if SMART is enabled a. Attempts to enable it if not (next slide) 2) Formulates the SMART health check: 3) Runs the SMART health check command and calls the next Agent Procedure to interpret the output:
  • 45. Automatic Remediation Case #1 1) Attempts to enable SMART using smartctl.exe: 2) Failure to enable SMART is accounted for in the next Agent Procedure
  • 46. Automatic Remediation Case #1 1) Interprets the output of smartctl.exe and: a. …alerts if SMART can’t be enabled on a disk with issues (next slide) –or – b. …boasts if SMART reports that the drive has a “healthy” SMART status (following slide) 2) Accounts for potential failure of smartctl.exe by sending failures to global:monitoringAlertEMailAddress
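The interpretation step can be sketched as a small classifier over the text that `smartctl -H` prints. The marker strings below are the ones smartctl commonly emits for ATA and SCSI disks, but treat them as assumptions to verify against the smartmontools version in use; the function name and return values are invented for this example.

```python
def classify_smart_output(output):
    """Classify smartctl health-check text into one of four outcomes.

    Returns "healthy", "failing", "cannot_enable", or "unexpected".
    The marker strings are assumptions about typical smartctl output.
    """
    text = output.upper()
    if "SMART DISABLED" in text:
        return "cannot_enable"  # still disabled after the enable attempt
    if "TEST RESULT: PASSED" in text or "SMART HEALTH STATUS: OK" in text:
        return "healthy"
    if "TEST RESULT: FAILED" in text:
        return "failing"
    return "unexpected"  # empty or unrecognized output: alert a human
```

Anything that is not positively recognized falls through to `"unexpected"`, which matches the slide's approach of sending smartctl.exe failures to the monitoring alert address instead of assuming the disk is fine.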
  • 49. Automatic Remediation Case #1 Are hard disk errors actually legitimate problems? • If SMART can be enabled, is the disk healthy? Or is failure imminent?
  • 50. Are We There Yet? Who’s That Guy? Eliminating Superfluous Tickets Automatic Remediation Building Blocks & Example Cases Actionable Intelligence Additional Resources & Brief Q&A
  • 51. Actionable Intelligence WWYAFLSDEDWTT? What Would Your Average Front-Line Service Desk Engineer Do With This Ticket?
  • 54. Eliminating Superfluous Tickets Rebuilding Event Sets – Important Questions Are all hard disk errors actually legitimate problems? Does that stopped Service need actual Human intervention or just a swift kick? Is the disk that’s low on drive space an iPod, iPad or pagefile-only volume? Do I need to run an “Update Lists By Scan” on this server to catch recently removed or deactivated Services? What is the answer to the Ultimate Question of Life, The Universe and Everything? Has the client told us to not monitor something?
  • 58. Seriously… Are We There Yet? Who’s That Guy? Eliminating Superfluous Tickets Automatic Remediation Building Blocks & Example Cases Actionable Intelligence Additional Resources & Brief Q&A
  • 60. Using ALARM/Event Variables The data is there… let’s use it! Reference: http://help.kaseya.com/WebHelp/EN/VSA/6030000/index.htm#1936.htm http://help.kaseya.com/WebHelp/EN/VSA/6030000/index.htm#4853.htm All presentation materials are available at:
  • 61. There is a theory which states that if ever anyone discovers exactly what the Universe is for and why it is here, it will instantly disappear and be replaced by something even more bizarre and inexplicable. There is another theory, which states that this has already happened.

Editor's Notes

  1. http://www.sxc.hu/photo/1383851/?forcedownload=1
  2. http://www.sxc.hu/photo/1383851/?forcedownload=1
  3. Duck is from Microsoft clipart
  4. Scrooge McDuck’s Vault = 3 square acres of Duckburg. Assuming each coin is silver-dollar sized, the vault contains $27 trillion US dollars. Does not include all of McDuck Industries. (Image from DeviantArt)
  5. http://fc05.deviantart.net/fs71/f/2012/206/b/e/scrooge_mcduck_by_theblack_kat-d58iogw.png
  6. http://www.sadtrombone.com
  7. http://www.morguefile.com/archive/display/141445
  8. http://en.wikipedia.org/wiki/File:Mazda3-pi.jpg
  9. http://www.morguefile.com/archive/display/723699
  10. http://www.morguefile.com/archive/display/141445
  11. http://www.morguefile.com/archive/display/141445
  12. http://www.morguefile.com/archive/display/141445
  13. http://www.morguefile.com/archive/display/141445
  14. http://www.morguefile.com/archive/display/833942
  15. http://www.eventid.net/display.asp?eventid=11&source=Disk
  16. http://www.thejakartapost.com/news/2012/10/24/psy-speak-oxford-union.html
  17. http://www.thejakartapost.com/news/2012/10/24/psy-speak-oxford-union.html
  18. http://archive.org/details/ADTWhenE1958 (public domain), from 1958
  19. http://www.morguefile.com/archive/display/62058
  20. http://www.morguefile.com/archive/display/71834 (wolf)
  21. http://www.morguefile.com/archive/display/840144
  22. http://www.morguefile.com/archive/display/167655
  23. http://www.morguefile.com/archive/display/2999
  24. http://www.morguefile.com/archive/display/833942 http://en.wikipedia.org/wiki/File:SchwarzeneggerJan2010.jpg
  25. http://fc05.deviantart.net/fs71/f/2012/206/b/e/scrooge_mcduck_by_theblack_kat-d58iogw.png
  26. http://fc05.deviantart.net/fs71/f/2012/206/b/e/scrooge_mcduck_by_theblack_kat-d58iogw.png http://en.wikipedia.org/wiki/File:Douglas_adams_portrait_cropped.jpg