Event Management and Monitoring Program Strategy
Prepared by: Jim Gingras, Event Management and Monitoring Manager
Event Management and Monitoring Program Strategy
Proprietary and Confidential
Table of Contents
1. Event Management and Monitoring Strategy
1.1 Event Management and Monitoring Overview
1.2 Stakeholders
1.3 Event Management Program Processes
1.3.1 Event Management Process
1.3.2 Event Monitoring
1.3.3 Designing Manageable Applications
1.4 Event Management Metrics
1.5 Roadmap
1.5.1 Current State to Future State
Appendix A: A Business Value Proposition for Event Management
1. Event Management and Monitoring Strategy
The strategy for Event Management and Monitoring is to take advantage of the existing event and monitoring processes and tools and build on them to propel IT at CORPORATE to the next level of IT capabilities, demonstrating business value through management and monitoring of IT services. Appendix A shows an example of how Event Management demonstrates business value.
The strategy involves the creation of an Event Management program and the associated projects that are executed over the next two years.
The remainder of this document describes the Event Management Program and the supporting activities required to ensure it is operating as designed. These include:
• Define Event Management and Monitoring
• Define the stakeholders for event management and monitoring
• Define the high-level processes and associated activities in the Event Management Program
• Define metrics for determining the status of the processes
• Define a roadmap of the actionable and measurable projects/initiatives required to establish the event management program
Event Management Program Definition: The Event Management Program is responsible for the management of the Event Management projects and monitoring systems required to determine the status of the services IT provides.
Event Management and Monitoring Definition: Event Management and Monitoring is the process of managing IT system and user events to provide the appropriate control action while providing a near real-time view of the status of the IT services.
1.1 Event Management and Monitoring Overview
Event Management’s value to the business is not direct in that it cannot generate income for the business. The most relevant measurements to the business are:
• Decreased Mean Time To Repair: decreased downtime when incidents/problems occur, due to the notification of personnel with the appropriate skill level sooner and with the correct information to resolve issues, whenever possible, before they occur.
• Increased Mean Time Between Failures: analyzing trended event information to determine upcoming outages and remediate them before they occur (predictive monitoring)
• Service Level Agreements are met or exceeded: due to decreased downtime
• Decreased IT support cost: due to appropriate personnel being notified, better use of knowledge from events in the environment, and fewer personnel required to resolve incidents/problems
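As an illustration of the first two measurements, the sketch below computes Mean Time To Repair and Mean Time Between Failures from a list of outage windows. The function names, the outage data, and the 30-day observation period are all hypothetical; the definition of MTBF as uptime divided by the number of failures is one common convention and is an assumption here, not something this strategy prescribes.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(outages):
    """Average outage duration (end - start) as a timedelta."""
    total = sum((end - start for start, end in outages), timedelta())
    return total / len(outages)


def mean_time_between_failures(outages, period_start, period_end):
    """Uptime in the observation period divided by the number of failures."""
    downtime = sum((end - start for start, end in outages), timedelta())
    uptime = (period_end - period_start) - downtime
    return uptime / len(outages)


# Hypothetical data: two outages within a 30-day observation window.
outages = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 12, 0)),  # 2 hours
    (datetime(2024, 1, 20, 8, 0), datetime(2024, 1, 20, 9, 0)),  # 1 hour
]
period_start = datetime(2024, 1, 1)
period_end = datetime(2024, 1, 31)

mttr = mean_time_to_repair(outages)
mtbf = mean_time_between_failures(outages, period_start, period_end)
print("MTTR:", mttr)
print("MTBF:", mtbf)
```

Tracking both values over successive periods is what would show whether earlier notification (lower MTTR) and predictive monitoring (higher MTBF) are having the intended effect.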
Event Management is the vital hub on which all process and tool integration is developed. Event monitoring encompasses all the activities that are required to ensure a device or Configuration Item (CI) [1] is working correctly regardless of whether it is generating events.
The foundational elements for event management are the system and user events that are created by CIs or monitoring tools. In order to enable monitoring of IT services, these events are mapped to all the related CIs of a specific IT Service. Going forward, a service view will be available to all management and service/support personnel to show the status and configuration information in an easy-to-understand format.
1.2 Stakeholders
Position                        Name    Description
Event Management                        Event Management Process Activities
Incident Management                     Automated Incident Management for events
Problem Management                      Troubleshooting and enhancements for Known Errors
Availability Management                 Monitoring Requirements
Capacity Management                     Monitoring Requirements
Operations                              Monitoring Requirements
Steering Committee                      Program Management and Reporting
IT Instrumentation                      Monitoring Tools and Reporting
Infrastructure Hosting                  Monitoring Requirements, Monitoring Tools
Software Solutions and Support          Systems Administration
Architecture                            Instrumentation of Internal Applications and RJSF design and Service Model
Security                                Monitoring Requirements for Security
Service Management                      Monitoring Requirements
Product Management                      Service Model and Monitoring Requirements
Release Management                      Service Model
Configuration Management                Service Model
Service Level Mgmt.                     Service Model and Service Level Requirements
Software Solutions and Support          Internal Software Development, RJSF
Table 1: IT Event Management Stakeholders
[1] Configuration Items include services, applications, or components as per the CORPORATE service model in the CMDB
The stakeholders for the Event Management Program are management and the process owners for the ITIL processes of availability, capacity, incident, problem, and event management. Additional stakeholders include the administrative groups who must manage the tools that are required to deliver the event management services and develop manageable applications. All stakeholders are required to agree on service views that provide accurate and relevant service status to service/support personnel in support of the business.
1.3 Event Management Program Processes
The Event Management program is responsible for the Event Management process and for the direction of the Event Monitoring environment. It also integrates with the ITIL Service Design processes of Availability, Capacity, and Security Management for monitoring requirements and capabilities, and with the Incident and Problem Management processes as inputs and outputs for automated remediation or notification activities based on significant events. Event Management plays a significant role in the Continual Service Improvement processes as a point of research, audit, and verification.
Considerations for Event Management are also required as part of application development processes (e.g. RJSF), starting with application design and development. Additionally, the creation and management of service-based views enables the next generation of event monitoring for service status events.
1.3.1 Event Management Process
The Event Management process is the process that monitors all events that occur through the IT Infrastructure to allow for normal operation and also to detect and escalate exception conditions.
Figure 1: ITIL V3 Event Management Process
The figure shows that the event management process is responsible for detection, filtering, triggering, alerting, automated response, and reviewing actions. The triggers and automated responses control the scope of the work required by the event management process. In other words, the more triggers and automated responses that are required, the more work must be accomplished to automate the response and increase the business value.
One of the keys to a successful Event Management program is defining which actions trigger the event management process and managing the number and priority of those events. Triggers include:
• Exceptions to any level of Configuration Item (CI) performance defined in design specifications, SLAs, OLAs, and SOPs
• Exceptions to an automated procedure or process (monitoring an automated workflow)
• Exception within an IT process that is being monitored (e.g. server build)
• The completion of an automated task or job
• A status change in a device or database record, depending on the granularity of the monitoring requirements
• Access of an application or database by a user or automated procedure or job
• A situation where a device, database, application, or service has reached a pre-defined performance threshold.
For the current state, the most important aspect of the Event Management process is that all types of alerts will result in an incident being opened in the Service Desk.
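The threshold trigger and the current-state rule above can be sketched as follows. This is a minimal illustration only: the Event fields, the threshold values, the CI name, and the in-memory list standing in for the Service Desk are all hypothetical, and the significance labels are borrowed from the metrics section of this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    ci: str            # Configuration Item that raised the event
    metric: str
    value: float
    significance: str  # "Exception" or "Warning"


def check_threshold(ci: str, metric: str, value: float,
                    warning: float, critical: float) -> Optional[Event]:
    """Return an Event when a pre-defined threshold is reached, else None."""
    if value >= critical:
        return Event(ci, metric, value, "Exception")
    if value >= warning:
        return Event(ci, metric, value, "Warning")
    return None


def open_incident(event: Event, service_desk: list) -> dict:
    """Current-state rule: every alert results in an incident."""
    incident = {
        "id": len(service_desk) + 1,
        "ci": event.ci,
        "summary": f"{event.metric} at {event.value} on {event.ci}",
        "significance": event.significance,
    }
    service_desk.append(incident)
    return incident


service_desk = []  # hypothetical stand-in for the real Service Desk tool
for reading in (55.0, 85.0, 97.0):
    event = check_threshold("db-server-01", "cpu_percent", reading,
                            warning=80.0, critical=95.0)
    if event is not None:
        open_incident(event, service_desk)

print(service_desk)
```

Because every alert opens an incident, the number and priority of triggers directly determines Service Desk workload, which is why the text stresses managing them.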
1.3.2 Event Monitoring
Event Monitoring covers a broad spectrum of all the monitoring capabilities across the CORPORATE IT enterprise. The Event Management architecture deployed today uses a Manager of Managers (MOM) to gather events from all the IT Management Domains. The major IT Domains are Application, Database, End User, Facilities, Network, Security, Server Platform (which includes virtual), Storage, Telephony, and Workload.
Figure 2: Manager of Managers Architecture
Although all events are monitored, only significant events are managed because they are meaningful. This is accomplished through filtering at the IT Domain level to identify events that are recognized as affecting the status of Configuration Items (CIs) (i.e. Service, Application, and Component), automation processes, or other significant occurrences. The Manager of Managers then correlates the events from each of the IT Domains, determines the course of action, and executes an automated response. For all significant events an incident will automatically be opened, assigned, and prioritized in the Service Desk.
A major portion of the Event Management Program includes creating interfaces that enable monitoring at the services level. The best approach is to start with a few significant services to demonstrate the business value of monitoring services.
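The two-stage flow described above (filtering in each domain manager, then correlation in the Manager of Managers) can be sketched as below. The event dictionary fields, the severity names, and the rule of correlating by CI and taking the worst severity are assumptions made for illustration; the actual filters and correlation policies are defined in the monitoring tools.

```python
from collections import defaultdict

# Assumed severity ranking; informational events are treated as not significant.
RANK = {"minor": 1, "major": 2, "critical": 3}


def domain_filter(events):
    """Domain-level filtering: forward only significant events."""
    return [e for e in events if e["severity"] in RANK]


def correlate(events):
    """Manager of Managers step: group forwarded events by CI and
    produce one prioritized incident per affected CI."""
    by_ci = defaultdict(list)
    for e in events:
        by_ci[e["ci"]].append(e)

    incidents = []
    for ci, group in by_ci.items():
        worst = max(group, key=lambda e: RANK[e["severity"]])
        incidents.append({
            "ci": ci,
            "priority": worst["severity"],
            "domains": sorted({e["domain"] for e in group}),
            "event_count": len(group),
        })
    return incidents


# Hypothetical events arriving from several IT Domains.
events = [
    {"domain": "Network", "ci": "order-service", "severity": "major"},
    {"domain": "Database", "ci": "order-service", "severity": "critical"},
    {"domain": "Storage", "ci": "archive", "severity": "informational"},
    {"domain": "Server Platform", "ci": "payroll", "severity": "minor"},
]

incidents = correlate(domain_filter(events))
for incident in incidents:
    print(incident)
```

In this sketch four raw events become two incidents: the informational event is filtered out at the domain level, and the two events against the same CI are combined into a single incident at the highest severity.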
1.3.3 Designing Manageable Applications
In order to optimize operational management of applications, the applications must be designed with operations in mind. This requires that monitoring requirements are identified during the application design phase of the application lifecycle and instrumented during the application development cycle. One of the key deliverables that enables this type of monitoring is the “health model”, which relates the status of individual components to the status of the overall application or service. For internally developed applications, CORPORATE has embraced the use of management packs as a means of ensuring the supportability of applications. This initiative is in line with Microsoft’s Design for Operations methodology.
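A health model of the kind described above can be sketched as a simple roll-up from component status to service status. The service and component names are hypothetical, and the "worst status wins" rule is an assumption chosen for brevity; a real health model, including one delivered in a management pack, may apply richer rules such as redundancy or weighting.

```python
# Assumed status ordering, from best to worst.
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}


def rollup(name, model, status):
    """Return the status of `name`: its own reported status if it has no
    children in the model, otherwise the worst status among its children."""
    children = model.get(name, [])
    if not children:
        return status.get(name, "healthy")
    child_states = [rollup(child, model, status) for child in children]
    return max(child_states, key=SEVERITY.__getitem__)


# Hypothetical health model: a service built from an application and a database.
health_model = {
    "online-ordering": ["web-app", "order-db"],
    "web-app": ["web-server-1", "web-server-2"],
}
component_status = {
    "web-server-1": "healthy",
    "web-server-2": "warning",
    "order-db": "healthy",
}

service_status = rollup("online-ordering", health_model, component_status)
print(service_status)
```

The same structure is what a service view would traverse to show the status of a service alongside the components that explain it.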
1.4 Event Management Metrics
Once Event Management is in place, a baseline must be established as to the current performance levels and value to the organization in terms of optimizing operations activities and Mean Time To Repair. The following metrics are recommended by ITIL v3:
1. Number of events per category: IT Domain, Service, Application
2. Number of events by significance: Exception (Critical or Major), Warning (Minor), or Informational (non-exception/warning application messages)
3. Number and percentage of events that required human intervention and whether this was performed (incidents are not opened)
4. Number and percentage of events that resulted in incidents or changes
5. Number and percentage of events caused by existing problems or Known Errors
6. Number and percentage of replicated or duplicated events
7. Number and percentage of events indicating performance issues
8. Number and percentage of events indicating potential availability issues
9. Number and percentage of each type of event per platform or application
10. Number and ratio of events compared with the number of incidents
Further research must be done to determine how to derive these metrics and associated reports with the existing monitoring tools. The Service Desk and the Manager of Managers are good places to begin this work. These metrics will enable the “tuning” of the event management system through the adjustment of the filters and correlation engine in the domain managers and Manager of Managers, respectively.
1.5 Roadmap
Figure 3: Event Management Program Roadmap
The high-level roadmap for the IT Event Management Program has eight projects:
1. Define the Event Management strategy and program, including deliverables:
   a. Event Management Strategy
   b. Event Management Process
   c. Event Handling Policies and Standards
      i. Notification/Escalation Policies and Standards
   d. Event Management projects/initiatives
   e. Event Management program roadmap
2. Establish the Event Management Program through:
   a. Ratification of the event management and monitoring processes and activities
      i. Ratification of event handling policies and standards
   b. Communicate and gather support for event management program activities in collaboration with stakeholders to agree on deliverables
   c. Establish a timeline for completing the work activities and deliverables
3. Integration with other ITIL management processes, including:
   a. Incident Management: for automation of incident management process activities where applicable
      i. Automatically manage incidents from user events (transactions)
      ii. Automatically manage incidents from system events
      iii. Automatically manage incidents from service events
   b. Availability Management: for availability monitoring requirements of service components
   c. Capacity Management: for the capacity monitoring requirements of the service components
   d. Problem Management: for event information in the Known Error Database and for verification and audit of the root cause of problems
4. Integration with application design and development: for internally developed applications through IT Architecture and Software Engineering
   a. Adoption of management packs for monitoring applications
      i. Microsoft Management Packs for internally developed applications on the Windows platform
   b. Propagation of configuration and status information to service views based on the service and health models
5. Integration with third-party applications
   a. Adoption of management pack methodology for management of events from third-party applications
      i. Create deliverables that are platform dependent
      ii. Coordinate with instrumentation and system support for the instrumentation lifecycle (design, develop, test, deploy)
6. Consolidation/Correlation of Domain-level events
   a. Completion of integration of critical, major, and minor events across all IT Domains to the Manager of Managers
   b. Implementation of correlation policies/rules to forward significant events for incidents and alerts
7. Integration with the IT Service/Support groups through the creation and management of service views and related configuration items
   a. Role-based service dashboards for user groups
8. Continuous process improvement
   a. Audit and verify quality and efficiency of existing event management and monitoring systems and adjust filters and correlation engines to streamline automation
1.5.1 Current State to Future State
Event Management and Monitoring has been in place for years at CORPORATE. It has matured to a level where events are triggering workload and other automation/remediation, as well as automated notification/escalation. In terms of maturity level, CORPORATE is between reactive and proactive. There are specific cases where we are at the predictive level (monitoring batch), but this is the exception.
There are many management/monitoring tools in place across all the IT Domains. The two major tasks that must be accomplished in order for the Event Management program to be successful are:
• Mature IT monitoring from a reactive/proactive level to a proactive/predictive maturity level through automation of event responses for all significant events.
• Consolidate and correlate all the events into meaningful status information for CIs like applications, systems, and IT services.
The biggest enabler going forward is the use of ITIL v3 as the framework for managing IT. This provides a common vernacular and helps establish accepted governance processes for event management and monitoring. Use of a framework, combined with the use of the services construct to represent IT value to the business, provides a new level of event management and monitoring for CORPORATE.
Appendix A: A Business Value Proposition for Event Management [2]
In simple terms, event management enables real-time monitoring of the infrastructure (i.e. listening for things that are wrong), and uses event correlation to filter, de-duplicate, and combine events to detect more serious issues. Event Management is important because it will:
• Improve time to resolve through cause identification
• Improve visibility in real time
• Enable proactive management of impact to the business (IT calls the business)
• Improve Security Management
Studies show that fault detection and root-cause analysis are the most important systems management capabilities. Studies also show that the most time-consuming systems management tasks are diagnosis and troubleshooting. Event Management enables proactive responses to events and enables automatic tracking and resolution for most system events. The scenarios below show the difference when event management is implemented and when it is not [3].
[2] Taken from Data Network Event Management and ITIL, Cisco, Keith Sinclair
[3] The scenarios below use a network device issue as the example. CORPORATE is monitoring all infrastructure domains at some level, as described in the Event Monitoring section of this document.
Figure 4: Scenario - Situation normal (w/o Event Management)
Figure 5: Scenario - Situation with Event Management
The bottom line is that Event Management allows IT to resolve issues before the users are affected. Armed with reports that show the effectiveness of Event Management, IT can show the business how effective they are and demonstrate real business value.
Appendix_A_Back

Contenu connexe

Tendances

How To Guide for Event Planning
How To Guide for Event PlanningHow To Guide for Event Planning
How To Guide for Event Planning
kk00700
 
Data Loss Prevention from Symantec
Data Loss Prevention from SymantecData Loss Prevention from Symantec
Data Loss Prevention from Symantec
Arrow ECS UK
 
ITIL SIAM - Service Integration and Management Model
ITIL  SIAM - Service Integration and Management ModelITIL  SIAM - Service Integration and Management Model
ITIL SIAM - Service Integration and Management Model
PeteFeehan
 
The Service Catalog: Cornerstone of Service Management
The Service Catalog: Cornerstone of Service Management The Service Catalog: Cornerstone of Service Management
The Service Catalog: Cornerstone of Service Management
BMC Software
 

Tendances (20)

Event management
Event managementEvent management
Event management
 
Business continuity management per ISO 22301 - a certification training cour...
 Business continuity management per ISO 22301 - a certification training cour... Business continuity management per ISO 22301 - a certification training cour...
Business continuity management per ISO 22301 - a certification training cour...
 
ITSM(IT Service Management)
ITSM(IT Service Management)ITSM(IT Service Management)
ITSM(IT Service Management)
 
SC-900+2022.pdf
SC-900+2022.pdfSC-900+2022.pdf
SC-900+2022.pdf
 
Event Planning
Event PlanningEvent Planning
Event Planning
 
PwC Point of View on Cybersecurity Management
PwC Point of View on Cybersecurity ManagementPwC Point of View on Cybersecurity Management
PwC Point of View on Cybersecurity Management
 
How To Guide for Event Planning
How To Guide for Event PlanningHow To Guide for Event Planning
How To Guide for Event Planning
 
Camunda BPM 7.2 - English
Camunda BPM 7.2 - EnglishCamunda BPM 7.2 - English
Camunda BPM 7.2 - English
 
Microsoft Defender and Azure Sentinel
Microsoft Defender and Azure SentinelMicrosoft Defender and Azure Sentinel
Microsoft Defender and Azure Sentinel
 
ServiceNow Customer Service Management
ServiceNow Customer Service Management ServiceNow Customer Service Management
ServiceNow Customer Service Management
 
Itil v4-mindmap
Itil v4-mindmapItil v4-mindmap
Itil v4-mindmap
 
event-management
event-managementevent-management
event-management
 
Data Loss Prevention from Symantec
Data Loss Prevention from SymantecData Loss Prevention from Symantec
Data Loss Prevention from Symantec
 
ITIL Incident Management Workflow - Process Guide
	 ITIL Incident Management Workflow - Process Guide	 ITIL Incident Management Workflow - Process Guide
ITIL Incident Management Workflow - Process Guide
 
ITIL DevOps and PBR
ITIL DevOps and PBRITIL DevOps and PBR
ITIL DevOps and PBR
 
ITIL and CMMI for service
ITIL and CMMI for serviceITIL and CMMI for service
ITIL and CMMI for service
 
Event Planning
Event PlanningEvent Planning
Event Planning
 
ITIL SIAM - Service Integration and Management Model
ITIL  SIAM - Service Integration and Management ModelITIL  SIAM - Service Integration and Management Model
ITIL SIAM - Service Integration and Management Model
 
The Service Catalog: Cornerstone of Service Management
The Service Catalog: Cornerstone of Service Management The Service Catalog: Cornerstone of Service Management
The Service Catalog: Cornerstone of Service Management
 
The ROI of Problem Management
The ROI of Problem ManagementThe ROI of Problem Management
The ROI of Problem Management
 

En vedette

Buzz Monitoring Strategy
Buzz Monitoring StrategyBuzz Monitoring Strategy
Buzz Monitoring Strategy
David Gracia
 
Proactive End-User Experience Monitoring of Enterprise IT Services
Proactive End-User Experience Monitoring of Enterprise IT ServicesProactive End-User Experience Monitoring of Enterprise IT Services
Proactive End-User Experience Monitoring of Enterprise IT Services
techweb08
 
Continuous monitoring strategy_guide_072712
Continuous monitoring strategy_guide_072712Continuous monitoring strategy_guide_072712
Continuous monitoring strategy_guide_072712
Tuan Phan
 
Ex1-2005 Large NSOC BMC Event Management Deployment
Ex1-2005 Large NSOC BMC Event Management DeploymentEx1-2005 Large NSOC BMC Event Management Deployment
Ex1-2005 Large NSOC BMC Event Management Deployment
Brian Adam
 
Ex2-2010 Large BMC Deployment
Ex2-2010 Large BMC DeploymentEx2-2010 Large BMC Deployment
Ex2-2010 Large BMC Deployment
Brian Adam
 
58466507 event-management-best-practices-1-488
58466507 event-management-best-practices-1-48858466507 event-management-best-practices-1-488
58466507 event-management-best-practices-1-488
Prasad Rt
 

En vedette (20)

IBM Monitoring and Event Management Solutions
IBM Monitoring and Event Management SolutionsIBM Monitoring and Event Management Solutions
IBM Monitoring and Event Management Solutions
 
IoT in the Enterprise: Why Your Monitoring Strategy Should Include Connected ...
IoT in the Enterprise: Why Your Monitoring Strategy Should Include Connected ...IoT in the Enterprise: Why Your Monitoring Strategy Should Include Connected ...
IoT in the Enterprise: Why Your Monitoring Strategy Should Include Connected ...
 
Top 10 Principles of Event Strategy
Top 10 Principles of Event StrategyTop 10 Principles of Event Strategy
Top 10 Principles of Event Strategy
 
Buzz Monitoring Strategy
Buzz Monitoring StrategyBuzz Monitoring Strategy
Buzz Monitoring Strategy
 
Proactive End-User Experience Monitoring of Enterprise IT Services
Proactive End-User Experience Monitoring of Enterprise IT ServicesProactive End-User Experience Monitoring of Enterprise IT Services
Proactive End-User Experience Monitoring of Enterprise IT Services
 
Tango/04 123 Brochure
Tango/04 123 Brochure Tango/04 123 Brochure
Tango/04 123 Brochure
 
IT Service Desk Software RFP Template
IT Service Desk Software RFP TemplateIT Service Desk Software RFP Template
IT Service Desk Software RFP Template
 
Continuous monitoring strategy_guide_072712
Continuous monitoring strategy_guide_072712Continuous monitoring strategy_guide_072712
Continuous monitoring strategy_guide_072712
 
Monitoring the Enterprise: Examples and Best Practices
Monitoring the Enterprise: Examples and Best PracticesMonitoring the Enterprise: Examples and Best Practices
Monitoring the Enterprise: Examples and Best Practices
 
Monitoring in the DevOps Era
Monitoring in the DevOps EraMonitoring in the DevOps Era
Monitoring in the DevOps Era
 
Ex1-2005 Large NSOC BMC Event Management Deployment
Ex1-2005 Large NSOC BMC Event Management DeploymentEx1-2005 Large NSOC BMC Event Management Deployment
Ex1-2005 Large NSOC BMC Event Management Deployment
 
Insights success The 10 Most Valuable Event Management Companies 2016
Insights success The 10 Most Valuable Event Management Companies 2016 Insights success The 10 Most Valuable Event Management Companies 2016
Insights success The 10 Most Valuable Event Management Companies 2016
 
Complete the Puzzle — Network Monitoring and Management with Entuity
Complete the Puzzle — Network Monitoring and Management with EntuityComplete the Puzzle — Network Monitoring and Management with Entuity
Complete the Puzzle — Network Monitoring and Management with Entuity
 
Ex2-2010 Large BMC Deployment
Ex2-2010 Large BMC DeploymentEx2-2010 Large BMC Deployment
Ex2-2010 Large BMC Deployment
 
58466507 event-management-best-practices-1-488
58466507 event-management-best-practices-1-48858466507 event-management-best-practices-1-488
58466507 event-management-best-practices-1-488
 
BMC Software proactive operations platform
BMC Software proactive operations platformBMC Software proactive operations platform
BMC Software proactive operations platform
 
Tech Talk: Mainframe Team Center—Event Management and Automation Sneak Peek
Tech Talk: Mainframe Team Center—Event Management and Automation Sneak Peek Tech Talk: Mainframe Team Center—Event Management and Automation Sneak Peek
Tech Talk: Mainframe Team Center—Event Management and Automation Sneak Peek
 
Network and IT Operations
Network and IT OperationsNetwork and IT Operations
Network and IT Operations
 
Jazz for Service Management - OMNIbus
Jazz for Service Management - OMNIbusJazz for Service Management - OMNIbus
Jazz for Service Management - OMNIbus
 
Top 10 Event Management Tips
Top 10 Event Management TipsTop 10 Event Management Tips
Top 10 Event Management Tips
 

Similaire à Event Management and Monitoring Strategy

Cyber+Capability+Toolkit+-+Cyber+Incident+Response+-+Cyber+Incident+Response+...
Cyber+Capability+Toolkit+-+Cyber+Incident+Response+-+Cyber+Incident+Response+...Cyber+Capability+Toolkit+-+Cyber+Incident+Response+-+Cyber+Incident+Response+...
Cyber+Capability+Toolkit+-+Cyber+Incident+Response+-+Cyber+Incident+Response+...
MaoTseTungBritoSilva1
 
Business Continuity Management.pdf
Business Continuity Management.pdfBusiness Continuity Management.pdf
Business Continuity Management.pdf
shanmuga13
 
u10a1-Risk Assessment Report-Beji Jacob
u10a1-Risk Assessment Report-Beji Jacobu10a1-Risk Assessment Report-Beji Jacob
u10a1-Risk Assessment Report-Beji Jacob
Beji Jacob
 
IT 552 Module Five Assignment Rubric The purpose of t.docx
IT 552 Module Five Assignment Rubric  The purpose of t.docxIT 552 Module Five Assignment Rubric  The purpose of t.docx
IT 552 Module Five Assignment Rubric The purpose of t.docx
christiandean12115
 
Rs live events_proj_mgmt_and_budgeting_draft
Rs live events_proj_mgmt_and_budgeting_draftRs live events_proj_mgmt_and_budgeting_draft
Rs live events_proj_mgmt_and_budgeting_draft
natalys
 
MCGlobalTech Cyber Capability Statement_Final
MCGlobalTech Cyber Capability Statement_FinalMCGlobalTech Cyber Capability Statement_Final
MCGlobalTech Cyber Capability Statement_Final
William McBorrough
 
CIS_Controls_v7.1_Implementation_Groups.pdf
CIS_Controls_v7.1_Implementation_Groups.pdfCIS_Controls_v7.1_Implementation_Groups.pdf
CIS_Controls_v7.1_Implementation_Groups.pdf
NesterWare
 

Similaire à Event Management and Monitoring Strategy (20)

Cyber+Capability+Toolkit+-+Cyber+Incident+Response+-+Cyber+Incident+Response+...
Cyber+Capability+Toolkit+-+Cyber+Incident+Response+-+Cyber+Incident+Response+...Cyber+Capability+Toolkit+-+Cyber+Incident+Response+-+Cyber+Incident+Response+...
Cyber+Capability+Toolkit+-+Cyber+Incident+Response+-+Cyber+Incident+Response+...
 
Event Management
Event ManagementEvent Management
Event Management
 
Business Continuity Management.pdf
Business Continuity Management.pdfBusiness Continuity Management.pdf
Business Continuity Management.pdf
 
Increasing the Probability of Success with Continuous Risk Management
Increasing the Probability of Success with Continuous Risk ManagementIncreasing the Probability of Success with Continuous Risk Management
Increasing the Probability of Success with Continuous Risk Management
 
Auditing application controls
Auditing application controlsAuditing application controls
Auditing application controls
 
u10a1-Risk Assessment Report-Beji Jacob
u10a1-Risk Assessment Report-Beji Jacobu10a1-Risk Assessment Report-Beji Jacob
u10a1-Risk Assessment Report-Beji Jacob
 
Cyber+incident+response+ +generic+ransomware+playbook+v2.3
Cyber+incident+response+ +generic+ransomware+playbook+v2.3Cyber+incident+response+ +generic+ransomware+playbook+v2.3
Cyber+incident+response+ +generic+ransomware+playbook+v2.3
 
The digital transformation in physical security
The digital transformation in physical security   The digital transformation in physical security
The digital transformation in physical security
 
Qatar Proposal
Qatar ProposalQatar Proposal
Qatar Proposal
 
ServiceNow Event Management
ServiceNow Event ManagementServiceNow Event Management
ServiceNow Event Management
 
Incident Management Best Practices
Incident Management Best PracticesIncident Management Best Practices
Incident Management Best Practices
 
IT 552 Module Five Assignment Rubric The purpose of t.docx
IT 552 Module Five Assignment Rubric  The purpose of t.docxIT 552 Module Five Assignment Rubric  The purpose of t.docx
IT 552 Module Five Assignment Rubric The purpose of t.docx
 
ISACA Belgium CERT view 2011
ISACA Belgium CERT view 2011ISACA Belgium CERT view 2011
ISACA Belgium CERT view 2011
 
Coordinating Security Response and Crisis Management Planning
Coordinating Security Response and Crisis Management PlanningCoordinating Security Response and Crisis Management Planning
Coordinating Security Response and Crisis Management Planning
 
Enterprise governance risk_compliance_fcm slides
Enterprise governance risk_compliance_fcm slidesEnterprise governance risk_compliance_fcm slides
Enterprise governance risk_compliance_fcm slides
 
Rs live events_proj_mgmt_and_budgeting_draft
Rs live events_proj_mgmt_and_budgeting_draftRs live events_proj_mgmt_and_budgeting_draft
Rs live events_proj_mgmt_and_budgeting_draft
 
Cybersecurity Incident Response Planning.pdf
Cybersecurity Incident Response Planning.pdfCybersecurity Incident Response Planning.pdf
Cybersecurity Incident Response Planning.pdf
 
MCGlobalTech Cyber Capability Statement_Final
MCGlobalTech Cyber Capability Statement_FinalMCGlobalTech Cyber Capability Statement_Final
MCGlobalTech Cyber Capability Statement_Final
 
Practical Guide to Managing Incidents Using LLM's and NLP.pdf
Practical Guide to Managing Incidents Using LLM's and NLP.pdfPractical Guide to Managing Incidents Using LLM's and NLP.pdf
Practical Guide to Managing Incidents Using LLM's and NLP.pdf
 
CIS_Controls_v7.1_Implementation_Groups.pdf
CIS_Controls_v7.1_Implementation_Groups.pdfCIS_Controls_v7.1_Implementation_Groups.pdf
CIS_Controls_v7.1_Implementation_Groups.pdf
 

Event Management and Monitoring Strategy

  • 1. Event Management and Monitoring Program Strategy Prepared by: Jim Gingras, Event Management and Monitoring Manager
  • 2. Event Managementand Monitoring Program Strategy Proprietary and Confidential 2 Table of Contents 1. Event Management and Monitoring Strategy.................................................................... 3 1.1 Event Management and Monitoring Overview ............................................................... 3 1.2 Stakeholders ................................................................................................................... 4 1.3 Event Management Program Processes........................................................................ 5 1.3.1 Event Management Process....................................................................................... 6 1.3.2 Event Monitoring ......................................................................................................... 7 1.3.3 Designing Manageable Applications .......................................................................... 8 1.4 Event Management Metrics............................................................................................ 8 1.5 Roadmap......................................................................................................................... 9 1.5.1 Current State to Future State.................................................................................... 10 Appendix A: ABusiness Value proposition for Event Management................................... 12
  • 3.
    1. Event Management and Monitoring Strategy
    The strategy for Event Management and Monitoring is to take advantage of the existing event and monitoring processes and tools and build on them to propel IT at CORPORATE to the next level of IT capabilities, demonstrating business value through management and monitoring of IT services. Appendix A shows an example of how Event Management demonstrates business value.
    The strategy involves the creation of an Event Management program and the associated projects that are executed over the next two years. The remainder of this document describes the Event Management Program and the supporting activities required to ensure it is operating as designed. These include:
      • Define Event Management and Monitoring
      • Define the stakeholders for event management and monitoring
      • Define the high-level processes and associated activities in the Event Management Program
      • Define metrics for determining the status of the processes
      • Define a roadmap of the actionable and measurable projects/initiatives required to establish the event management program
    Event Management Program Definition: The Event Management Program is responsible for the management of the Event Management projects and monitoring systems required to determine the status of the services IT provides.
    Event Management and Monitoring Definition: Event Management and Monitoring is the process of managing IT system and user events to provide the appropriate control action while providing a near real-time view of the status of the IT services.
    1.1 Event Management and Monitoring Overview
    Event Management's value to the business is not direct in that it cannot generate income for the business. The most relevant measurements to the business are:
      • Decreased Mean Time To Repair: decreased downtime when incidents/problems occur, due to the notification of personnel with the appropriate skill level sooner and with the correct information to resolve issues, whenever possible, before they occur.
      • Increased Mean Time Between Failures: analyzing trended event information to determine upcoming outages and remediate them before they occur (predictive monitoring)
      • Service Level Agreements are met or exceeded: due to decreased downtime
      • Decreased IT support cost: due to appropriate personnel being notified, better use of knowledge from events in the environment, and fewer personnel required to resolve incidents/problems
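The first two measurements above can be made concrete with a little arithmetic. The following is a minimal sketch, not the program's actual reporting method: it assumes outages are available as (start, end) timestamp pairs, takes Mean Time To Repair as total downtime divided by the number of outages, and takes Mean Time Between Failures as total uptime in the reporting period divided by the number of outages. The function names and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_repair(outages):
    """Average outage duration: total downtime / number of outages."""
    if not outages:
        return timedelta(0)
    downtime = sum((end - start for start, end in outages), timedelta(0))
    return downtime / len(outages)


def mean_time_between_failures(outages, period_start, period_end):
    """Average uptime per failure: (period length - downtime) / number of outages."""
    if not outages:
        return period_end - period_start
    downtime = sum((end - start for start, end in outages), timedelta(0))
    return (period_end - period_start - downtime) / len(outages)


# Hypothetical reporting period of 100 hours with two outages (1 hour and 3 hours).
period_start = datetime(2024, 1, 1)
period_end = period_start + timedelta(hours=100)
outages = [
    (period_start + timedelta(hours=10), period_start + timedelta(hours=11)),
    (period_start + timedelta(hours=50), period_start + timedelta(hours=53)),
]

mttr = mean_time_to_repair(outages)
mtbf = mean_time_between_failures(outages, period_start, period_end)
print("MTTR:", mttr)
print("MTBF:", mtbf)
```

Tracking both values per reporting period gives a simple way to see whether earlier notification is shortening repairs (MTTR falling) and whether predictive monitoring is preventing outages (MTBF rising).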
  • 4.
    Event Management is the vital hub on which all process and tool integration is developed. Event monitoring encompasses all the activities that are required to ensure a device or Configuration Item¹ (CI) is working correctly regardless of whether it is generating events. The foundational elements for event management are the system and user events that are created by CIs or monitoring tools. In order to enable monitoring IT services, these events are mapped to all the related CIs of a specific IT Service. Going forward a service view will be available to all management and service/support personnel to show the status and configuration information in an easy to understand format.
    1.2 Stakeholders
    Position Name | Description
    Event Management | Event Management Process Activities
    Incident Management | Automated Incident Management for events
    Problem Management | Troubleshooting and enhancements for Known Errors
    Availability Management | Monitoring Requirements
    Capacity Management | Monitoring Requirements
    Operations | Monitoring Requirements
    Steering Committee | Program Management and Reporting
    IT Instrumentation | Monitoring Tools and Reporting
    Infrastructure Hosting | Monitoring Requirements, Monitoring Tools
    Software Solutions and Support | Systems Administration
    Architecture | Instrumentation of Internal Applications and RJSF design and Service Model
    Security | Monitoring Requirements for Security
    Service Management | Monitoring Requirements
    Product Management | Service Model and Monitoring Requirements
    Release Management | Service Model
    Configuration Management | Service Model
    ¹ Configuration Items include services, applications, or components as per CORPORATE service model in the CMDB
  • 5.
    Service Level Mgmt. | Service Model and Service Level Requirements
    Software Solutions and Support | Internal Software Development, RJSF
    Table 1: IT Event Management Stakeholders
    The stakeholders for the Event Management Program are management and the process owners for the ITIL processes of availability, capacity, incident, problem, and event management. Additional stakeholders include the administrative groups who must manage the tools that are required to deliver the event management services and develop manageable applications. All stakeholders are required to agree on service views that provide accurate and relevant service status to service/support personnel in support of the business.
    1.3 Event Management Program Processes
    The Event Management program is responsible for the Event Management process and for the direction of the Event Monitoring environment. It also integrates with the ITIL Service Design processes of Availability, Capacity and Security Management for monitoring requirements and capabilities, and in the Incident and Problem Management processes as inputs and outputs for automated remediation or notification activities based on significant events. Event Management plays a significant role in the Continuous Service Improvement processes as a point of research, audit and verification. Considerations for Event Management are also required as part of application development processes (e.g. RJSF), starting with application design and development. Additionally, the creation and management of service based views enables the next generation of event monitoring for service status events.
  • 6.
    1.3.1 Event Management Process
    The Event Management process is the process that monitors all events that occur through the IT Infrastructure to allow for normal operation and also to detect and escalate exception conditions.
    Figure 1: ITIL V3 Event Management Process
    The figure shows that the event management process is responsible for detection, filtering, triggering, alerting, automated response and reviewing actions. The triggers and automated response will control the scope of the work required by the event management process. In other words, the more triggers and automated responses that are required, the more work must be accomplished to automate the response and increase the business value.
    One of the keys to a successful Event Management program is to define which actions trigger the event management process and to manage the number and priority of those events. Triggers include:
      • Exceptions to any level of Configuration Item (CI) performance defined in design specifications, SLAs, OLAs and SOPs
      • Exceptions to an automated procedure or process (monitoring an automated workflow)
      • Exception within an IT process that is being monitored (e.g. server build)
      • The completion of an automated task or job
  • 7.
      • A status change in a device or database record depending on the granularity of the monitoring requirements
      • Access of an application or database by a user or automated procedure or job
      • A situation where a device, database, application, or service has reached a pre-defined performance threshold.
    For the current state the most important aspect of the Event Management process is that all types of alerts will result in an incident being opened in the Service Desk.
    1.3.2 Event Monitoring
    Event Monitoring covers a broad spectrum of all the monitoring capabilities across the CORPORATE IT enterprise. The Event Management architecture deployed today uses a Manager Of Managers (MOM) to gather events from all the IT Management Domains. The major IT Domains are Application, Database, End User, Facilities, Network, Security, Server Platform (which includes virtual), Storage, Telephony, and Workload.
    Figure 2: Manager of Managers Architecture
    Although all events are monitored, only significant events are managed because they are meaningful. This is accomplished through filtering at the IT Domain level to identify events that are recognized as affecting the status of Configuration Items (CI) (i.e. Service, Application, and Component), automation processes or other significant occurrence. The Manager of Managers then correlates the events from each of the IT Domains, determines the course of action and executes an automated response. For all significant events an incident will automatically be opened, assigned and prioritized in the Service Desk.
    A major portion of the Event Management Program includes creating interfaces that enable monitoring at the services level. The best approach is to start with a few significant services to demonstrate the business value of monitoring services.
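The filter-then-correlate flow described above can be sketched in a few lines. This is an illustrative sketch only, not the deployed Manager of Managers: it assumes a four-step severity scale in which critical, major and minor events count as significant, and it assumes that correlation means grouping significant events by the Configuration Item they affect and opening one incident per CI at the worst severity seen. The `Event` class, the function names and the sample CIs are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed severity scale, most to least severe (lower rank = more severe).
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2, "informational": 3}
# Assumption: informational events are monitored but not managed.
SIGNIFICANT = {"critical", "major", "minor"}


@dataclass(frozen=True)
class Event:
    domain: str    # IT Domain that raised the event, e.g. "Network"
    ci: str        # Configuration Item the event maps to
    severity: str  # one of the SEVERITY_RANK keys
    message: str


def filter_significant(events):
    """Domain-level filter: keep only events considered significant."""
    return [e for e in events if e.severity in SIGNIFICANT]


def correlate(events):
    """Manager-of-Managers step: one incident per CI, at the worst severity seen."""
    by_ci = defaultdict(list)
    for event in events:
        by_ci[event.ci].append(event)
    incidents = []
    for ci, related in by_ci.items():
        worst = min(related, key=lambda e: SEVERITY_RANK[e.severity])
        incidents.append({
            "ci": ci,
            "priority": worst.severity,
            "domains": sorted({e.domain for e in related}),
            "event_count": len(related),
        })
    return incidents


# Hypothetical events from three IT Domains.
events = [
    Event("Network", "payroll-app", "critical", "switch port down"),
    Event("Database", "payroll-app", "minor", "connection pool at 80%"),
    Event("Server Platform", "hr-portal", "informational", "nightly job finished"),
]

incidents = correlate(filter_significant(events))
for incident in incidents:
    print(incident)
```

Grouping by CI is only one possible correlation rule; time windows or service-model dependencies would be natural refinements, but they depend on data the source does not describe.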
  • 8.
    1.3.3 Designing Manageable Applications
    In order to optimize operational management of applications, the applications must be designed with operations in mind. This requires that monitoring requirements are identified during the application design phase of the application lifecycle and instrumented during the application development cycle. One of the key deliverables that enables this type of monitoring is the “health model”, which relates the status of individual components to the status of the overall application or service. For internally developed applications CORPORATE has embraced the use of management packs as a means of ensuring the supportability of applications. This initiative is in line with Microsoft's Design for Operations methodology.
    1.4 Event Management Metrics
    Once Event Management is in place a baseline must be established as to the current performance levels and value to the organization in terms of optimizing operations activities and Mean Time To Repair. The following metrics are recommended by ITIL v3:
      1. Number of events per category: IT Domain, Service, Application
      2. Number of events by significance: Exception (Critical or Major), Warning (Minor), or Informational (non-exception/warning application messages)
      3. Number and percentage of events that required human intervention and whether this was performed (incidents are not opened)
      4. Number and percentage of events that resulted in incidents or changes
      5. Number and percentage of events caused by existing problems or Known Errors
      6. Number and percentage of replicated or duplicated events
      7. Number and percentage of events indicating performance issues
      8. Number and percentage of events indicating potential availability issues
      9. Number and percentage of each type of event per platform or application
      10. Number and ratio of events compared with the number of incidents
    Further research must be done to determine how to derive these metrics and associated reports with the existing monitoring tools. Service Desk and the Manager of Managers are good places to begin this work. These metrics will enable the “tuning” of the event management system through the adjustment of the filters and correlation engine in the domain managers and Manager of Managers, respectively.
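Several of the metrics listed above reduce to counts and ratios over an event log. The sketch below assumes such a log can be exported as a list of records with `domain`, `significance`, `opened_incident` and `duplicate` fields; these field names, the helper functions and the sample records are hypothetical, and how the existing tools would actually supply this data is the open research question noted above.

```python
from collections import Counter


def percentage(part, whole):
    """Percentage rounded to one decimal place; 0.0 when there are no events."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def event_metrics(events):
    """Compute a subset of the recommended metrics from a list of event records."""
    total = len(events)
    incidents = sum(1 for e in events if e["opened_incident"])
    duplicates = sum(1 for e in events if e["duplicate"])
    return {
        "events_per_domain": Counter(e["domain"] for e in events),               # metric 1
        "events_by_significance": Counter(e["significance"] for e in events),    # metric 2
        "pct_resulting_in_incident": percentage(incidents, total),               # metric 4
        "pct_duplicated": percentage(duplicates, total),                         # metric 6
        "events_per_incident": total / incidents if incidents else None,         # metric 10
    }


# Hypothetical event log with four records.
events = [
    {"domain": "Network", "significance": "exception", "opened_incident": True, "duplicate": False},
    {"domain": "Network", "significance": "exception", "opened_incident": False, "duplicate": True},
    {"domain": "Database", "significance": "warning", "opened_incident": True, "duplicate": False},
    {"domain": "Application", "significance": "informational", "opened_incident": False, "duplicate": False},
]

metrics = event_metrics(events)
for name, value in metrics.items():
    print(name, value)
```

A high duplicate percentage or a high events-per-incident ratio would point at filters that are too loose, which is the kind of tuning signal the metrics are meant to provide.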
  • 9.
    1.5 Roadmap
    Figure 3: Event Management Program Roadmap
    The high-level roadmap for the IT Event Management Program has eight projects:
      1. Define the Event Management strategy and program including deliverables:
         a. Event Management Strategy
         b. Event Management Process
         c. Event Handling Policies and Standards
            i. Notification/Escalation policies and Standards
         d. Event Management projects/initiatives
         e. Event management program roadmap
      2. Establish Event Management Program through:
         a. Ratification of the event management and monitoring processes and activities
            i. Ratification of event handling policies and standards
         b. Communicate and gather support for event management program activities in collaboration with stakeholders to agree on deliverables
         c. Establish a timeline for completing the work activities and deliverables
      3. Integration with other ITIL management processes including:
         a. Incident Management for automation of incident management process activities where applicable.
            i. Automatically manage incidents from user events (transactions)
  • 10.
            ii. Automatically manage incidents from system events
            iii. Automatically manage incidents from service events
         b. Availability Management: for availability monitoring requirements of service components
         c. Capacity Management: for the capacity monitoring requirements of the service components
         d. Problem Management: for event information in the Known Error Database and for verification and audit of the root cause of problems.
      4. Integration with application design and development: for internally developed applications through IT architecture and Software Engineering
         a. Adoption of management packs for monitoring applications
            i. Microsoft Management Packs for internally developed applications on the Windows platform
         b. Propagation of configuration and status information to service views based on the service and health models.
      5. Integration with third-party applications
         a. Adoption of management pack methodology for management of events from third-party applications
            i. Create deliverables that are platform dependent
            ii. Coordinate with instrumentation and system support for instrumentation lifecycle (design, develop, test, deploy)
      6. Consolidation/Correlation of Domain level events
         a. Completion of integration of critical, major and minor events across all IT Domains to the Manager of Managers.
         b. Implementation of correlation policies/rules to forward significant events for incidents and alerts.
      7. Integration with the IT Service/Support groups through the creation and management of service views and related configuration items.
         a. Role based service dashboards for user groups
      8. Continuous process improvement
         a. Audit and verify quality and efficiency of existing event management and monitoring systems and adjust filters and correlation engines to streamline automation.
    1.5.1 Current State to Future State
    Event Management and Monitoring has been in place for years at CORPORATE. It has matured to a level where events are triggering workload and other automation/remediation, as well as automated notification/escalation. In terms of maturity level, CORPORATE is between reactive and proactive. There are specific cases where we are at the predictive level (monitoring batch), but this is the exception. There are many management/monitoring tools in place across all the IT Domains. The two major tasks that must be accomplished in order for the Event Management program to be successful are:
  • 11.
      • Mature IT monitoring from a reactive/proactive level to a proactive/predictive maturity level through automation of event responses for all significant events.
      • Consolidate and correlate all the events into meaningful status information for CIs like applications, systems and IT services.
    The biggest enabler going forward is the use of ITIL v3 as the framework for managing IT. This provides a common vernacular and helps establish accepted governance processes for event management and monitoring. Use of a framework combined with the use of the services construct to represent IT value to the business provides a new level of event management and monitoring for CORPORATE.
  • 12.
    Appendix A: A Business Value proposition for Event Management²
    In simple terms event management enables real time monitoring of the infrastructure (i.e. listening for things that are wrong), and uses event correlation to filter, de-duplicate and combine events to detect more serious issues. Event Management is important because it will:
      • Improve time to resolve through cause identification
      • Improve visibility to real time
      • Enable proactive management of impact to the business (IT calls the business)
      • Improve Security Management
    Studies show that fault detection and root-cause analysis are the most important systems management capabilities. Studies also show that the most time-consuming systems management tasks are diagnosis and troubleshooting. Event Management enables proactive responses to events and enables automatic tracking and resolution for most system events. The scenarios below show the difference when event management is implemented and when it is not³.
    ² Taken from Data Network Event Management and ITIL, CISCO, Keith Sinclair
    ³ The scenarios below use a network device issue as the example. CORPORATE is monitoring all infrastructure domains at some level as described in the Event Monitoring section of this document.
  • 13.
    Figure 4: Scenario - Situation normal (w/o Event Management)
    Figure 5: Scenario - Situation with Event Management
    The bottom line is that Event Management allows IT to resolve issues before the users are affected. Armed with reports that show the effectiveness of Event Management, IT can show the business how effective they are and demonstrate real business value.