MTBF / MTTR
Availability or recoverability?

Presented by
Michael Richardson, Energized Work
21 March 2012




ENERGIZED WORK
25 MACKLIN STREET
LONDON WC2B 5NN
+44 (0)20 7691 8933
WWW.ENERGIZEDWORK.COM
Michael Richardson
Twitter: @mr_spb
Email: michael@energizedwork.com
#ewtektalk




© 2012 Energized Work - www.energizedwork.com                                       2
So what is 
high availability?

•      Five nines?
•      No single points of failure?
•      Multiple data centres?
•      Fault tolerance?
•      Load balancing?
•      Uptime?




Nines
of availability

Availability              Downtime per year
One nine (90%)            36.5 days
Two nines (99%)           3.65 days
Three nines (99.9%)       8.76 hours
Four nines (99.99%)       52.56 minutes
Five nines (99.999%)      5.26 minutes
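
The downtime figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes a 365-day year, which is what the slide's numbers imply; the function names are illustrative and not part of the talk.

```python
# Downtime per year allowed by an availability target.
# Assumes a 365-day year, which is what the figures in the table imply.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year permitted at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def humanize(minutes: float) -> str:
    """Render a duration in the largest convenient unit (days, hours, minutes)."""
    if minutes >= 24 * 60:
        return f"{minutes / (24 * 60):.2f} days"
    if minutes >= 60:
        return f"{minutes / 60:.2f} hours"
    return f"{minutes:.2f} minutes"


# Print one row per "nine", mirroring the table.
for target in (90, 99, 99.9, 99.99, 99.999):
    print(f"{target}% -> {humanize(downtime_minutes_per_year(target))}")
```

Each extra nine divides the allowed downtime by ten, which is why the step from four to five nines leaves only about five minutes a year.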




Problem with
the nines

•  What do they mean?
•  Guaranteed or just an SLA?
•  Multiplicity: availabilities of chained components multiply (99.9% * 99.9% * 99.9% ≈ 99.7%)
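
The multiplication in the last bullet can be checked directly. This is a minimal sketch that assumes a request has to pass through every component in the chain and that the components fail independently; `serial_availability` is an illustrative name.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(availabilities)


# Three components at 99.9% each, as on the slide.
chain = [0.999, 0.999, 0.999]
print(f"{serial_availability(chain):.4%}")  # 99.7003%
```

Three "three nines" components in a row already fall short of three nines as a whole, and longer chains lose more.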




SLA availability numbers
just aim to provide a level of
confidence in a website’s service




No single point of failure
(SPOF)




Two of everything?




Start with this

                                                 Users




                                                Index.html




End with this
                          Users

          Firewall 1               Firewall 2

           Switch 1                 Switch 2

   WEB1   WEB2       APP1   APP2       DB1   DB2
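
To put rough numbers on "two of everything", the sketch below compares a single chain of five tiers with the same chain built from redundant pairs. The 99.9% per-component figure is hypothetical, and the calculation assumes that the two copies in a pair fail independently, which is an optimistic assumption in practice.

```python
def parallel_availability(a: float, copies: int = 2) -> float:
    """Availability of a tier that is up while at least one copy is up.

    Assumes the copies fail independently of each other.
    """
    return 1 - (1 - a) ** copies


def serial_availability(tiers) -> float:
    """Availability of a chain in which every tier must be up."""
    result = 1.0
    for a in tiers:
        result *= a
    return result


COMPONENT = 0.999  # hypothetical availability of each individual box
TIERS = ["firewall", "switch", "web", "app", "db"]

# One box per tier versus a redundant pair per tier.
single = serial_availability([COMPONENT] * len(TIERS))
paired = serial_availability([parallel_availability(COMPONENT)] * len(TIERS))

print(f"one of everything: {single:.4%}")
print(f"two of everything: {paired:.4%}")
```

Under these assumptions the paired design moves from roughly 99.5% to roughly 99.9995%, but the gain shrinks quickly once failures are correlated (shared power, shared configuration, the same bug on both hosts).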

Problems with
eliminating SPOF

•      It’s expensive
•      Where do you draw the line?
•      Are failures independent?
•      Can you guarantee no SPOF?
•      Increased complexity




Problem:
Data centres fail




Solution:
Get a second data centre




Hot – Hot
multisite

•      Full range of services available in multiple locations
•      Easy to automate failover of sites
•      Data consistency is hard
•      Capacity planning concerns



Hot – Warm
multisite

•  Simpler than hot – hot
•  Read / Write ratio dependent
•  Synchronously or asynchronously replicate data?




Hot – Cold
multisite

•      Easy to set up
•      Will it work?
•      Can it be trusted?
•      Cold site rapidly becomes stale
•      Is it actually valuable?


DR multisite


•  Fingers crossed you never need it
•  How can / should you test it?
•  Cloud?




Problems
with multiple sites

•      It’s expensive
•      Managing more systems
•      Managing data consistency
•      Managing capacity
•      Is it still fail proof?
•      Unless you test it, it’s just a plan





We now have
a complex system




Complex systems


•  More redundancy and automation leads to more complexity
•  More complexity often adds more points of failure





How complex systems fail
 - Dr. Richard Cook


•  Catastrophe is always just around the corner
•  Human operators have dual roles
•  Change introduces new forms of failure





Failure and recovery




Questions
for the business

•  What is the cost of downtime?
•  What are the Recovery Time Objectives (RTO)?
•  What are the Recovery Point Objectives (RPO)?




Aggressive RTO and RPO
are expensive and have a
performance impact




RTO / RPO
example

Problem:
•  Simple DB
•  Business can tolerate up to 15 minutes of downtime (RTO)
•  Business can tolerate up to a 10-minute window of data loss (RPO)




RTO / RPO
example

Possible solution:
•  Continuously replicate data to second host
•  Continue with nightly backups and also copy DB transaction logs
   from the primary host to another system
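
One way to reason about the example is to write each candidate approach down with its worst-case data loss and its expected recovery time, then compare both against the objectives. In the sketch below only the 15-minute RTO and 10-minute RPO come from the slides; the plan names and every per-plan figure are hypothetical placeholders that would have to be measured on a real system.

```python
from dataclasses import dataclass

RTO_MINUTES = 15  # from the example: tolerable downtime
RPO_MINUTES = 10  # from the example: tolerable window of data loss


@dataclass
class RecoveryPlan:
    name: str
    worst_case_data_loss_min: float  # hypothetical estimate
    recovery_time_min: float         # hypothetical estimate


def meets_objectives(plan: RecoveryPlan,
                     rto: float = RTO_MINUTES,
                     rpo: float = RPO_MINUTES) -> bool:
    """True when the plan satisfies both the RTO and the RPO."""
    return (plan.recovery_time_min <= rto
            and plan.worst_case_data_loss_min <= rpo)


# All figures below are illustrative assumptions, not measurements.
plans = [
    RecoveryPlan("nightly backup only", 24 * 60, 120),
    RecoveryPlan("nightly backup + logs shipped every 5 min", 5, 60),
    RecoveryPlan("continuous replication to second host", 1, 5),
]

for plan in plans:
    verdict = "meets" if meets_objectives(plan) else "misses"
    print(f"{plan.name}: {verdict} RTO/RPO")
```

With these placeholder numbers, log shipping alone satisfies the RPO but not the RTO, which is one reason to combine replication (for fast failover) with backups and transaction logs (for recovery from corruption or operator error).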




So what is more important –
increasing availability
or reducing recovery time?





MTBF or MTTR?


What about MTTD?
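
The trade-off can be made concrete with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The sketch below uses hypothetical figures (1000 hours between failures, 1 hour to recover) and treats detection time (MTTD) as extra downtime added on top of MTTR; that treatment is an assumption, since some definitions already fold detection into MTTR.

```python
def availability(mtbf_hours: float, mttr_hours: float,
                 mttd_hours: float = 0.0) -> float:
    """Steady-state availability: uptime / (uptime + downtime) per failure cycle.

    Assumption: time to detect (MTTD) counts as downtime in addition to MTTR.
    """
    downtime = mttd_hours + mttr_hours
    return mtbf_hours / (mtbf_hours + downtime)


# Hypothetical baseline: fail every 1000 hours, recover in 1 hour.
baseline = availability(1000, 1)
double_mtbf = availability(2000, 1)   # fail half as often
halve_mttr = availability(1000, 0.5)  # recover twice as fast
slow_detect = availability(1000, 1, mttd_hours=0.5)  # 30 minutes unnoticed

print(f"baseline:       {baseline:.4%}")  # 99.9001%
print(f"double MTBF:    {double_mtbf:.4%}")
print(f"halve MTTR:     {halve_mttr:.4%}")
print(f"slow detection: {slow_detect:.4%}")
```

Doubling MTBF and halving MTTR give the same availability, so the choice comes down to which is cheaper and more realistic to achieve; slow detection quietly erodes either gain.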




The answer is:
It depends




Failure
is inevitable




Ask anyone




License
This presentation is provided under the Creative Commons
Attribution Share Alike 3.0 Unported License.

You are free:

To share – to copy, distribute and transmit the work

To remix – to adapt the work

Under the following conditions:

Attribution – You must attribute the work in the manner specified by
Energized Work (but not in any way that suggests that Energized Work
endorse you or your use of the work).

Share Alike – If you alter, transform, or build upon this work, you may
distribute the resulting work only under the same or similar license to this
one.


Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

MTBF / MTTR - Energized Work TekTalk, Mar 2012

  • 1. MTBF / MTTR Availability or recoverability? Presented by Michael Richardson, Energized Work 21 March 2012 ENERGIZED WORK 25 MACKLIN STREET LONDON WC2B 5NN +44 (0)20 7691 8933 WWW.ENERGIZEDWORK.COM
  • 2. Michael Richardson Twitter: @mr_spb Email: michael@energizedwork.com #ewtektalk © 2012 Energized Work - www.energizedwork.com 2
  • 3. So what is high availability?
      • Five nines?
      • No single points of failure?
      • Multiple data centres?
      • Fault tolerance?
      • Load balancing?
      • Uptime?
  • 4. Nines of availability
  • 5. Nines of availability
      Availability            Downtime per year
      One nine (90%)          36.5 days
      Two nines (99%)         3.65 days
      Three nines (99.9%)     8.76 hours
      Four nines (99.99%)     52.56 minutes
      Five nines (99.999%)    5.26 minutes
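The downtime figures on slide 5 follow from simple arithmetic. A minimal sketch, assuming a 365-day year (which is what the slide's numbers imply); the function name is illustrative and not from the talk:

```python
# Minutes in a year, assuming 365 days (no leap-year adjustment).
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime allowed by an availability percentage."""
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * MINUTES_PER_YEAR


for label, percent in [
    ("One nine", 90.0),
    ("Two nines", 99.0),
    ("Three nines", 99.9),
    ("Four nines", 99.99),
    ("Five nines", 99.999),
]:
    minutes = downtime_minutes_per_year(percent)
    print(f"{label} ({percent}%): {minutes:.2f} minutes per year")
```

Dividing the result by 60 or by 1440 gives the hours and days shown in the table.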
  • 6. Problem with the nines
      • What do they mean?
      • Guaranteed or just an SLA?
      • Multiplicity (99.9% * 99.9% * 99.9% = 99.7%)
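The multiplicity point on slide 6 can be sketched as follows. This assumes the components sit in series (a request needs all of them) and fail independently; both are assumptions of the sketch, and the function name is illustrative:

```python
from math import prod


def serial_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(component_availabilities)


# Three components at "three nines" each, as on the slide.
combined = serial_availability([0.999, 0.999, 0.999])
print(f"Combined availability: {combined:.1%}")
```

Each extra component in the chain pulls the combined figure further below the weakest individual SLA.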
  • 7. SLA availability numbers just aim to provide a level of confidence in a website’s service
  • 8. No single point of failure (SPOF)
  • 9. Two of everything?
  • 10. Start with this: Users, Index.html
  • 11. End with this: Users, Firewall 1, Firewall 2, Switch 1, Switch 2, WEB1, WEB2, APP1, APP2, DB1, DB2
  • 12. Problems with eliminating SPOF
      • It’s expensive
      • Where do you draw the line?
      • Are failures independent?
      • Can you guarantee no SPOF?
      • Increased complexity
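To see why "two of everything" is attractive, and why the independence question matters, here is a minimal sketch of the textbook formula for redundant copies. It is only valid if the copies fail independently, which is exactly what slide 12 questions; the function name and the 99% figure are illustrative assumptions:

```python
def redundant_availability(single_availability: float, copies: int = 2) -> float:
    """Availability of N redundant copies where any one copy is enough.

    Assumes failures are independent: the service is down only when
    every copy is down at the same time.
    """
    all_down = (1 - single_availability) ** copies
    return 1 - all_down


# Hypothetical example: a single 99% component versus a redundant pair.
print(f"single: {0.99:.4%}")
print(f"pair:   {redundant_availability(0.99, copies=2):.4%}")
```

If both copies share a power feed, a switch, or a bad deployment, the failures are correlated and the real figure will be lower than this formula suggests.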
  • 13. Problem: Data centres fail
  • 14. Solution: Get a second data centre
  • 15. Hot – Hot multisite
      • Full range of services available in multiple locations
      • Easy to automate failover of sites
      • Data consistency is hard
      • Capacity planning concerns
  • 16. Hot – Warm multisite
      • Simpler than hot – hot
      • Read / Write ratio dependent
      • Synchronously or asynchronously replicate data?
  • 17. Hot – Cold multisite
      • Easy to set up
      • Will it work?
      • Can it be trusted?
      • Cold site rapidly becomes stale
      • Is it actually valuable?
  • 18. DR multisite
      • Fingers crossed you never need it
      • How can / should you test it?
      • Cloud?
  • 19. Problems with multiple sites
      • It’s expensive
      • Managing more systems
      • Managing data consistency
      • Managing capacity
      • Is it still fail-proof?
      • Unless you test it, it’s just a plan
  • 20. We now have a complex system
  • 21. Complex systems
      • More redundancy and automation leads to more complexity
      • More complexity often adds more points of failure
  • 22. How complex systems fail - Dr. Richard Cook
      • Catastrophe is always just around the corner
      • Human operators have dual roles
      • Change introduces new forms of failure
  • 23. Failure and recovery
  • 24. Questions for the business
      • What is the cost of downtime?
      • What are the Recovery Time Objectives (RTO)?
      • What are the Recovery Point Objectives (RPO)?
  • 25. Aggressive RTO and RPO are expensive and have a performance impact
  • 26. RTO / RPO example
      Problem:
      • Simple DB
      • Business can tolerate up to 15 minutes downtime
      • 10-minute window of data loss
  • 27. RTO / RPO example
      Possible solution:
      • Continuously replicate data to second host
      • Continue with nightly backups and also copy DB transaction logs from the primary host to another system
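The RTO / RPO example can be checked mechanically once each recovery approach has worst-case numbers attached. A minimal sketch: the 15-minute RTO and 10-minute RPO come from slide 26, while the class, the function, and every figure assigned to the two plans are hypothetical values chosen for illustration:

```python
from dataclasses import dataclass

RTO_MINUTES = 15  # from the slide: tolerable downtime
RPO_MINUTES = 10  # from the slide: tolerable window of data loss


@dataclass(frozen=True)
class RecoveryPlan:
    name: str
    worst_case_restore_minutes: float    # time to bring the service back
    worst_case_data_loss_minutes: float  # age of the newest recoverable data


def meets_objectives(plan: RecoveryPlan,
                     rto: float = RTO_MINUTES,
                     rpo: float = RPO_MINUTES) -> bool:
    """True when the plan's worst cases fit inside both objectives."""
    return (plan.worst_case_restore_minutes <= rto
            and plan.worst_case_data_loss_minutes <= rpo)


# Hypothetical figures: a nightly backup can lose up to a day of data,
# whereas replication plus frequent log copies keeps the loss window small.
nightly_only = RecoveryPlan("nightly backup only", 120, 24 * 60)
replicated = RecoveryPlan("replication plus log copies", 10, 5)

for plan in (nightly_only, replicated):
    print(f"{plan.name}: meets objectives = {meets_objectives(plan)}")
```

The real worst-case figures have to come from measured restore and failover tests, not from estimates.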
  • 28. So what is more important – increasing availability or reducing recovery time?
  • 29. MTBF or MTTR? What about MTTD?
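The trade-off behind slides 28 and 29 can be made concrete with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). A minimal sketch; counting detection time (MTTD) as part of each outage is an assumption of this sketch rather than something the slides state, and the hours used are hypothetical:

```python
def availability(mtbf_hours: float, mttr_hours: float,
                 mttd_hours: float = 0.0) -> float:
    """Steady-state availability from mean time between failures and
    mean time to repair.

    Assumption: time to detect a failure is added to the repair time,
    because the service is down during both.
    """
    downtime_per_failure = mttd_hours + mttr_hours
    return mtbf_hours / (mtbf_hours + downtime_per_failure)


baseline = availability(mtbf_hours=1000, mttr_hours=1)
fewer_failures = availability(mtbf_hours=2000, mttr_hours=1)    # double MTBF
faster_recovery = availability(mtbf_hours=1000, mttr_hours=0.5)  # halve MTTR
slow_detection = availability(mtbf_hours=1000, mttr_hours=1, mttd_hours=1)

print(f"baseline:        {baseline:.5%}")
print(f"double MTBF:     {fewer_failures:.5%}")
print(f"halve MTTR:      {faster_recovery:.5%}")
print(f"1h to detect:    {slow_detection:.5%}")
```

Only the ratio of downtime to MTBF matters in this formula, so doubling MTBF and halving MTTR give the same availability. Which of the two is cheaper to achieve is a separate question.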
  • 30. The answer is: It depends
  • 31. Failure is inevitable
  • 32. Ask anyone
  • 33. License
      This presentation is provided under the Creative Commons Attribution Share Alike 3.0 Unported License.
      You are free:
      • To share – to copy, distribute and transmit the work
      • To remix – to adapt the work
      Under the following conditions:
      • Attribution – You must attribute the work in the manner specified by Energized Work (but not in any way that suggests that Energized Work endorse you or your use of the work).
      • Share Alike – If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.