1. MTBF / MTTR
Availability or recoverability?
Presented by
Michael Richardson, Energized Work
21 March 2012
ENERGIZED WORK
25 MACKLIN STREET
LONDON WC2B 5NN
+44 (0)20 7691 8933
WWW.ENERGIZEDWORK.COM
2. Michael Richardson
Twitter: @mr_spb
Email: michael@energizedwork.com
#ewtektalk
© 2012 Energized Work - www.energizedwork.com 2
3. So what is high availability?
• Five nines?
• No single points of failure?
• Multiple data centres?
• Fault tolerance?
• Load balancing?
• Uptime?
5. Nines of availability
Availability             Downtime per year
One nine (90%)           36.5 days
Two nines (99%)          3.65 days
Three nines (99.9%)      8.76 hours
Four nines (99.99%)      52.56 minutes
Five nines (99.999%)     5.26 minutes
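The downtime figures follow directly from the availability percentage. A minimal sketch, assuming a 365-day year (the function name `downtime_hours_per_year` is illustrative, not from the talk):

```python
HOURS_PER_YEAR = 365 * 24  # assumes a 365-day year, which matches the table


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year allowed at the given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for pct in (90, 99, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}%: {hours / 24:.2f} days = {hours:.2f} hours = {hours * 60:.2f} minutes")
```

Each extra nine divides the allowed downtime by ten, which is why the units drop from days to hours to minutes.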
6. Problem with the nines
• What do they mean?
• Guaranteed or just an SLA?
• Multiplicity: availabilities of components in series multiply (99.9% * 99.9% * 99.9% ≈ 99.7%)
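The multiplicity point can be checked with a few lines. This is a sketch that assumes every component in the chain must be up and that failures are independent; the three-tier chain is an illustrative example, not a system described in the talk:

```python
from math import prod


def serial_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes component failures are independent of each other.
    """
    return prod(component_availabilities)


# Illustrative chain: three components, each at three nines.
chain = [0.999, 0.999, 0.999]
print(f"End-to-end availability: {serial_availability(chain):.4%}")
```

The more components a request has to pass through, the further the end-to-end figure falls below the figure quoted for any single component.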
8. No single point of failure (SPOF)
10. Start with this
Users
Index.html
11. End with this
Users
Firewall 1 Firewall 2
Switch 1 Switch 2
WEB1 WEB2 APP1 APP2 DB1 DB2
12. Problems with eliminating SPOF
• It’s expensive
• Where do you draw the line?
• Are failures independent?
• Can you guarantee no SPOF?
• Increased complexity
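The question of independence matters because the usual redundancy arithmetic depends on it. A rough sketch, assuming independent failures and perfect failover for the redundant pair; the function names and the 99%, 99.9% figures are hypothetical:

```python
def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` interchangeable components where any one suffices.

    Assumes failures are independent and failover is instant and perfect.
    """
    return 1 - (1 - single) ** copies


def with_shared_dependency(single: float, copies: int, shared: float) -> float:
    """Same redundant group, but behind one dependency that all copies share."""
    return redundant_availability(single, copies) * shared


pair = redundant_availability(0.99, 2)               # two 99% servers
capped = with_shared_dependency(0.99, 2, 0.999)      # same pair, one shared 99.9% component
print(f"Independent pair:       {pair:.4%}")
print(f"With shared dependency: {capped:.4%}")
```

Under these assumptions, anything the redundant copies have in common (power, network, a deployment mistake) caps the result below the availability of that shared element, however many copies are added.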
15. Hot – Hot multisite
• Full range of services available in multiple locations
• Easy to automate failover of sites
• Data consistency is hard
• Capacity planning concerns
16. Hot – Warm multisite
• Simpler than hot – hot
• Read / Write ratio dependent
• Synchronously or asynchronously replicate data?
17. Hot – Cold multisite
• Easy to set up
• Will it work?
• Can it be trusted?
• Cold site rapidly becomes stale
• Is it actually valuable?
18. DR multisite
• Fingers crossed you never need it
• How can / should you test it?
• Cloud?
19. Problems with multiple sites
• It’s expensive
• Managing more systems
• Managing data consistency
• Managing capacity
• Is it still fail proof?
• Unless you test it, it’s just a plan
20. We now have a complex system
21. Complex systems
• More redundancy and automation leads to more complexity
• More complexity often adds more points of failure
22. How complex systems fail
- Dr. Richard Cook
• Catastrophe is always just around the corner
• Human operators have dual roles
• Change introduces new forms of failure
24. Questions for the business
• What is the cost of downtime?
• What are the Recovery Time Objectives (RTO)?
• What are the Recovery Point Objectives (RPO)?
25. Aggressive RTO and RPO are expensive and have a performance impact
26. RTO / RPO example
Problem:
• Simple DB
• Business can tolerate up to 15 minutes of downtime (RTO)
• Business can tolerate a 10-minute window of data loss (RPO)
27. RTO / RPO example
Possible solution:
• Continuously replicate data to second host
• Continue with nightly backups and also copy DB transaction logs from the primary host to another system
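One way to reason about the example is to compare each recovery option's worst case against the two objectives. In the sketch below only the 15-minute and 10-minute targets come from the slides; the `RecoveryOption` type and every per-option figure are hypothetical and would need to be measured for a real system:

```python
from dataclasses import dataclass

RTO_MINUTES = 15  # maximum tolerable downtime, from the example
RPO_MINUTES = 10  # maximum tolerable window of data loss, from the example


@dataclass
class RecoveryOption:
    name: str
    worst_case_recovery_min: float   # time to detect the failure and restore service
    worst_case_data_loss_min: float  # age of the newest data guaranteed to survive


def meets_objectives(option: RecoveryOption) -> bool:
    """True only if the option satisfies both the RTO and the RPO."""
    return (option.worst_case_recovery_min <= RTO_MINUTES
            and option.worst_case_data_loss_min <= RPO_MINUTES)


# All figures below are hypothetical, for illustration only.
options = [
    RecoveryOption("nightly backup only", 120, 24 * 60),
    RecoveryOption("nightly backup + logs shipped every 5 min", 60, 5),
    RecoveryOption("continuous replication to second host", 10, 1),
]

for option in options:
    verdict = "meets" if meets_objectives(option) else "misses"
    print(f"{option.name}: {verdict} RTO/RPO")
```

With these assumed numbers, shipped transaction logs satisfy the RPO but restoring from them is too slow for the RTO, so they act as a fallback behind the replicated host rather than a replacement for it.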
28. So what is more important – increasing availability or reducing recovery time?
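The question can be framed with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). A small sketch with hypothetical figures (1000 hours between failures, 1 hour to repair):

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical baseline: a failure every 1000 hours, 1 hour to recover.
baseline = availability(1000, 1)
doubled_mtbf = availability(2000, 1)    # fail half as often
halved_mttr = availability(1000, 0.5)   # recover twice as fast

print(f"Baseline:     {baseline:.5%}")
print(f"Doubled MTBF: {doubled_mtbf:.5%}")
print(f"Halved MTTR:  {halved_mttr:.5%}")
```

In this model, halving recovery time improves availability exactly as much as doubling the time between failures, so the choice comes down to which of the two is cheaper and more realistic to achieve.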
33. License
This presentation is provided under the Creative Commons
Attribution Share Alike 3.0 Unported License.
You are free:
To share – to copy, distribute and transmit the work
To remix – to adapt the work
Under the following conditions:
Attribution – You must attribute the work in the manner specified by
Energized Work (but not in any way that suggests that Energized Work
endorse you or your use of the work).
Share Alike – If you alter, transform, or build upon this work, you may
distribute the resulting work only under the same or similar license to this
one.