This document discusses disaster recovery best practices from Plan B Disaster Recovery Ltd. It emphasizes the importance of testing disaster recovery plans regularly, as most failures occur during initial tests. It also recommends automating recovery plans as much as possible and prioritizing a fast recovery time. The document discusses how Plan B's pre-recovery service can provide a tested hot standby system to enable recovery within minutes at a lower cost than traditional disaster recovery methods. It provides two case studies where Plan B's pre-recovery system successfully enabled fast recovery and minimal business disruption following an IT disaster.
2. Agenda
• Introduction
• Do you need IT Disaster Recovery protection?
• Best practice planning for IT disasters
• Why testing is so important
• Latest technologies
• Case studies
• Q&A
• Close
3. About Plan B Disaster Recovery
• Plan B is a specialist IT disaster recovery service provider.
• We have a 100% recovery record and a 100% customer satisfaction record.
• We guarantee instant recovery and availability of core business systems.
• We are technical leaders in our field pioneering Pre-recovery.
• Named Specialist Business Continuity Company of the year 2012
• We understand security & service management
• We protect large and small customers and simple and complex environments
• We provide subcontract DR services to other service providers
4. Why have a robust disaster recovery provision?
• Disasters do happen and particularly to IT systems!
• IT disasters are unpredictable and can affect any business
• Reliance on IT is so complete that most businesses can’t operate without it, and business and reputational damage quickly mounts up
• 80% of businesses affected by a major incident close within 18 months*
• 90% of businesses that lose data from a disaster close within 2 years*
5. Why do plans and provisions often fail?
• No one wants to pay for DR.
• Tendency to be optimistic in DR planning –
supported by some suppliers. Nothing will
be as easy as was assumed.
• DR plans become complicated, lots of
manual requirements, difficult to build and
maintain and ruinously costly to test
properly – so they almost never are.
Compared to the costs of a disaster, a good DR provision will seem very cheap.
6. What is the nature of an IT Disaster?
• Like all catastrophes, IT disasters are as often the result of multiple failures as of one biblical event.
• Loss of IT systems will rapidly affect and paralyse the entire organisation.
• Extent of problem may be very hard to assess
and most information needed to recover will
be out of date or unavailable.
• The unexpected will happen.
• Uncertainty will prevent decisive business decision making.
• Many things will be harder than expected.
• Issues can multiply and a crisis can quickly turn into chaos.
7. Best Practice in preparing a DR Plan
• Don’t start by assuming things will work.
• Focus on ‘Return to service’.
• Aim for as much certainty as possible – this will empower you
and enable you to take back the initiative.
• Keep your recovery plan as simple as possible – KISS
• Automate as much of your plan as possible.
• Aim for reasonable Speed of recovery – especially for
communications systems, enabling you to be more decisive
and effective.
• Prioritise testing as a fundamental control.
• Review and test as an on-going process
8. Testing – why is it so important?
• The only way to guarantee things will work. Many fail – will yours?
• Good testing will uncover unknown issues
• Our first test recoveries generate tens to hundreds of errors which need
manual intervention - this takes hours. If you haven’t carried out a full
recovery exercise then you will have long recovery times.
• IT systems change constantly, so testing needs to be done regularly.
• Tests should be full end to end recoveries, not just checking you can
read your backup discs.
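As a rough illustration of an automated end-to-end test, the sketch below runs a list of recovery checks, carries on after a failure so that every issue is recorded, and times the whole exercise. It is a minimal sketch, not Plan B's tooling: the function `run_recovery_test` and the two sample checks are hypothetical placeholders that you would replace with real checks against your recovered systems.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def run_recovery_test(
    checks: List[Tuple[str, Callable[[], None]]]
) -> Tuple[List[CheckResult], float]:
    """Run every check, even after a failure, and time the whole exercise."""
    started = time.monotonic()
    results = []
    for name, check in checks:
        try:
            check()  # a check signals failure by raising an exception
            results.append(CheckResult(name, True, "ok"))
        except Exception as exc:
            results.append(CheckResult(name, False, str(exc)))
    elapsed_seconds = time.monotonic() - started
    return results, elapsed_seconds


# Hypothetical placeholder checks (assumptions for illustration only).
def backup_readable() -> None:
    pass  # e.g. open the latest backup and read a known file


def mail_login_works() -> None:
    # Simulates the kind of error a first test recovery tends to uncover.
    raise RuntimeError("login refused on recovered mail server")


results, elapsed_seconds = run_recovery_test([
    ("backup readable", backup_readable),
    ("mail login works", mail_login_works),
])
for r in results:
    print(f"{'PASS' if r.passed else 'FAIL'}  {r.name}: {r.detail}")
print(f"Checks took {elapsed_seconds:.1f}s")
```

Reading the backup is only the first check here; the later checks are the ones that exercise the recovered service the way a user would.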
9. Recovery times
‘Recovery time’ is interpreted in many ways: ‘Time to respond’, ‘Time to get a bootable solution’, ‘Time to meet minimum recovery objectives’, etc.
Focus on ‘Return to Service’
Clarify what recovery time means to you and set your own clear
business driven objectives to be met by suppliers.
Like all good SLAs the penalties for breaching should be meaningful.
IT downtime costs can be calculated using our simple IT downtime
calculator on our website. Is it cost effective to reduce your recovery
time further?
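To make the cost question concrete, here is a minimal sketch of a downtime cost estimate. It is not the calculator on the Plan B website: the formula (lost revenue plus lost staff productivity) and every figure in the example are assumptions chosen purely for illustration.

```python
def downtime_cost(
    hours_down: float,
    revenue_per_hour: float,
    staff_affected: int,
    staff_cost_per_hour: float,
    productivity_loss: float = 1.0,
) -> float:
    """Estimate the cost of an outage (assumed model, not Plan B's).

    productivity_loss is the fraction of staff time lost, from 0.0 to 1.0.
    """
    lost_revenue = hours_down * revenue_per_hour
    lost_productivity = (
        hours_down * staff_affected * staff_cost_per_hour * productivity_loss
    )
    return lost_revenue + lost_productivity


def saving_from_faster_recovery(
    current_hours: float, target_hours: float, **business
) -> float:
    """Cost avoided per incident by cutting recovery time."""
    return downtime_cost(current_hours, **business) - downtime_cost(
        target_hours, **business
    )


# Hypothetical business: all figures below are made up for the example.
business = dict(revenue_per_hour=2000.0, staff_affected=50, staff_cost_per_hour=30.0)

print(downtime_cost(8, **business))                      # 8-hour outage
print(saving_from_faster_recovery(8, 0.5, **business))   # 8 hours cut to 30 minutes
```

Comparing the saving per incident with the extra annual cost of a faster service gives a first answer to whether reducing the recovery time further is worthwhile.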
10. Virtual / Cloud based recovery services
• Virtualisation has made recovering to different hardware much
easier and enabled automation of the process
• Virtualisation has also made it practical to be able to syndicate
use of hardware and services whilst delivering a very high service
level.
• Cloud platforms potentially cut costs.
• Different offerings and different service levels
• Most use replication techniques run by the customer
• Testing still mostly manual
Affordable DR solutions that work are now possible but don’t be fooled –
easier doesn’t mean it will always work.
12. Benefits of Pre-recovery
• Guaranteed recovery within minutes
• Simplicity - set up, on-going protection, BCP Planning
• Very cost effective
• Minimum IT down-time - Minimum business disruption
• Simple invocation process
• Puts organisations back in control rapidly
• Handle IT disaster rather than be subject to it
• Protects from operational and reputational damage
13. Disasters do happen – SAN Failure
20th December 2013 – A financial services customer had a degraded SAN that failed completely within 8 hours. The vendor was unable to fix it within the guaranteed time.
Recovery systems available within 30 minutes. Users attached via
VPN immediately.
The customer ran on Plan B platform for 10 days.
Although the vendor eventually fixed the SAN, the customer no longer trusted the hardware, and Plan B provided an ‘Export server’ solution as a temporary local fix.
They were still running on the export server two months later.
• Catastrophic hardware failures and, to a lesser degree, software failures are a common cause of IT disasters
• The Plan B solution worked quickly when everything else had failed, empowering the company to start solving the business issues.
• The approach enabled a strategic return to live
14. Disasters do happen – Fire
9pm, 15th April 2013 – Fire in Electrical riser caused
extensive damage to the building, total loss of electrical
infrastructure and smoke damage to IT equipment.
Recovery systems available within 40 minutes & 200 users
attached via VPNs next morning
The business continued to service their core customers
whilst IT staff started to recover live services.
The customer ran on Plan B’s recovery platform for 9
months.
They believe the fire had no impact on full-year results.
• Pre-recovery worked quickly and easily, providing all core communications and information systems within minutes and empowering the company to immediately start solving the business issues.
• There were many uncertainties and problems that would have made any ‘rebuild’ style of recovery too slow and difficult.
• The approach enabled a strategic rebuild to live
15. Resources
• Guide to cost vs performance
• Pre-recovery whitepaper
• Disaster Recovery Survey Results
• DR Plan Template
• IT Downtime Calculator
www.planb.co.uk
@PlanB_DR
info@planb.co.uk