5. Servers
   - Over 200 standalone
   - Virtualisation – 200 into 20 will go!
     - 9 new Host Servers, holding 155 Virtual Servers
     - Power Savings
     - Space Savings
   - Resilience??
6. Storage
   - 60TB of data (100,000 CDs)
   - 10GB per staff
   - Resilience??
17. Let's Build Another One..!
   - Luverly!
   - Production Line... bit by bit...
18. Now to Restore Services!
   University Gold Team (Chaired by the VC)
   - Business Continuity and Recovery
   - Prioritising Services
   - Tracking Progress
   - Communicating
   - Regular meetings, 29 Nov to 15 Dec
   ISD Contingency Team
   - Recovery and Business Continuity
   - Mapping Service Dependencies
   - Managing Resources (people, procurement, time)
   - Directing operations
   - Dealing with Insurance Claim
   Lots of staff involved
   - Everyone in the department had a part to play.
19. Now to Restore Services!
   Scale of Operation
   - 165 Servers destroyed
   - 121 Live Services
     - Core Services – 39 (Telephone, Web Site, Email, VLE...)
     - Non Core Services – 82 (Tills, HR, Invoicing...)
   - 20 Test & Development Environments
   Process
   - Cleaning the room and salvaging equipment
   - Limiting further risk by removing the cause
   - Identifying what services were working (not working)
   - Recovering services by alternative means (where we could)
   - Procuring equipment prior to the rebuild
   - Building a new server infrastructure
   - Recovering services by priority
   - Keeping the Gold Team informed
21. What Next?
   - Options Paper
   - DISAG
   - Independent Review
   - Prof David Baker
   - Secondary Server Room
   - External Services?
22. Lessons Learnt – Management Perspective.
   People
   - Successful recovery is based on staff goodwill, commitment and professionalism.
   - Having and maintaining good relationships with suppliers.
   - Having a strong recovery team with management, operational and administration experience.
   - Having the Gold Team to agree priorities.
   - Everyone wants to help!
   Communications
   - Having a contacts list to get hold of key staff and key suppliers.
   - People are patient and will wait for their systems if they understand the situation.
   - The value of having a staff and student portal (especially when you don't have it!).
   - The value of Facebook to get messages out to staff and students.
   - Sharing personal emails and mobile phone numbers to ease communication.
   - Communicating 'what is happening with the recovery process' is important for your own department staff.
   - Tempering expectations by communicating the right message to the organisation and customers.
23. Lessons Learnt – Management Perspective.
   Inventory
   - Keeping an itemised list of the equipment and parts held in your Data Centre will allow you to replace equipment quickly.
   - Having a list of core services and their dependencies so that you can agree priorities for restoring.
   Resilience
   - Don't put all your eggs in one basket.
   - Don't keep your backup/restore device in the same building.
   - Never put equipment in front of a room cooling system which has a fan that is capable of blowing water across the room.
   - Never assume that because there is no water in the data centre, water cannot find a way into the building.
   Procurement
   - Having the ability to raise orders quickly.
   - Using existing framework agreements to reduce the time taken by procurement and European competition.
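   The inventory lesson above (a list of core services and their dependencies, used to agree restore priorities) can be sketched in a few lines of Python. This is only an illustration, not the process ISD followed: the service names come from the earlier slides, but "Network" and "Directory", every dependency shown, and the `restore_order` helper are hypothetical. Within each round of services whose dependencies are already restored, core services are listed first.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical inventory: service -> (is_core, services it depends on).
# The dependencies are invented for illustration only.
SERVICES = {
    "Network":   (True,  []),
    "Directory": (True,  ["Network"]),
    "Email":     (True,  ["Directory"]),
    "Web Site":  (True,  ["Network"]),
    "VLE":       (True,  ["Directory", "Web Site"]),
    "HR":        (False, ["Directory"]),
    "Invoicing": (False, ["HR"]),
}

def restore_order(services):
    """Return a restore order that respects dependencies.

    Within each batch of services that are ready to restore,
    core services come first, then names in alphabetical order.
    """
    graph = {name: deps for name, (_, deps) in services.items()}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    order = []
    while sorter.is_active():
        ready = sorter.get_ready()
        for name in sorted(ready, key=lambda n: (not services[n][0], n)):
            order.append(name)
            sorter.done(name)
    return order

if __name__ == "__main__":
    for step, name in enumerate(restore_order(SERVICES), start=1):
        kind = "core" if SERVICES[name][0] else "non-core"
        print(f"{step}. {name} ({kind})")
```

   Keeping the list as plain data means it can be reviewed and agreed by a group such as the Gold Team before an incident, and the order regenerated whenever a service or dependency changes.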
24. Lessons Learnt – Management Perspective.
   Operations
   - Keep a log of all decisions and actions taken.
   - If there is a risk, don't delay in dealing with it.
   - Ensure that every system is backed up.