Best Practices for Surviving Outages


Site disruptions happen, often when you least expect them. When your business depends on application uptime or access to critical data, a strategy for high availability (HA) and disaster recovery (DR) is essential. Carefully architecting and implementing an HA and DR strategy helps you minimize risk, strengthen fault tolerance, and rapidly redeploy your application and data after a disruption. This presentation gives an overview of HA and DR and offers some best practices from the Engine Yard team. The full on-demand webcast can be viewed here: http://pages.engineyard.com/BestPracticesforSurvivingOutagesWebcast.html


Best Practices for
Surviving Outages
Designing and implementing a High Availability
and Disaster Recovery strategy

Sal Cardello, Director of Pro Services
Matt Dolian, System Engineer
Avroham Katz, System Engineer

Disaster Recovery

Photo credit: naturaldisasterss.com/wp-content/uploads/2011/12/Natural-Disaster-Images.jpg

Tiers of Disaster Recovery

0 - No off-site data
1 - Data backup with no hot site
2 - Data backup with hot site
3 - Electronic vaulting
4 - Point-in-time copies
5 - Transaction integrity
6 - Zero or near-Zero data loss
7 - Highly automated, business integrated solution
Citation: http://en.wikipedia.org/wiki/Seven_tiers_of_disaster_recovery

Definition: High Availability

“Design approach & associated service implementation that ensures a prearranged level of operational performance will be met during a contractual measurement period”

Citation: http://en.wikipedia.org/wiki/High_availability
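
The "prearranged level" in this definition is usually stated as an availability percentage over the measurement period. As a minimal sketch, assuming a 99.9% target and a 30-day period (both illustrative values, not figures from the presentation), the downtime budget can be worked out as follows. The function names are hypothetical.

```python
from datetime import timedelta


def allowed_downtime(target_percent: float, period: timedelta) -> timedelta:
    """Downtime budget implied by an availability target over a period."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    return period * (1 - target_percent / 100)


def measured_availability(downtime: timedelta, period: timedelta) -> float:
    """Availability actually achieved, as a percentage of the period."""
    return 100 * (1 - downtime / period)


# Assumed example values: 99.9% target, 30-day measurement period.
budget = allowed_downtime(99.9, timedelta(days=30))
print(f"Downtime budget: {budget.total_seconds() / 60:.1f} minutes")
```

Under these assumptions the budget is roughly 43 minutes for the period, which is why a manual failover procedure has to be rehearsed before it is needed.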

Best Practices for High Availability

• Environment Analysis
• Geographic Mirroring
• Database Replication
• Store Assets Replication
• Validate Synchronization
• Escalation Plan
• Test
• Launch

Photo Credit: http://bit.ly/z9OEwG

Application Considerations

• Environment Specific Configurations

• Asset Hosting

• Page Caching

• Other Data Stores

• Background Processing

• Cron Jobs
Photo credit: http://www.flickr.com/photos/dseneste/5912382808/

Failover Process at Engine Yard

Manual, customer-owned decision
1. Client contacted per terms of SLA
2. Engine Yard syncs database and performs manual failover
3. Redundant database promoted to master
4. DNS is updated
5. Replication to former master is re-established
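
The ordering of these steps can be sketched as a small runbook helper. This is a minimal illustration, not Engine Yard's tooling: `run_failover` and the placeholder actions are hypothetical, and each lambda stands in for a task a DBA or operator performs by hand.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_failover(steps: List[Step], customer_approved: bool) -> List[str]:
    """Run failover steps in order and stop at the first one that fails.

    Nothing runs without customer approval, mirroring the
    "manual, customer-owned decision" rule. Returns a log of outcomes.
    """
    if not customer_approved:
        return ["aborted: customer has not approved failover"]
    log = []
    for name, action in steps:
        ok = action()
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            break
    return log


# Placeholder actions (assumptions): replace each lambda with a real check.
steps = [
    ("sync database", lambda: True),
    ("promote redundant database to master", lambda: True),
    ("update DNS", lambda: True),
    ("re-establish replication to former master", lambda: True),
]

for entry in run_failover(steps, customer_approved=True):
    print(entry)
```

Stopping at the first failed step keeps DNS from being pointed at a database that was never promoted.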


Get in touch

Contact us:
Sal Cardello, Director of Pro Services
proservices@engineyard.com

Learn more:
http://www.engineyard.com/services


5. High Availability Architecture

6. Why implement HA?


10. Questions?


Editor's notes

Introduction, roles and titles: Melissa, Sal, Avrohom, Matt
What is Disaster Recovery? The process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.
Seven Tiers of Disaster Recovery
0: No off-site data (possibly no recovery)
1: Data backup with no hot site
2: Data backup with a hot site
3: Electronic vaulting
4: Point-in-time copies
5: Transaction integrity
6: Zero or near-zero data loss
7: Highly automated, business integrated solution
High Availability is a system design approach and associated service implementation that ensures a prearranged level of operational performance will be met during a contractual measurement period.
Sal to explain, Matt to cover diagram.
Avrohom to talk about complexity.
Why should we implement an H/A environment?
• Reduced revenue loss
• More consistent uptime
• Higher client satisfaction
• Better level of protection for critical systems
• Insurance
Things to know up front about implementing an H/A environment:
• Cost
• Additional complexity
Avrohom: implementation of a High Availability system
• Needs assessment
• H/A is implemented using geo-redundant systems
• Databases are kept in sync using replication
• Assets are ideally stored on a storage system such as Amazon S3 but can be kept in sync using rsync
• File system synchronized between locations
• Code is deployed to both systems
• Stack changes applied to both systems
• Create escalation flow chart
• Bring up secondary site
• One-week test cycle
• Failover test
• Live
• Environment configs: stored as a template in Chef; stored on the filesystem and symlinked on deploy with Capistrano
• Asset hosting: assets must be synced if stored locally, which adds complexity and strain on resources
• Page caching: sync the page cache to prevent higher response times as the cache warms
• Other data stores: dump and sync data at select intervals and during failover
• Background processing: wait for jobs to finish when failing over; consider where jobs are stored
• Cron jobs: use a gem such as whenever to automate cron jobs
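
The note on background processing ("wait for jobs to finish when failing over") can be illustrated with a simple drain loop. This is a sketch under assumptions: `wait_for_jobs` is a hypothetical helper, and `in_flight` stands in for whatever call reports the number of running jobs in your queue system.

```python
import time
from typing import Callable


def wait_for_jobs(
    in_flight: Callable[[], int],
    timeout_s: float = 300.0,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until no background jobs are running.

    Returns True once the queue is drained, or False if jobs are still
    running after roughly timeout_s seconds of waiting.
    """
    waited = 0.0
    while in_flight() > 0:
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
    return True


# Simulated queue (assumption): three jobs that finish one per poll.
remaining = iter([3, 2, 1, 0])
drained = wait_for_jobs(lambda: next(remaining), timeout_s=10, sleep=lambda s: None)
print("safe to fail over" if drained else "jobs still running")
```

The timeout matters because a stuck job should not block a failover indefinitely; what to do with unfinished jobs depends on where they are stored.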
• The decision to fail over is mutual
• No automatic failover
• A DBA is brought in to check the state of the database and perform the manual failover
• Client uptime needs are designated in the client flow chart
• After the decision to fail over is made, a DBA promotes the redundant database to master
• After a quick test of the redundant system, DNS is updated
• A low TTL should be set
• DNS load balancing such as DynECT Managed DNS can be used to minimize downtime during the IP switch
• Re-establish: when the former master environment is back online, configure the former master database as a read-only slave
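
The advice to set a low TTL can be made concrete with simple arithmetic. The sketch below assumes resolvers honor the record's TTL (some do not), and the function name and all durations are illustrative values rather than figures from the presentation.

```python
def worst_case_cutover_seconds(
    promote_s: float, smoke_test_s: float, dns_ttl_s: float
) -> float:
    """Upper bound on the time from the failover decision until clients
    that honor the TTL resolve the new site's address."""
    return promote_s + smoke_test_s + dns_ttl_s


# Assumed durations: 2 minutes to promote the database, 1 minute to test.
one_hour_ttl = worst_case_cutover_seconds(120, 60, 3600)
one_minute_ttl = worst_case_cutover_seconds(120, 60, 60)
print(f"TTL 3600s: up to {one_hour_ttl / 60:.0f} min; "
      f"TTL 60s: up to {one_minute_ttl / 60:.0f} min")
```

With these assumed numbers the TTL dominates the cutover time, so it has to be lowered before an outage, not during one.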