AWS Summit 2011: Designing Fault Tolerant Applications
2. Building Fault-Tolerant Applications on AWS
White paper published last year
Sharing best practices
We’d like to hear your best practices as well
http://media.amazonwebservices.com/AWS_Building_Fault_Tolerant_Applications.pdf
Copyright © 2011 Amazon Web Services
3. AWS Fault-Tolerant Building Blocks
Two approaches:
1) AWS services that are inherently fault-tolerant and highly available:
• Amazon Simple Storage Service (S3)
• Amazon SimpleDB
• Amazon SQS, SNS, SES, CloudWatch, CloudFront, and more.
2) AWS services that offer tools and features to design fault-tolerant and highly available systems:
• Amazon Elastic Compute Cloud (EC2)
– Availability Zones, Elastic IPs, EBS, etc.
– Flexible to trade off budget vs. time to recovery
• Amazon Relational Database Service (RDS)
– Multi-AZ Deployments
– Backup/Restore
4. Amazon EC2 Architecture
[Diagram components: Amazon Region; Availability Zone; Machine Image (AMI); EC2 Instance; Ephemeral Storage; Elastic Block Storage (EBS); Security Group(s); Elastic IP Address; CloudWatch; Auto Scaling; Load Balancing; EBS Snapshots; Amazon S3]
5. EC2 Features
AMI
Packaged, reusable functionality
On-Instance Storage
Lifetime tied to instance lifetime
Annual failure rate (AFR) similar to a standard hard disk (around 5%)
EBS Volumes
Lifetime independent of any particular EC2 instance
Redundant within an AZ
AFR is 0.1% to 0.5%
Incorporate volume mappings into your architecture
Use EBS snapshot backups
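To make the annual failure rate (AFR) figures above concrete, here is a minimal arithmetic sketch. The fleet size of 100 is a hypothetical number chosen only for illustration, and the calculation assumes failures are independent, which the slide does not state.

```python
def expected_annual_failures(device_count: int, afr: float) -> float:
    """Expected number of devices that fail in one year at the given AFR."""
    return device_count * afr


def prob_at_least_one_failure(device_count: int, afr: float) -> float:
    """Chance that at least one device fails in a year (assumes independence)."""
    return 1.0 - (1.0 - afr) ** device_count


FLEET_SIZE = 100  # hypothetical fleet size, not from the slides

# AFR values taken from the slide: ~5% on-instance, 0.1% to 0.5% for EBS.
for label, afr in [("on-instance storage", 0.05),
                   ("EBS volume (low end)", 0.001),
                   ("EBS volume (high end)", 0.005)]:
    expected = expected_annual_failures(FLEET_SIZE, afr)
    at_least_one = prob_at_least_one_failure(FLEET_SIZE, afr)
    print(f"{label}: expected failures/year={expected:.1f}, "
          f"P(at least one)={at_least_one:.3f}")
```

Even at the lower EBS rates, some failures are expected in a large enough fleet, which is why the slide still recommends snapshot backups.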
6. EC2 Features
Elastic IP Addresses
Map to any EC2 instance within a given Region
Detach from failed instance; map to replacement
Auto Scaling
Two ways to use it:
• Respond to changing conditions by adding or terminating EC2 instances (attach to CloudWatch metrics)
• Maintain a fixed number of instances running, replacing them if they fail or become unhealthy
Reserved Instances
Guarantee capacity when it is needed
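The Elastic IP pattern above (detach from a failed instance, map to a replacement) can be sketched as a small in-memory simulation. This is not the AWS API: `ElasticIpTable`, `remap_if_failed`, the instance IDs, and the address are all hypothetical names used only to show the control flow.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


@dataclass
class ElasticIpTable:
    """Toy stand-in for a Region's address-to-instance mappings."""
    mappings: Dict[str, str] = field(default_factory=dict)

    def associate(self, address: str, instance_id: str) -> None:
        # Associating an address replaces any previous mapping for it.
        self.mappings[address] = instance_id


def remap_if_failed(table: ElasticIpTable, address: str,
                    instances: Dict[str, Instance], replacement_id: str) -> str:
    """Keep the address on a healthy instance, else move it to the replacement."""
    current_id = table.mappings.get(address)
    current = instances.get(current_id) if current_id is not None else None
    if current is not None and current.healthy:
        return current_id
    table.associate(address, replacement_id)
    return replacement_id


instances = {"i-primary": Instance("i-primary"), "i-standby": Instance("i-standby")}
table = ElasticIpTable()
table.associate("203.0.113.10", "i-primary")

instances["i-primary"].healthy = False  # simulate an instance failure
serving = remap_if_failed(table, "203.0.113.10", instances, "i-standby")
print(serving)
```

Clients keep using the same address throughout; only the mapping behind it changes.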
8. EC2 Features
Elastic Load Balancing
Distributes incoming traffic across multiple instances
Sends traffic only to healthy instances
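As a rough illustration of the two behaviors above, the sketch below round-robins requests across backends and skips any that fail a health check. It is a toy model, not how Elastic Load Balancing is implemented; the class name, backend names, and health-check callback are assumptions made for the example.

```python
from typing import Callable, Iterable, List


class HealthAwareBalancer:
    """Round-robin over backends, skipping those reported unhealthy."""

    def __init__(self, backends: Iterable[str],
                 is_healthy: Callable[[str], bool]) -> None:
        self._backends: List[str] = list(backends)
        self._is_healthy = is_healthy
        self._next = 0  # index of the backend to try first on the next pick

    def pick(self) -> str:
        count = len(self._backends)
        for offset in range(count):
            index = (self._next + offset) % count
            backend = self._backends[index]
            if self._is_healthy(backend):
                self._next = (index + 1) % count
                return backend
        raise RuntimeError("no healthy backends available")


healthy_now = {"web-a", "web-c"}  # hypothetical health-check results
balancer = HealthAwareBalancer(["web-a", "web-b", "web-c"],
                               lambda name: name in healthy_now)
picks = [balancer.pick() for _ in range(4)]
print(picks)
```

Because health is checked on every pick, a backend that recovers is returned to rotation without any reconfiguration.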
9. Amazon EC2 Regions and Availability Zones
[Diagram: two example Regions, US East (Northern Virginia) and EU (Dublin), each containing multiple Availability Zones (Zones A through D are shown)]
Amazon EC2 Regions:
US East (Northern Virginia) / US West (Northern California) /
EU (Ireland) / Asia Pacific (Singapore) / Asia Pacific (Tokyo)
10. Availability Zone Characteristics and Advice
Distinct physical locations
Low-latency network connections between AZs
Independent power, cooling, network, security
Always partition app stacks across 2 or more AZs
Use Elastic Load Balancing across instances in multiple AZs
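The advice to partition across two or more AZs can be sketched as a simple placement rule. The zone names, instance names, and helper functions below are hypothetical; the point is only that an even spread leaves capacity running when one zone is lost.

```python
from collections import Counter
from typing import Dict, List, Sequence


def spread_across_zones(instance_ids: Sequence[str],
                        zones: Sequence[str]) -> Dict[str, str]:
    """Assign instances to zones in round-robin order."""
    if len(zones) < 2:
        raise ValueError("use at least two Availability Zones")
    return {iid: zones[i % len(zones)] for i, iid in enumerate(instance_ids)}


def survivors_after_zone_loss(placement: Dict[str, str],
                              lost_zone: str) -> List[str]:
    """Instances still running if every instance in lost_zone disappears."""
    return [iid for iid, zone in placement.items() if zone != lost_zone]


placement = spread_across_zones(["web-1", "web-2", "web-3", "web-4"],
                                ["zone-a", "zone-b"])
print(Counter(placement.values()))
print(survivors_after_zone_loss(placement, "zone-a"))
```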
11. Proper Use of Multiple Availability Zones
[Diagram: incoming requests arrive at an Elastic Load Balancer, which sends requests and health checks to Availability Zone A and Availability Zone B. Each zone contains a web server, an app server, and a database server or RDS DB instance. Centralized services (S3 backups, SimpleDB, etc.) sit alongside both zones.]
12. Region Characteristics and Advice
Regions are:
Functionally separate
Composed of 2 or more AZs
Connected via the public internet
Use regions to:
Have functionality geographically close to customers
Comply with national laws and practices
Implement a DR strategy
13. RDS Fault-Tolerant Features
Multi-AZ Deployments
Synchronous replication across AZs
Automatic fail-over to standby replica
Automated Backups
Enables point-in-time recovery of the DB instance
Retention period configurable
Snapshots
User initiated full backup of DB
New DB can be created from snapshots
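To illustrate why synchronous replication allows fail-over without losing acknowledged writes, here is a toy model. It does not describe how RDS is implemented; `DbNode`, `ReplicatedPair`, and the node names are assumptions for the sketch, and partial-failure handling is deliberately left out.

```python
class DbNode:
    def __init__(self, name):
        self.name = name
        self.rows = []
        self.alive = True

    def apply(self, row):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.rows.append(row)


class ReplicatedPair:
    """Primary plus standby; a write returns only after both have applied it."""

    def __init__(self, primary, standby):
        self.primary = primary
        self.standby = standby

    def write(self, row):
        # Synchronous: the caller gets control back only after both copies exist.
        self.primary.apply(row)
        self.standby.apply(row)

    def fail_over(self):
        # Promote the standby; the old primary becomes the (possibly dead) standby.
        self.primary, self.standby = self.standby, self.primary


pair = ReplicatedPair(DbNode("db-zone-a"), DbNode("db-zone-b"))
pair.write("order-1")
pair.write("order-2")

pair.primary.alive = False  # simulate losing the primary's zone
pair.fail_over()
print(pair.primary.name, pair.primary.rows)
```

Every write that was acknowledged before the failure is present on the promoted node, which is the property the slide attributes to Multi-AZ deployments.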
15. Design For Failure – Basic Principles
Avoid single points of failure
Assume everything fails, and design backwards
Goal: Applications should continue to function even if the underlying physical hardware fails or is removed or replaced.
Design your recovery process
Trade off business needs vs. cost of high availability
16. Design For Failure – Use AWS Building Blocks
Use Elastic IP addresses for consistent and re-mappable routes
Use multiple Amazon EC2 Availability Zones (AZs)
Replicate data across multiple AZs
Example: Amazon RDS Multi-AZ mode
Use real-time monitoring (Amazon CloudWatch)
Use Amazon Elastic Block Store (EBS) for persistent file systems
Take EBS Snapshots and use S3 for backups
17. Build Loosely Coupled Systems
Use independent components
Design everything as a Black Box
Load-balance and scale clusters
Think about graceful degradation
Amazon SQS as Buffers
[Diagram: Tight coupling: Controller A, Controller B, and Controller C connected directly. Loose coupling using queues: Controller A, Controller B, and Controller C connected through queues (Q).]
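The buffering idea can be sketched with Python's standard `queue.Queue` standing in for Amazon SQS. The `produce` and `consume` helpers are hypothetical; the sketch only shows that work accumulates safely while the consumer is unavailable and is picked up later.

```python
import queue
from typing import Callable, Iterable, List


def produce(buffer: "queue.Queue[int]", items: Iterable[int]) -> None:
    """Upstream controller: enqueue work without calling the consumer directly."""
    for item in items:
        buffer.put(item)


def consume(buffer: "queue.Queue[int]", handler: Callable[[int], int],
            max_items: int) -> List[int]:
    """Downstream controller: process up to max_items, stopping if the queue is empty."""
    results = []
    for _ in range(max_items):
        try:
            item = buffer.get_nowait()
        except queue.Empty:
            break
        results.append(handler(item))
    return results


buffer: "queue.Queue[int]" = queue.Queue()  # stand-in for an SQS queue
produce(buffer, range(5))                   # consumer is "down"; work still accepted
backlog = buffer.qsize()

first_batch = consume(buffer, lambda x: x * 2, max_items=2)
second_batch = consume(buffer, lambda x: x * 2, max_items=10)
print(backlog, first_batch, second_batch)
```

The producer never needs to know whether a consumer is running, which is the decoupling the tight vs. loose comparison is about.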
18. Implement Elasticity
Don’t assume health or fixed location of components
Use designs that are resilient to reboot and re-launch
Bootstrap your instances: “Who am I and what is my role?”
Enable dynamic configuration
Use configurations in SimpleDB for bootstrapping
Use Auto Scaling
Use Elastic Load Balancing on each tier
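A minimal sketch of the bootstrapping idea: at start-up an instance asks a configuration store who it is and what role it should play. A plain dictionary stands in for a SimpleDB domain here; the instance IDs, attribute names, and `bootstrap` function are all hypothetical.

```python
from typing import Dict

# Stand-in for a SimpleDB domain keyed by instance ID (hypothetical data).
CONFIG_STORE: Dict[str, Dict[str, str]] = {
    "i-web-001": {"role": "web", "port": "80"},
    "i-app-001": {"role": "app", "port": "8080"},
}


def bootstrap(instance_id: str, store: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Answer 'Who am I and what is my role?' from the configuration store."""
    config = store.get(instance_id)
    if config is None:
        raise LookupError(f"no configuration found for {instance_id}")
    # Return a copy so callers cannot mutate the shared store by accident.
    return {"instance_id": instance_id, **config}


identity = bootstrap("i-app-001", CONFIG_STORE)
print(identity)
```

Because nothing about the role is baked into the image, the same AMI can be relaunched as any tier by changing only the stored configuration.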
19. Implementing Elasticity
[Diagram: Elastic Load Balancing, Auto Scaling, and CloudWatch, connected by utilization metrics]
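One way to picture how utilization metrics can drive scaling is a threshold rule like the one below. The thresholds, limits, and function name are assumptions chosen for illustration; they are not defaults of Auto Scaling or CloudWatch.

```python
def desired_capacity(current: int, avg_cpu_percent: float,
                     scale_out_above: float = 70.0,
                     scale_in_below: float = 30.0,
                     minimum: int = 2, maximum: int = 10,
                     step: int = 1) -> int:
    """Return the new instance count for an observed average CPU utilization."""
    if avg_cpu_percent > scale_out_above:
        target = current + step      # busy: add capacity
    elif avg_cpu_percent < scale_in_below:
        target = current - step      # idle: remove capacity
    else:
        target = current             # within the band: leave unchanged
    # Never go below the floor or above the ceiling.
    return max(minimum, min(maximum, target))


for cpu in (85.0, 50.0, 10.0):
    print(cpu, desired_capacity(current=4, avg_cpu_percent=cpu))
```

Keeping a minimum of two instances in the example matches the earlier advice to avoid single points of failure.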
20. Use a Chaos Monkey
From the Netflix blog:
Simple monkey:
Kill any instance in the account
Complex monkey:
Kill instances with specific tags
Introduce other faults (e.g. connectivity via Security Group)
Human monkey:
Kill instances from the AWS Management Console
http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html
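The simple and complex monkeys can be sketched against an in-memory fleet. This is not Netflix's tool and makes no AWS calls; the fleet records, tag names, and function names are hypothetical, and "terminating" an instance just flips a field.

```python
import random
from typing import List, Optional


def running(instances: List[dict]) -> List[dict]:
    return [inst for inst in instances if inst["state"] == "running"]


def simple_monkey(instances: List[dict], rng: random.Random) -> Optional[str]:
    """Terminate any one running instance, chosen at random."""
    candidates = running(instances)
    if not candidates:
        return None
    victim = rng.choice(candidates)
    victim["state"] = "terminated"
    return victim["id"]


def complex_monkey(instances: List[dict], tag_key: str, tag_value: str,
                   rng: random.Random) -> Optional[str]:
    """Terminate one running instance that carries the given tag."""
    tagged = [inst for inst in running(instances)
              if inst["tags"].get(tag_key) == tag_value]
    return simple_monkey(tagged, rng)


fleet = [
    {"id": "i-1", "tags": {"tier": "web"}, "state": "running"},
    {"id": "i-2", "tags": {"tier": "web"}, "state": "running"},
    {"id": "i-3", "tags": {"tier": "db"}, "state": "running"},
]
rng = random.Random(0)  # seeded so a run can be repeated
killed = complex_monkey(fleet, "tier", "web", rng)
print(killed, [inst["state"] for inst in fleet])
```

Running something like this against a test environment is a cheap way to check that the recovery paths described in the earlier slides actually work.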
21. AWS Architecture Center
aws.amazon.com/architecture
White papers:
Cloud architectures
Building fault-tolerant applications
Web hosting best practices
Leveraging different storage options
AWS security best practices