Traditional computing models, like client-server applications, have given rise to best practices for software developers as well as for the IT professionals who deploy and support them. As these applications grew to support more features and users for mission-critical workloads, their complexity and cost grew exponentially. Cloud computing offers a paradigm shift that lets applications achieve greater availability and scalability at lower cost. This presentation highlights the key principles for building highly available cloud applications (SaaS) and delves deeper into the Design For Failure principle, which is the key to success in the cloud.
8. • Design for Failure
• Design for Redundancy
• Monitor Extensively
• Track Dependencies
Guiding Principles
9. • Assume nothing
• Expect failures
• Anywhere and everywhere
• If it is available now, that doesn’t mean it will be there later
• Failures cascade
• Unhandled failures propagate
• Poorly handled failures add complexity
• Difficulty increases exponentially with complexity
• Embrace failure, make it a first-class citizen
Design For Failure
10. • Leaving failures unhandled is a very bad idea
• A poorly handled trivial failure in one part becomes a critical one somewhere else
• Two types of failures: Transient and Resource
• Transient failures are difficult to handle; treat them like Resource failures and fail fast
• Delays are transient failures; define response time guarantees
• Failure injection is a lifestyle
Handle All Failures
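The points above about failing fast, response time guarantees, and failure injection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `call_with_deadline`, `DeadlineExceeded`, and `inject_failures` are hypothetical names invented for this sketch, not part of the presentation or of any library.

```python
import concurrent.futures
import random
import time


class DeadlineExceeded(Exception):
    """Raised when a call does not answer within its response-time budget."""


def call_with_deadline(func, deadline_seconds, *args, **kwargs):
    """Run func and fail fast if it exceeds deadline_seconds.

    A delay is treated like any other failure: the caller gets an
    exception instead of waiting indefinitely.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=deadline_seconds)
    except concurrent.futures.TimeoutError:
        raise DeadlineExceeded(
            f"no response within {deadline_seconds}s"
        ) from None
    finally:
        # Do not block on the slow worker; the caller has already moved on.
        executor.shutdown(wait=False)


def inject_failures(func, failure_rate, rng):
    """Wrap func so that a fraction of calls raise an injected error.

    failure_rate is a probability in [0, 1]; rng is a random.Random
    instance so that test runs can be made repeatable with a seed.
    """
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise ConnectionError("injected failure")
        return func(*args, **kwargs)
    return wrapper
```

Wrapping a dependency with `inject_failures(fetch, 0.1, random.Random(42))` in a test environment (where `fetch` stands for any callable of your own) exercises the failure-handling paths regularly instead of only during real outages.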
11. • Eliminate single points of failure
• Architect distributed applications
• Minimize duration of statefulness
Design For Redundancy
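One way to picture eliminating a single point of failure is a client that can be served by any of several interchangeable replicas. The sketch below assumes each replica is a stateless callable that takes a request; `call_any_replica` and `AllReplicasFailed` are hypothetical names used only for illustration.

```python
class AllReplicasFailed(Exception):
    """Raised when every replica failed; carries the individual errors."""

    def __init__(self, errors):
        super().__init__(f"all {len(errors)} replicas failed")
        self.errors = errors


def call_any_replica(replicas, request):
    """Send the request to each replica in turn until one succeeds.

    Because no replica holds state between requests, any of them can
    serve the request and a single failed replica is not fatal.
    """
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:
            # Remember the failure and fall through to the next replica.
            errors.append(exc)
    raise AllReplicasFailed(errors)
```

The less state a component keeps between requests, the simpler this kind of failover becomes, which is the motivation for minimizing the duration of statefulness.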
12. • Self assess and report health
• Complementary external monitoring
• Load and latency monitoring
• Proactively restart components
Monitor Extensively
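A component that assesses and reports its own health might look roughly like the following. The class name `HealthTracker`, the thresholds, and the shape of the report are assumptions made for this sketch; a real system would choose its own metrics and expose the report to an external monitor.

```python
import time
from collections import deque


class HealthTracker:
    """Keeps a sliding window of call latencies and failures."""

    def __init__(self, latency_budget_seconds, max_error_rate=0.1, window=100):
        self.latency_budget_seconds = latency_budget_seconds
        self.max_error_rate = max_error_rate
        # Each sample is a (latency_seconds, failed) pair.
        self._samples = deque(maxlen=window)

    def record(self, latency_seconds, failed=False):
        self._samples.append((latency_seconds, failed))

    def observe(self, func, *args, **kwargs):
        """Call func, recording how long it took and whether it raised."""
        start = time.monotonic()
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.record(time.monotonic() - start, failed=True)
            raise
        self.record(time.monotonic() - start)
        return result

    def report(self):
        """Return a self-assessment based on the recent window."""
        calls = len(self._samples)
        if calls == 0:
            return {"status": "unknown", "calls": 0}
        mean_latency = sum(s[0] for s in self._samples) / calls
        error_rate = sum(1 for s in self._samples if s[1]) / calls
        healthy = (
            mean_latency <= self.latency_budget_seconds
            and error_rate <= self.max_error_rate
        )
        return {
            "status": "healthy" if healthy else "degraded",
            "calls": calls,
            "mean_latency_seconds": mean_latency,
            "error_rate": error_rate,
        }
```

A supervisor could poll `report()` and proactively restart a component whose status stays "degraded", while an external monitor covers the case where the component cannot report at all.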
13. • Identify all dependencies
• Hardware, third-party libraries, other servers, network
• Infrastructure/Platform services, External services
• Your own components
• Track their health and availability
Track Dependencies
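Tracking dependencies can start with something as small as a registry that names each dependency and knows how to check it. The sketch below is a minimal illustration; `DependencyRegistry` and the convention that a check is a zero-argument callable returning a truthy value are assumptions, not something the presentation prescribes.

```python
class DependencyRegistry:
    """Maps dependency names to zero-argument health-check callables."""

    def __init__(self):
        self._checks = {}

    def register(self, name, check):
        self._checks[name] = check

    def check_all(self):
        """Run every check and return {name: True/False}.

        A check that raises is reported as unavailable rather than
        allowed to crash the caller.
        """
        status = {}
        for name, check in self._checks.items():
            try:
                status[name] = bool(check())
            except Exception:
                status[name] = False
        return status

    def unavailable(self):
        """Return the sorted names of dependencies that failed their check."""
        return sorted(
            name for name, ok in self.check_all().items() if not ok
        )
```

Registering your own components alongside external services keeps the whole list in one place, so a single call shows which dependency is behind a degraded application.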
14. • If there’s only one thing you can do
• Design for Failure
• It is a paradigm shift
• It is a cultural change
• It is not easy
• It is the key to success in the cloud
Key Takeaways