Design For Failure Is Key
To Success In The Cloud
Ashay Chaudhary
REQUIREMENTS
Journey through the computing models
• Mainframe         •   Reliability
• Desktop           •   Availability
• Client-Server     •   Serviceability
                    •   Performance
                           +
• Internet          • Security
                           +
• Cloud Computing   • Agility



Evolution of Requirements
AVAILABILITY
Non-Cloud Model
• Design for Non-Failure
• Deploy with Redundancy
• Manage Effectively




Guiding Principles
• Design for Non-Failure
  • Quality Hardware
• Deploy with Redundancy
  • Specialty Hardware
• Manage Effectively
  • Expert Staff
  • Processes
AVAILABILITY
Cloud Model
•   Design for Failure
•   Design for Redundancy
•   Monitor Extensively
•   Track Dependencies




Guiding Principles
• Assume nothing
• Expect failures
  • Anywhere and everywhere
  • Just because it is available now doesn't mean it will be there later

• Failures cascade
  • Unhandled failures propagate
  • Poorly handled failures add complexity
  • Difficulty increases exponentially with complexity


• Embrace failure, make it a first-class citizen

Design For Failure
• Leaving failures unhandled is a very bad idea
• A poorly handled trivial failure in one part
  becomes a critical one somewhere else
• Two types of failures: Transient and Resource
  • Transient failures are difficult to handle; treat them
    like Resource failures and fail fast
  • Delays are transient failures; define response
    time guarantees
• Failure injection is a lifestyle


Handle All Failures
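
The "fail fast" and "response time guarantees" bullets can be pictured with a minimal Python sketch. The names `call_with_deadline` and `DependencyFailure` are hypothetical and do not come from the slides, and enforcing the deadline with a thread pool is only one assumed approach. The point it illustrates is that a slow answer is reported in the same way as a failed one.

```python
import concurrent.futures
import time


class DependencyFailure(Exception):
    """Hypothetical error type: the call failed or missed its deadline."""


# One shared pool. A pool created per call would block on shutdown
# until the slow call finished, which defeats failing fast.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call_with_deadline(func, deadline_seconds, *args):
    """Run func(*args); raise DependencyFailure on error or on a missed deadline."""
    future = _POOL.submit(func, *args)
    try:
        return future.result(timeout=deadline_seconds)
    except concurrent.futures.TimeoutError as exc:
        # A delay is treated exactly like any other failure.
        # The slow call is abandoned here, not cancelled.
        raise DependencyFailure(f"no response within {deadline_seconds}s") from exc
    except Exception as exc:
        raise DependencyFailure(f"call failed: {exc}") from exc


if __name__ == "__main__":
    def slow_lookup():
        time.sleep(0.3)  # stands in for a dependency that responds too late
        return "late answer"

    try:
        call_with_deadline(slow_lookup, 0.05)
    except DependencyFailure as failure:
        print("failed fast:", failure)
```

The caller sees a single failure type and can decide immediately what to do next, rather than waiting an unbounded time on a dependency that may never answer.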
• Eliminate single points of failure
• Architect distributed applications
• Minimize duration of statefulness




Design For Redundancy
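
As a rough illustration of removing a single point of failure, the sketch below tries a list of redundant replicas in turn. The names `call_with_failover` and `AllReplicasFailed` are hypothetical, and modelling a replica as a plain callable is an assumption made to keep the example self-contained. Because each request carries everything the replica needs, any replica can serve it, which is one reading of "minimize duration of statefulness".

```python
class AllReplicasFailed(Exception):
    """Hypothetical error type raised when no replica could serve the request."""

    def __init__(self, errors):
        super().__init__(f"all {len(errors)} replicas failed")
        self.errors = errors


def call_with_failover(replicas, request):
    """Try each redundant replica in order; fail only if every one fails."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:
            # Record the failure and move on to the next replica.
            errors.append(exc)
    raise AllReplicasFailed(errors)


if __name__ == "__main__":
    def replica_a(request):
        raise ConnectionError("replica A is down")

    def replica_b(request):
        return f"handled {request}"

    print(call_with_failover([replica_a, replica_b], "order-17"))
```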
•   Self-assess and report health
•   Complementary external monitoring
•   Load and latency monitoring
•   Proactively restart components




Monitor Extensively
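
One possible shape for a component that self-assesses and reports health is sketched below. The class name `HealthReporter`, its thresholds, and the report fields are all hypothetical choices for illustration, not something the slides prescribe. An external monitor or supervisor could poll `report()` and decide to restart the component when it turns unhealthy.

```python
from collections import deque


class HealthReporter:
    """Hypothetical helper that tracks recent latency and errors for one component."""

    def __init__(self, max_latency_seconds=0.5, max_error_rate=0.1, window=100):
        self.max_latency_seconds = max_latency_seconds
        self.max_error_rate = max_error_rate
        # Keep only the most recent observations as (latency, ok) pairs.
        self.samples = deque(maxlen=window)

    def record(self, latency_seconds, ok=True):
        """Record the outcome of one handled request."""
        self.samples.append((latency_seconds, ok))

    def report(self):
        """Return a self-assessment based on the recent window."""
        count = len(self.samples)
        if count == 0:
            avg_latency = 0.0
            error_rate = 0.0
        else:
            avg_latency = sum(latency for latency, _ in self.samples) / count
            error_rate = sum(1 for _, ok in self.samples if not ok) / count
        healthy = (avg_latency <= self.max_latency_seconds
                   and error_rate <= self.max_error_rate)
        return {
            "healthy": healthy,
            "avg_latency_seconds": avg_latency,
            "error_rate": error_rate,
            "samples": count,
        }


if __name__ == "__main__":
    reporter = HealthReporter()
    reporter.record(0.12)
    reporter.record(0.90, ok=False)
    print(reporter.report())
```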
• Identify all dependencies
  • Hardware, 3rd Party Libraries, Other servers, Network
  • Infrastructure/Platform services, External services
  • Your own components
• Track their health and availability




Track Dependencies
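
Tracking dependency health could look roughly like the sketch below: every dependency gets a named check, and one function runs them all and records which are up. The function name `check_dependencies` and the convention that a check raises an exception on failure are assumptions made for this example.

```python
def check_dependencies(checks):
    """Run every registered check and return {name: (is_up, detail)}.

    `checks` maps a dependency name to a zero-argument callable that
    raises an exception when the dependency is unavailable.
    """
    status = {}
    for name, check in checks.items():
        try:
            check()
            status[name] = (True, "ok")
        except Exception as exc:
            # A failing check must not stop the remaining checks from running.
            status[name] = (False, str(exc))
    return status


if __name__ == "__main__":
    def database_check():
        return None  # stands in for a real connectivity probe

    def payment_api_check():
        raise ConnectionError("connection refused")

    results = check_dependencies({
        "database": database_check,
        "payment-api": payment_api_check,
    })
    for name, (is_up, detail) in results.items():
        print(name, "UP" if is_up else "DOWN", detail)
```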
• If there's only one thing you can do
  • Design for Failure


• It is a paradigm shift
• It is a cultural change
• It is not easy



• It is the key to success in the cloud


Key Takeaways
Ashay Chaudhary
Cloud Consultant
  Corporate Education
  Private Cloud Solutions
  Highly Scalable SaaS Applications
  SaaS Business Intelligence & Analytics




ashay@kloudpros.com
@ashay_c
