SlideShare une entreprise Scribd logo
1  sur  58
Build Confidence in your
System with Chaos
Engineering
Try, Learn, Adapt
2017 - Sylvain Hellegouarch - http://chaosiq.io CHAOSIQ
Where we started...
To build this
What we tend to actually build
The environment matters... not?
Well, yes it does!
But, I’m prepared!
Even prepared, things may go wrong...
Kudos for
the report!
Everything changes and nothing stands still
Heraclitus - 402BC
Non-functional is a force towards change
It’s fine, we
have
microservices
Adapt
Things are fine… so far
As soon as you are launched you are at
the mercy of your environment
Environment is full of unknowns
"It's just one of these cases where Mars is going to give us
a new deal, and we're going to have to play the cards we
get, not the ones we want”
Jim Erickson / Project Manager at Nasa for Mars Rovers missions
Strategies?
Robustness
Resist to change
Redundancy
Duplicate Components
Adaptation
Dynamic Response to Environment
One might rephrase “calculation and
correction of error”
One might rephrase “calculation and
correction of error” as “recognition of
and response to difference”.Jeff Sussna / Designing Delivery: Rethinking IT in the Digital Service Economy
Your System is dynamic and fluid
Recognition
Environment is not stable
Recognition
Users do not care
Recognition
Accept you do not control everything
Recognition
Double-learning loop
Response
Be creative when you explore but seek
evidence
Response
Learn
What to learn?
Look for deviation from normal
Context sensitive
Strategies to learn
It’s not about breaking stuff you fool!
Open Ended Questions
Chaos Engineering
Probe your system for data
Chaos Engineering
Chaos Engineering is about trying
controlled change to observe system
availability deviation
Game Days
Planned experiments of realistic events
Collective Effort
Shared Understanding
Communication is key
No surprises
Generate a Playbook
Automation helps confidence
Observability is Key
Collect Metrics
Never enough logging
Be careful when interpreting data
Be Creative
You can't always get what you want
But if you try sometimes you might find
The Rolling Stones
Try
chaostoolkit.org
Chaos as a Code
What normal looks like? Your steady state
{
"probes": {
"steady": {
"title": "All services must be healthy before we begin",
"layer": "application",
"type": "python",
"module": "chaosk8s.probes",
"func": "all_microservices_healthy"
}
}
}
Add sources of information with probes
"probes": {
"close": {
"title": "Fetch the CPU usage for our service",
"layer": "application",
"type": "python",
"module": "chaosprometheus.probes",
"func": "query",
"arguments": {
"query": "process_cpu_seconds_total{job='websvc'}",
"when": "2 minutes ago"
}
}
}
Set the condition for change in normality
"action": {
"title": "Let's max out the CPU of a node",
"layer": "application",
"type": "python",
"module": "chaosgremlin.actions",
"func": "attack",
"secrets": "gremlin",
"arguments": {
"command": {
"type": "cpu"
},
"target": {
"type": "Random"
}
}
}
Before learning
$ chaos run experiment.json
[2017-10-06 17:37:33 INFO] Running experiment: System is resilient to provider's failures
[2017-10-06 17:37:33 INFO] Observing steady state: All services must be healthy before we begin
[2017-10-06 17:37:33 INFO] Steady State succeeded
[2017-10-06 17:37:33 INFO] Observing steady state: Before we kill it, our microservice should be alive
[2017-10-06 17:37:33 INFO] Steady State succeeded
[2017-10-06 17:37:33 INFO] Observing action: Let's stop our provider
[2017-10-06 17:37:33 INFO] Action succeeded
[2017-10-06 17:37:33 INFO] Observing close state: All services must be healthy before we begin
[2017-10-06 17:37:33 INFO] Close State succeeded
[2017-10-06 17:37:33 INFO] Observing steady state: Consumer should respond as if nothing
[2017-10-06 17:37:44 ERROR] Steady State failed: {"timestamp":1507304264100,"status":500,"error":"Internal
Server Error","exception":"feign.RetryableException","message":"connect timed out executing GET http://my-
provider-service:8080/","path":"/invokeConsumedService"}
[2017-10-06 17:37:44 INFO] Experiment is now complete
Respond to the non-functional force of
change
Do not merely correct the error
Adaptation
$ chaos run experiment.json
[2017-10-06 17:40:25 INFO] Running experiment: System is resilient to provider's failures
[2017-10-06 17:40:25 INFO] Observing steady state: All services must be healthy before we begin
[2017-10-06 17:40:25 INFO] Steady State succeeded
[2017-10-06 17:40:25 INFO] Observing steady state: Before we kill it, our microservice should be alive
[2017-10-06 17:40:26 INFO] Steady State succeeded
[2017-10-06 17:40:26 INFO] Observing action: Let's stop our provider
[2017-10-06 17:40:26 INFO] Action succeeded
[2017-10-06 17:40:26 INFO] Observing close state: All services must be healthy before we begin
[2017-10-06 17:40:26 INFO] Close State succeeded
[2017-10-06 17:40:26 INFO] Observing steady state: Consumer should respond as if nothing
[2017-10-06 17:40:30 INFO] Steady State succeeded
[2017-10-06 17:40:30 INFO] Experiment is now complete
CHAOSIQ
Chaos Engineering for Cloud Native
SaaS & On-Premises
Thank You
Sylvain Hellegouarch
CTO ChaosIQ
@lawouach / sylvain@chaosiq.io

Contenu connexe

Tendances

Tendances (11)

Craft 2019 - Security Chaos Engineering - Security Precognition
Craft 2019 - Security Chaos Engineering - Security PrecognitionCraft 2019 - Security Chaos Engineering - Security Precognition
Craft 2019 - Security Chaos Engineering - Security Precognition
 
Infrastructure automation-in-the-cloud-130613045624-phpapp02
Infrastructure automation-in-the-cloud-130613045624-phpapp02Infrastructure automation-in-the-cloud-130613045624-phpapp02
Infrastructure automation-in-the-cloud-130613045624-phpapp02
 
Chaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWSChaos engineering & Gameday on AWS
Chaos engineering & Gameday on AWS
 
What DevOps Isn't
What DevOps Isn'tWhat DevOps Isn't
What DevOps Isn't
 
GameDay - Achieving resilience through Chaos Engineering
GameDay - Achieving resilience through Chaos EngineeringGameDay - Achieving resilience through Chaos Engineering
GameDay - Achieving resilience through Chaos Engineering
 
DevOpsDays PGH: How to Fail With One Weird Trick
DevOpsDays PGH:  How to Fail With One Weird TrickDevOpsDays PGH:  How to Fail With One Weird Trick
DevOpsDays PGH: How to Fail With One Weird Trick
 
AllDayDevOps : DevSecOps & Chaos Engineering: Knowing the Unknown
AllDayDevOps : DevSecOps & Chaos Engineering: Knowing the UnknownAllDayDevOps : DevSecOps & Chaos Engineering: Knowing the Unknown
AllDayDevOps : DevSecOps & Chaos Engineering: Knowing the Unknown
 
Adventures in a Microservice world at REA Group
Adventures in a Microservice world at REA GroupAdventures in a Microservice world at REA Group
Adventures in a Microservice world at REA Group
 
Why We Can't Have Nice Things, A Tale of Woe and a Hope For the Future
Why We Can't Have Nice Things, A Tale of Woe and a Hope For the FutureWhy We Can't Have Nice Things, A Tale of Woe and a Hope For the Future
Why We Can't Have Nice Things, A Tale of Woe and a Hope For the Future
 
Business Continuity for Humans: Keeping Your Business Running When Your Peopl...
Business Continuity for Humans: Keeping Your Business Running When Your Peopl...Business Continuity for Humans: Keeping Your Business Running When Your Peopl...
Business Continuity for Humans: Keeping Your Business Running When Your Peopl...
 
Everyone has a plan until... Automacon16
Everyone has a plan until...  Automacon16Everyone has a plan until...  Automacon16
Everyone has a plan until... Automacon16
 

Similaire à muCon 2017 - Build Confidence in your System with Chaos Engineering

Dr Elephant: LinkedIn's Self-Service System for Detecting and Treating Hadoop...
Dr Elephant: LinkedIn's Self-Service System for Detecting and Treating Hadoop...Dr Elephant: LinkedIn's Self-Service System for Detecting and Treating Hadoop...
Dr Elephant: LinkedIn's Self-Service System for Detecting and Treating Hadoop...
DataWorks Summit
 

Similaire à muCon 2017 - Build Confidence in your System with Chaos Engineering (20)

PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning" PyData 2015 Keynote: "A Systems View of Machine Learning"
PyData 2015 Keynote: "A Systems View of Machine Learning"
 
Deploy and Destroy: Testing Environments - Michael Arenzon - DevOpsDays Tel A...
Deploy and Destroy: Testing Environments - Michael Arenzon - DevOpsDays Tel A...Deploy and Destroy: Testing Environments - Michael Arenzon - DevOpsDays Tel A...
Deploy and Destroy: Testing Environments - Michael Arenzon - DevOpsDays Tel A...
 
The servicescore card - Gamifying Operational Excellence - SRECON
The servicescore card - Gamifying Operational Excellence - SRECONThe servicescore card - Gamifying Operational Excellence - SRECON
The servicescore card - Gamifying Operational Excellence - SRECON
 
Albert Witteveen - With Cloud Computing Who Needs Performance Testing
Albert Witteveen - With Cloud Computing Who Needs Performance TestingAlbert Witteveen - With Cloud Computing Who Needs Performance Testing
Albert Witteveen - With Cloud Computing Who Needs Performance Testing
 
Practical Chaos Engineering
Practical Chaos EngineeringPractical Chaos Engineering
Practical Chaos Engineering
 
Continuous Delivery: The Dirty Details
Continuous Delivery: The Dirty DetailsContinuous Delivery: The Dirty Details
Continuous Delivery: The Dirty Details
 
Agile Data Science 2.0: Using Spark with MongoDB
Agile Data Science 2.0: Using Spark with MongoDBAgile Data Science 2.0: Using Spark with MongoDB
Agile Data Science 2.0: Using Spark with MongoDB
 
Dr Elephant: LinkedIn's Self-Service System for Detecting and Treating Hadoop...
Dr Elephant: LinkedIn's Self-Service System for Detecting and Treating Hadoop...Dr Elephant: LinkedIn's Self-Service System for Detecting and Treating Hadoop...
Dr Elephant: LinkedIn's Self-Service System for Detecting and Treating Hadoop...
 
AI Development with H2O.ai
AI Development with H2O.aiAI Development with H2O.ai
AI Development with H2O.ai
 
Building Reactive Applications With Akka And Java
Building Reactive Applications With Akka And JavaBuilding Reactive Applications With Akka And Java
Building Reactive Applications With Akka And Java
 
Engineering Velocity @indeed eng presented on Sept 24 2014 at Beyond Agile
Engineering Velocity @indeed eng presented on Sept 24 2014 at Beyond AgileEngineering Velocity @indeed eng presented on Sept 24 2014 at Beyond Agile
Engineering Velocity @indeed eng presented on Sept 24 2014 at Beyond Agile
 
From Duke of DevOps to Queen of Chaos - Api days 2018
From Duke of DevOps to Queen of Chaos - Api days 2018From Duke of DevOps to Queen of Chaos - Api days 2018
From Duke of DevOps to Queen of Chaos - Api days 2018
 
Agile Data Science 2.0
Agile Data Science 2.0Agile Data Science 2.0
Agile Data Science 2.0
 
From Monoliths to Microservices at Realestate.com.au
From Monoliths to Microservices at Realestate.com.auFrom Monoliths to Microservices at Realestate.com.au
From Monoliths to Microservices at Realestate.com.au
 
DOES14 - Scott Prugh - CSG - DevOps and Lean in Legacy Environments
DOES14 - Scott Prugh - CSG - DevOps and Lean in Legacy EnvironmentsDOES14 - Scott Prugh - CSG - DevOps and Lean in Legacy Environments
DOES14 - Scott Prugh - CSG - DevOps and Lean in Legacy Environments
 
WinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release PipelinesWinOps Conf 2016 - Michael Greene - Release Pipelines
WinOps Conf 2016 - Michael Greene - Release Pipelines
 
Application Metrics - IPC2023
Application Metrics - IPC2023Application Metrics - IPC2023
Application Metrics - IPC2023
 
White Paper On ConCurrency For PCMS Application Architecture
White Paper On ConCurrency For PCMS Application ArchitectureWhite Paper On ConCurrency For PCMS Application Architecture
White Paper On ConCurrency For PCMS Application Architecture
 
Creating Fault Tolerant Services on Mesos
Creating Fault Tolerant Services on MesosCreating Fault Tolerant Services on Mesos
Creating Fault Tolerant Services on Mesos
 
DOES14: Scott Prugh, CSG - DevOps and Lean in Legacy Environments
DOES14: Scott Prugh, CSG - DevOps and Lean in Legacy EnvironmentsDOES14: Scott Prugh, CSG - DevOps and Lean in Legacy Environments
DOES14: Scott Prugh, CSG - DevOps and Lean in Legacy Environments
 

Dernier

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
masabamasaba
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 

Dernier (20)

%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security Program
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 

muCon 2017 - Build Confidence in your System with Chaos Engineering