SRE Demystified
Eliminate Toil
ganesh@ganeshniyer.com
ganesh.vigneswara@gmail.com
http://ganeshniyer.com
Dr Ganesh Neelakanta Iyer
SRE
https://image.slidesharecdn.com/devopssreatgooglescale-190121123035/95/devops-sre-at-google-scale-30-638.jpg?cb=1548074257
Toil
• Toil is the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows
https://landing.google.com/sre/workbook/chapters/eliminating-toil/
What is NOT toil?
• Toil is not just "work I don't like to do"
• It is also not simply equivalent to administrative chores or grungy work
• Some administrative chores have to get done but should not be categorized as toil: this is overhead
• Overhead includes tasks like team meetings, setting goals, and HR paperwork
• Cleaning up the entire alerting configuration for your service and removing clutter may be grungy, but it is not toil
https://landing.google.com/sre/workbook/chapters/eliminating-toil/
Toil Defined
• Manual: manually running a script (time spent running the script)
• Repetitive: toil is work you do over and over
• Automatable: a machine could accomplish the task just as well as a human
• Tactical: handling pager alerts
• No enduring value: if your service remains in the same state after you have finished a task, the task was probably toil
• O(n) with service growth: if the work involved in a task scales up linearly with service size, traffic volume, or user count, that task is probably toil
https://landing.google.com/sre/workbook/chapters/eliminating-toil/
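The six characteristics above can be read as a checklist. The following is a minimal sketch of that idea, not something taken from the SRE workbook: the `ToilChecklist` class, the `toil_score` helper, and the threshold of 4 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class ToilChecklist:
    """One yes/no answer per toil characteristic (hypothetical structure)."""
    manual: bool
    repetitive: bool
    automatable: bool
    tactical: bool
    no_enduring_value: bool
    scales_linearly: bool


def toil_score(task: ToilChecklist) -> int:
    """Count how many of the six characteristics the task matches."""
    return sum(getattr(task, f.name) for f in fields(task))


def is_probably_toil(task: ToilChecklist, threshold: int = 4) -> bool:
    """Flag a task as likely toil; the threshold is an assumed cut-off."""
    return toil_score(task) >= threshold


# A quota request handled by hand matches every characteristic.
quota_request = ToilChecklist(True, True, True, True, True, True)

# A one-off alerting cleanup is manual and grungy, but it leaves the
# service in a better state and does not recur.
alert_cleanup = ToilChecklist(
    manual=True,
    repetitive=False,
    automatable=False,
    tactical=False,
    no_enduring_value=False,
    scales_linearly=False,
)

print(is_probably_toil(quota_request), is_probably_toil(alert_cleanup))
```

A team would likely tune the threshold, or weight some characteristics more heavily, to match its own judgment of what counts as toil.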
Examples
• Handling quota requests
• Applying database schema changes
• Reviewing non-critical monitoring alerts
• Copying and pasting commands from a playbook
https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
https://www.rundeck.com/blog/sre-anti-pattern-known-workaround-bug-closed
Measuring the impact of the work
• What type of work was it (quota changes, push release to production, ACL update, etc.)?
• What was the degree of difficulty, based on human hands-on time rather than elapsed time: Easy (<1 hour), Medium (hours), or Hard (days)?
• Who did the work?
https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
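The three questions above map naturally onto a small record per piece of work. This is a sketch under stated assumptions: the `WorkItem` and `Difficulty` names are hypothetical, and the 8-hour boundary between Medium and Hard is an assumed stand-in for "one working day", since the source only says "hours" and "days".

```python
from dataclasses import dataclass
from enum import Enum


class Difficulty(Enum):
    EASY = "easy"      # less than 1 hour of hands-on time
    MEDIUM = "medium"  # hours of hands-on time
    HARD = "hard"      # days of hands-on time


@dataclass(frozen=True)
class WorkItem:
    work_type: str          # e.g. "quota change", "ACL update"
    difficulty: Difficulty  # based on hands-on time, not elapsed time
    done_by: str            # who did the work


def classify_difficulty(hands_on_hours: float) -> Difficulty:
    """Bucket hands-on time; the 8-hour cut-off is an assumption."""
    if hands_on_hours < 1:
        return Difficulty.EASY
    if hands_on_hours < 8:
        return Difficulty.MEDIUM
    return Difficulty.HARD


item = WorkItem(
    work_type="quota change",
    difficulty=classify_difficulty(0.5),
    done_by="alice",  # hypothetical team member
)
print(item)
```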
Identifying toil: Survey the team
• Averaging over the past four weeks, approximately what fraction of your time did you spend on toil?
  • Scale 0-100%
• How happy are you with the quantity of time you spend on toil?
  • Not happy / OK / No problem at all
• What are your top three sources of toil?
  • On-call Response / Interrupts / Pushes / Capacity / Other / etc.
• Do you have a long-term engineering project in your quarterly objectives?
  • Yes / No
• If so, averaging over the past four weeks, approximately what fraction of your time did you spend on your engineering project? (estimate)
  • Scale 0-100%
• In your team, is there toil you can automate away but you don't do so, because that very toil takes time away from long-term engineering work? If so, please describe below.
  • Open response
https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
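Once the survey responses are collected, a team might summarize them roughly as follows. This is an illustrative sketch only: `SurveyResponse` and `summarize` are hypothetical names, and only two of the survey questions (toil fraction and top sources) are modeled.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class SurveyResponse:
    toil_percent: float     # answer on the 0-100% scale
    top_sources: list[str]  # up to three sources of toil


def summarize(responses: list[SurveyResponse]) -> tuple[float, list[tuple[str, int]]]:
    """Return the team's average toil percentage and its most cited sources."""
    average = mean(r.toil_percent for r in responses)
    sources = Counter(s for r in responses for s in r.top_sources)
    return average, sources.most_common(3)


# Made-up responses for illustration.
responses = [
    SurveyResponse(40, ["On-call Response", "Interrupts"]),
    SurveyResponse(60, ["Interrupts", "Pushes"]),
]
average, top = summarize(responses)
print(average, top)
```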
Measuring Toil
• Regularly compute an estimate of how much time is being spent on various types of work
• Look for patterns or trends in your tickets, surveys, and on-call incident response, and prioritize based on the aggregate human time spent
https://www.rundeck.com/blog/sre-anti-pattern-known-workaround-bug-closed
https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
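Prioritizing by aggregate human time can be sketched as a simple group-and-sort. The function name `hours_by_type` and the sample entries are hypothetical; in practice the input would come from whatever ticket or time-tracking data the team already has.

```python
from collections import defaultdict


def hours_by_type(entries: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Sum hands-on hours per work type, largest total first."""
    totals: dict[str, float] = defaultdict(float)
    for work_type, hours in entries:
        totals[work_type] += hours
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Made-up (work type, hands-on hours) entries for illustration.
entries = [
    ("quota change", 0.5),
    ("release push", 3.0),
    ("quota change", 0.5),
    ("ACL update", 2.0),
]
ranking = hours_by_type(entries)
print(ranking)
```

The work type at the top of the ranking is the first candidate for automation, since it consumes the most human time in aggregate.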
Eliminating Toil
• Treat your automation like any other production system
• If you have an SLO practice, use some of your error budget to automate away toil
• Complete postmortems when your automation fails, and fix it as you would any user-facing system
• You want your automation available to you in any situation, including production incidents, to free humans to do the work they're good at
https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles

SRE Demystified - 05 - Toil Elimination

  • 3. Toil
      • Toil is the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows
      https://landing.google.com/sre/workbook/chapters/eliminating-toil/
  • 4. What is NOT toil?
      • Toil is not just “work I don’t like to do.”
      • It’s also not simply equivalent to administrative chores or grungy work
      • There are also administrative chores that have to get done, but should not be categorized as toil: this is overhead
      • It includes tasks like team meetings, setting goals and HR paperwork
      • Cleaning up the entire alerting configuration for your service and removing clutter may be grungy, but it’s not toil
      https://landing.google.com/sre/workbook/chapters/eliminating-toil/
  • 5. Toil Defined
      • Manual: Manually running a script (time spent running the script)
      • Repetitive: Toil is work you do over and over
      • Automatable: If a machine could accomplish the task just as well as a human
      • Tactical: Handling pager alerts
      • No enduring value: If your service remains in the same state after you have finished a task, the task was probably toil.
      • O(n) with service growth: If the work involved in a task scales up linearly with service size, traffic volume, or user count, that task is probably toil.
      https://landing.google.com/sre/workbook/chapters/eliminating-toil/
  • 6. Examples
      • Handling quota requests
      • Applying database schema changes
      • Reviewing non-critical monitoring alerts
      • Copying and pasting commands from a playbook
      https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
      https://www.rundeck.com/blog/sre-anti-pattern-known-workaround-bug-closed
  • 7. Measuring the impact of the work
      • What type of work was it (quota changes, push release to production, ACL update, etc.)?
      • What was the degree of difficulty, based on human hands-on time, not elapsed time: Easy (<1 hour); Medium (hours); Hard (days)?
      • Who did the work?
      https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
  • 8. Identifying toil: Survey the team
      • Averaging over the past four weeks, approximately what fraction of your time did you spend on toil?
          • Scale 0-100%
      • How happy are you with the quantity of time you spend on toil?
          • Not happy / OK / No problem at all
      • What are your top three sources of toil?
          • On-call Response / Interrupts / Pushes / Capacity / Other / etc.
      • Do you have a long-term engineering project in your quarterly objectives?
          • Yes / No
      • If so, averaging over the past four weeks, approximately what fraction of your time did you spend on your engineering project? (estimate)
          • Scale 0-100%
      • In your team, is there toil you can automate away but you don’t do so, because that very toil takes time away from long-term engineering work? If so, please describe below.
          • Open response
      https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
  • 9. Measuring Toil
      • Regularly, compute an estimate of how much time is being spent on various types of work
      • Look for patterns or trends in your tickets, surveys, and on-call incident response, and prioritize based on the aggregate human time spent
      https://www.rundeck.com/blog/sre-anti-pattern-known-workaround-bug-closed
      https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
  • 10. Eliminating Toil
      • Treat your automation like any other production system
      • If you have an SLO practice, use some of your error budget to automate away toil
      • Complete postmortems when your automation fails, and fix it as you would any user-facing system
      • You want your automation available to you in any situation, including production incidents, to free humans to do the work they’re good at
      https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles
  • 12. Dr Ganesh Neelakanta Iyer ganesh@ganeshniyer.com ganesh.vigneswara@gmail.com
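
The measurement steps on slides 7 and 9 (record the type of work, its difficulty, and who did it, then prioritize by aggregate human time) could be sketched roughly as below. This is only an illustrative sketch, not something from the slides: the `ToilEntry` record, the function names, and especially the hours assigned to each difficulty bucket in `ASSUMED_HOURS` are assumptions, since the slides give only the ranges (<1 hour, hours, days).

```python
from collections import defaultdict
from dataclasses import dataclass

# ASSUMPTION: representative hands-on hours per difficulty bucket.
# The slides only say Easy (<1 hour), Medium (hours), Hard (days).
ASSUMED_HOURS = {"easy": 0.5, "medium": 4.0, "hard": 16.0}


@dataclass(frozen=True)
class ToilEntry:
    """One logged piece of toil (hypothetical record layout)."""
    work_type: str   # e.g. "quota change", "ACL update"
    difficulty: str  # "easy", "medium", or "hard"
    engineer: str    # who did the work


def hours_by_work_type(entries):
    """Sum the estimated hands-on hours for each type of work."""
    totals = defaultdict(float)
    for entry in entries:
        totals[entry.work_type] += ASSUMED_HOURS[entry.difficulty]
    return dict(totals)


def prioritize(entries):
    """Return (work_type, hours) pairs, largest aggregate time first."""
    totals = hours_by_work_type(entries)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample log, for illustration only.
    log = [
        ToilEntry("quota change", "easy", "alice"),
        ToilEntry("quota change", "easy", "bob"),
        ToilEntry("ACL update", "medium", "alice"),
        ToilEntry("release push", "hard", "carol"),
    ]
    for work_type, hours in prioritize(log):
        print(f"{work_type}: about {hours:.1f} hands-on hours")
```

Sorting by summed hours rather than by ticket count follows the slide's advice to prioritize on aggregate human time: many easy tickets can still cost less than one hard one.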