Helping operations top-heavy
teams the smart way
Jeff Weiner
Chief Executive Officer
Michael Kehoe
Staff Site Reliability Engineer
Todd Palino
Sr Staff Site Reliability Engineer
This Is The Only Slide You May Need a Picture Of
slideshare.net/ToddPalino slideshare.net/MichaelKehoe3
Michael Kehoe
$ WHOAMI
• Staff Site Reliability Engineer @ LinkedIn
• Production-SRE Team
• Former Network Engineer at the
University of Queensland
Todd Palino
$ WHOAMI
• Senior Staff SRE @ LinkedIn
• Capacity Engineering Team
• Co-Author of Kafka: The Definitive Guide
• Late of VeriSign Infrastructure
Engineering
When Operations Isn’t Perfect
Code Yellow
https://devops.com/code-yellow-when-operations-isnt-perfect/
This talk is not
• How to quickly erase all your technical debt
• How to change your engineering culture
This talk is
• How to identify team anti-patterns
• How to work through high toil
• How to create sustainable workloads
Today’s agenda
1 Background
2 Scenario 1: Traffic-SRE
3 Scenario 2: Kafka-SRE
4 Building A Formula For Success
5 Key Learnings
6 Q&A
Background
Personal Experience in the past two years
ASSISTANCE RENDERED
• Traffic-SRE: Technical Debt / Resource Allocation
• Voyager-SRE: Technical Debt
• Capacity War-room
• Espresso-SRE: Reliability
• Kafka-SRE: Capacity and Alert Fatigue
Scenario 1: Traffic-SRE
Problem Statement
Technical Debt
• Written documentation needed
improvement
• Deployment infrastructure needed
investment
• Alert Fatigue
Traffic-SRE
Problem Statement
Resource Allocations
• Backlog of work for clients
• Staff shortage
Scenario 2: Kafka-SRE
Problem Statement
Capacity Planning
• Multi-tenant Infrastructure
• No resource controls
• Unclear resource ownership
• Ad-hoc capacity planning
• Sudden 100% increase in traffic
Problem Statement
Alert Fatigue
• Multiple applications overutilized
• No time for proactive work
• Most alerts non-actionable
Building a formula for
success
Code Yellow
Building a formula for success
• Problem Statement: Define the areas that need attacking
• Communication & Partnerships: Communicate expectations with clients & partners
• Exit Criteria: Define success criteria
• Resource Acquisition: Get the help that you require
• Planning: Plan for short-term & long-term
Define the areas that need attacking
Problem Statement
• Admit there is a problem
• Measure the problem
• Understand the problem
• Determine the underlying causes that need to be fixed
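For an alert-fatigue problem like the Kafka-SRE scenario, "measure the problem" could mean counting how many alerts actually required action. The following is only a minimal sketch: the alert names and the log itself are hypothetical and do not come from the talk.

```python
from collections import Counter

# Hypothetical alert log: (alert name, whether the on-call engineer had to act).
alerts = [
    ("disk_usage_high", False),
    ("disk_usage_high", False),
    ("broker_offline", True),
    ("consumer_lag", False),
    ("disk_usage_high", False),
    ("consumer_lag", True),
]


def actionable_ratio(log):
    """Fraction of alerts that required human action (0.0 for an empty log)."""
    if not log:
        return 0.0
    return sum(1 for _, acted in log if acted) / len(log)


def noisiest(log, top=3):
    """Alert names ranked by how often they fired without needing action."""
    noise = Counter(name for name, acted in log if not acted)
    return noise.most_common(top)


print(f"actionable: {actionable_ratio(alerts):.0%}")  # actionable: 33%
print(noisiest(alerts))  # [('disk_usage_high', 3), ('consumer_lag', 1)]
```

A low actionable ratio puts a number on "most alerts non-actionable", and the ranking suggests which alerts to tune or remove first.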
Define success criteria
Exit Criteria
• Define concrete goals
• Define concrete success criteria
• Measure via an operational metric
• Measure via a project being
completed
• Define timelines for completion
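One way to keep exit criteria concrete is to record each one as either a target on an operational metric or a project-completion flag, together with a deadline. The sketch below assumes that shape; the criterion names, thresholds, and dates are invented for illustration and are not from the talk.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ExitCriterion:
    """One exit criterion (all values used below are illustrative)."""
    name: str
    deadline: date
    # Metric-based criterion: met when current <= target.
    target: Optional[float] = None
    current: Optional[float] = None
    # Project-based criterion: met when the project is completed.
    project_done: Optional[bool] = None

    def is_met(self) -> bool:
        if self.project_done is not None:
            return self.project_done
        if self.target is None or self.current is None:
            return False  # not measured yet, so not met
        return self.current <= self.target


def can_exit(criteria: List[ExitCriterion]) -> bool:
    """The Code Yellow can end only when every criterion is met."""
    return all(c.is_met() for c in criteria)


def overdue(criteria: List[ExitCriterion], today: date) -> List[str]:
    """Names of unmet criteria whose deadline has passed."""
    return [c.name for c in criteria if not c.is_met() and today > c.deadline]


criteria = [
    ExitCriterion("pages per on-call week", date(2019, 6, 30), target=10, current=14),
    ExitCriterion("deployment tooling migrated", date(2019, 5, 31), project_done=True),
]
print(can_exit(criteria))                    # False
print(overdue(criteria, date(2019, 7, 15)))  # ['pages per on-call week']
```

Treating an unmeasured metric as "not met" is a deliberate choice here: a criterion that nobody is measuring should not let the team exit.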
Get the help you require
Resource Acquisition
• Ask other teams for help
• Get dedicated engineers, project managers, or other roles as required
• Set an exit date for resources
Plan for the short-term & long-term
Planning
• Plan out short-term work
• Plan out longer-term projects
• Do they need to be rescheduled?
• Prioritize work that will reduce toil &
burnout (Automation +
Measurement)
Communicate expectations with
clients & partners
Communication & Partnerships
• Communicate problem statement &
exit criteria
• Send regular progress updates
• Ensure that stakeholders
understand delays & expected
outcomes
Key Learnings
• Measure: Measure toil/overhead
• Prioritize: Prioritize efforts to remove overhead/toil
• Communicate: Communicate with partners & teams
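The first two learnings can be combined into a simple calculation: measure what share of the team's time goes to toil, then rank the toil categories to decide what to remove first. This is only a sketch under assumptions: the categories, hours, and the toil/non-toil labels are hypothetical and would come from a team's own time tracking.

```python
from collections import defaultdict

# Hypothetical one-week time log for a team: (category, hours, is_toil).
time_log = [
    ("manual deploys", 12.0, True),
    ("alert triage", 9.0, True),
    ("client requests", 6.0, True),
    ("automation project", 8.0, False),
    ("capacity planning", 5.0, False),
]


def toil_fraction(log):
    """Share of all logged hours spent on toil (0.0 for an empty log)."""
    total = sum(hours for _, hours, _ in log)
    toil = sum(hours for _, hours, is_toil in log if is_toil)
    return toil / total if total else 0.0


def toil_by_category(log):
    """Toil hours per category, largest first, as a prioritization aid."""
    totals = defaultdict(float)
    for category, hours, is_toil in log:
        if is_toil:
            totals[category] += hours
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


print(f"toil: {toil_fraction(time_log):.1%}")  # toil: 67.5%
for category, hours in toil_by_category(time_log):
    print(f"{category}: {hours:.0f}h")
```

Tracking the same number week over week also gives stakeholders a progress figure to include in regular updates.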
Q&A