My talk with Jim Kimball on the tyranny of the SLA; in it, we:
- Deconstruct the purpose of the service level agreement
- Discuss the pitfalls of common SLA clauses, including how current SLAs inhibit the development of resilient systems and the cultivation of a DevOps culture
- Explore other potential SLA models that could foster healthier organizational behaviors and dynamics, and ultimately produce better technical, and therefore business, outcomes
3. THIS PRESENTATION WILL KEEP YOU
AWAKE 99.5% OF THE TIME.
@jpaulreed @jimkimball #velocityconf
4. THE MTTL WILL BE 4.2 MINUTES.
5. YOU AGREE THAT, AS A VALUED
CUSTOMER, ALL TWEETS REGARDING THIS
PRESENTATION MUST BE POSITIVE IN
NATURE
6. IN THE CASE OF DEGRADED
PERFORMANCE OF THE PRESENTATION,
OUR FINANCIAL LIABILITY IS LIMITED TO
THE ASSESSED VALUE OF THIS SESSION.*
* O’Reilly’s valuation pending**
** Value likely to be 1/100th of a cent
9. IN CASE OF EXCESSIVE CELL PHONE
UTILIZATION, YOUR CONFERENCE
ACCESS WILL BE REVOKED.
10. J. PAUL REED
• @JPAULREED ON TWITTER
• HOST OF THE @SHIPSHOWPODCAST
• 15+ YEARS IN BUILD/RELEASE ENGINEERING
• WORK WITH ALL SORTS OF ORGS ON “THE DEVOPS™”
• VISITING SCIENTIST/CHIEF DELIVERY OFFICER AT PRAXISFLOW
11. JIM KIMBALL
• CTO, HEDGESERV
• 25 YEARS IN THE FINANCIAL SOFTWARE INDUSTRY
• @JIMKIMBALL ON TWITTER
• TOC ICO JONAH
• THOUGHTS ON LEADING SOFTWARE ORGANIZATIONS AT SHARINGLUNCH.TUMBLR.COM
16. Remember All Those Nines?
Availability   Year         Quarter      Month
90.0%          36.5 days    9 days       72 hours
99.0%          3.65 days    21.9 hours   7.2 hours
99.5%          1.83 days    11.0 hours   3.6 hours
99.9%          8.76 hours   2.19 hours   43.8 mins
99.99%         52.6 mins    13.1 mins    4.38 mins
99.999%        5.26 mins    77.7 secs    25.9 secs
55.5555555%    162.2 days   40 days      13.3 days
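The downtime figures in a table like this follow from one multiplication: the unavailable fraction times the length of the period. Below is a minimal sketch of that arithmetic. The function name `downtime_budget` and the period lengths (365-day year, 90-day quarter, 30-day month) are assumptions made for illustration, not something the slides specify. Tables of this kind often mix period conventions, so individual cells can differ slightly from what this computes.

```python
from datetime import timedelta

# Assumed period lengths; other conventions (e.g. a 730-hour month) shift results slightly.
PERIOD_DAYS = {"year": 365, "quarter": 90, "month": 30}


def downtime_budget(availability_pct: float, period_days: float) -> timedelta:
    """Return the downtime allowed in a period at the given availability percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1.0 - availability_pct / 100.0
    return timedelta(days=period_days * unavailable_fraction)


if __name__ == "__main__":
    # Print one row per availability level, one column per period.
    for pct in (90.0, 99.0, 99.5, 99.9, 99.99, 99.999):
        cells = [f"{name}={downtime_budget(pct, days)}" for name, days in PERIOD_DAYS.items()]
        print(f"{pct}%", *cells)
```

Under these assumptions, 99.0% over a 30-day month allows about 7.2 hours of downtime, and 99.9% allows about 43.2 minutes.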
21. Definitions Are Hard
What is an “outage”?
Uptime vs. Availability
Maintenance windows?
“Acts of God”
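How much these definitions matter can be shown with a small calculation. The sketch below computes availability for the same month twice, once counting planned maintenance as downtime and once carving it out of the measurement period. The function `availability_pct` and the sample numbers (40 minutes of unplanned outage, 240 minutes of maintenance, a 30-day month) are hypothetical and chosen only to illustrate the gap.

```python
def availability_pct(period_min: float, unplanned_min: float,
                     maintenance_min: float, exclude_maintenance: bool) -> float:
    """Percent availability under one of two definitions of "downtime"."""
    if exclude_maintenance:
        # Maintenance is removed from the measurement period and not counted as downtime.
        measured = period_min - maintenance_min
        down = unplanned_min
    else:
        # Maintenance counts as downtime like any other outage.
        measured = period_min
        down = unplanned_min + maintenance_min
    if measured <= 0:
        raise ValueError("nothing left to measure after exclusions")
    return 100.0 * (measured - down) / measured


if __name__ == "__main__":
    month = 30 * 24 * 60  # assumed 30-day month, in minutes
    print(round(availability_pct(month, 40, 240, exclude_maintenance=False), 2))
    print(round(availability_pct(month, 40, 240, exclude_maintenance=True), 2))
```

With these sample numbers the same month comes out at roughly 99.35% or 99.91%, depending only on which definition the agreement uses.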
26. Every conceivable thing has
been taken into consideration.
That’s why we have what we
call defense in depth.
Now that means backup
systems to backup systems to
backup systems. …
Even with a faulty relay,
even with a stuck valve,
that system works.
27. But we didn’t uncover
[the core], did we?
We stopped it in time for one
simple reason, and I told you
that: the system works.
Dammit, the system works.
That’s not the problem.
28. FOR ABOUT A CENTURY, THEN,
DETERMINISM WAS ASSUMED TO EXIST
AND TO BE THE FIRST REQUIREMENT TO BE
ABLE TO EXERT PRECISE CONTROL OVER
THE WORLD. THIS HAS COME TO DOMINATE
OUR ATTITUDES TOWARD CONTROL.
TODAY, DETERMINISM IS KNOWN TO BE
FUNDAMENTALLY FALSE, AND YET THE
ILLUSION OF DETERMINISM IS STILL
CLUNG ONTO WITH FERVOR IN OUR HUMAN
WORLD OF … COMPUTERS AND
INFORMATION SYSTEMS.
MARK BURGESS
IN SEARCH OF CERTAINTY
33. Amazon Web Services
First, which SLA?
But not for account suspensions and terminations
Ditto maintenance (as defined!)
Also ignore “failures of individual instances or volumes not
attributable to Region Unavailability”
Expect no more than a 10% credit. (Maybe 30%.)
(Doesn’t apply to one-time charges… aka “reserved instances.”)
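To make the size of such a credit concrete, here is a rough sketch of a tiered service-credit calculation. Only the 10% and 30% figures and the exclusion of one-time charges come from the slide. The uptime thresholds in `CREDIT_TIERS`, the function name `service_credit`, and the dollar amounts are assumptions for illustration, and the actual terms of any provider's SLA should be read directly.

```python
# Hypothetical tiers, checked top-down: (minimum monthly uptime %, credit %).
# The thresholds are assumptions; only the 10% / 30% credits appear on the slide.
CREDIT_TIERS = [(99.95, 0.0), (99.0, 10.0), (0.0, 30.0)]


def service_credit(monthly_uptime_pct: float, recurring_charges: float,
                   one_time_charges: float = 0.0) -> float:
    """Credit owed for a month; one-time charges are accepted but earn nothing."""
    if not 0.0 <= monthly_uptime_pct <= 100.0:
        raise ValueError("uptime must be between 0 and 100 percent")
    for floor, credit_pct in CREDIT_TIERS:
        if monthly_uptime_pct >= floor:
            # The credit applies to recurring charges only.
            return round(recurring_charges * credit_pct / 100.0, 2)
    raise AssertionError("unreachable: the last tier has a floor of 0")


if __name__ == "__main__":
    # 99.5% uptime, $10,000 in recurring charges, $50,000 in one-time charges.
    print(service_credit(99.5, 10_000.0, one_time_charges=50_000.0))
```

In this hypothetical month the customer spent $60,000 in total and receives a $1,000 credit, regardless of what the downtime cost their own business.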
34. SLAs ARE SAFEGUARDS YOU PUT INTO
BROKEN RELATIONSHIPS.
- Roy Rapoport, Netflix
36. PagerDuty
Dig for the SLA
Basic plan: “best effort”
Standard plan: “5 minutes”
Enterprise plan: Insurance ($3 million!)
But really: they focus on reliability and resilience
40. “100% SLA availability? Really?!”
“I’ve forwarded your question to our
attorney and he’s suggested that we
remove the reference to 100%.
So we’ll do that ASAP.”
(They did.)
42. Why SLAs In The First Place?
43. THE SLA ELEPHANT IN THE DATACENTER
47. Language Matters
Service Level Agreements versus Service Level Commitments
Service Level Agreements as “relationship agreements”?
54. Brené Brown
“One of the ways we deal
with it is: we numb.”
“We make everything
that’s uncertain, certain.”
“We perfect. And, more
dangerously, we perfect
our kids.”
55. Whether it’s a bailout, an oil
spill, a recall: we pretend like
what we’re doing doesn’t have
a huge impact on other people.
I would say to companies: this
isn’t our first rodeo, people.
We just need you to be
authentic and real and say:
We’re sorry. We’ll fix it.
60. A LONG-TERM RELATIONSHIP
BETWEEN PURCHASER AND SUPPLIER
IS NECESSARY FOR BEST ECONOMY.
…
MORE IMPORTANT THAN PRICE IN THE
JAPANESE WAY OF DOING BUSINESS
IS CONTINUAL IMPROVEMENT OF
QUALITY, WHICH CAN BE ACHIEVED
ONLY ON A LONG-TERM RELATIONSHIP
OF LOYALTY AND TRUST.
W. EDWARDS DEMING
OUT OF THE CRISIS
68. A Minimum Viable SLA
Covers all complexity domains
Involves the business through to the customer
Prompts good behavior among teams…
… and within the organization
Facilitates organizational / team learning
As lightweight as possible