Beyond the Buzzwords

Beyond the Buzzwords - Duncan Winn, Keith Strini, Sean Keery Originally delivered at Cloud Foundry Summit Europe 2017 Basel Switzerland October 11, 2017

Beyond The Buzzwords
Duncan Winn | Platform Engineering | @duncwinn
Sean Keery | Minister of Chaos | @zgrinch
Keith Strini | Federal Practice Lead | @pivotal

Cloud Computing
Containers
Agile
DevOps
CI/CD
Platform Operations
Microservices
Cloud Native

Cloud Native, CI/CD, Cloud, Agile, Microservices, Platform Operations, Containers
What is the VALUE?

What is Value
verb
Estimate the monetary worth of (something): Hard ROI
• Removing Spend
• Hardware / Middleware / OS Reduction
• Automation
noun
The importance, worth, or usefulness of something
• Faster Time to Market
• Innovation
• Delighted Customers

Cloud Native ROI Continuum
Effective Use of CAPEX
Eliminating technical debt earlier, validated features, continuous product evolution reflecting changing user base
● Data driven decisions
● Higher customer spend ratio per investment dollar
● Lower overall subscription churn
● Less “restarts” more evolution
Continuous Experimentation
Reducing the risk of building the wrong thing while nimbly changing direction
● Distributed Tracing/Shared Context (Fast Feedback)
● Identify & test assumptions
● Direct feedback to Design/CFO/CEO
● Lower CAPEX per hypothesis
Waste Reduction
Leveraging a Platform with cloud-ready workloads to remove delivery constraints
● Paired Programming
● CI/CD and better QA/TDD
● Rel-Eng Intelligence
● Automated Resilient Ops
PRACTICES: Cloud Native Enablement, Cloud Native Org, Operations + App Transformation

Value Stream Mapping
Work Flow (Request to Delivery): Code, Deploy, Prod, Support

What is muda - 無駄
Any process that consumes more resources than needed
Muda Type I
Non-value added activity, necessary for end customer
Muda Type II
Non-value added activity, unnecessary for end customer

DevOps Principles
Networking Admin
Security Auditor
QA
Perf Test
Storage Admin
App Architect
Project Manager
Sys Admin
IaaS Admin

Stability
● Blue/Green / Canaries
● Resilience
● Self Healing
Speed
● Env Setup
● Release / Day 2 Automation
Scalability
● Dynamic Routing
● On Demand / Auto Elasticity
Security
● Rotate
● Repair
● Repave
Savings
● Resource Consolidation
● Software Reduction
● Automation

(Diagram: the Request-to-Delivery timeline, with Lead Time split into Value Added Process Time and Non-Value Added Activities (TYPE 2).)

PROVISION: ENV, VM/OS, Middleware
CODE: Develop, Func Test
RELEASE: CI/CD, Test/Stage
DAY 2 OPS: Monitor, Patch, Scale
(Diagram: each of the four stages has its own Request-to-Delivery timeline comparing Lead Time with Value Added Process Time.)
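
The comparison of lead time with value-added process time can be sketched as a small calculation. This is only an illustration: the `Stage` class, the `flow_efficiency` helper, and every number below are assumptions made up for the example, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One step in a value stream, measured in hours (hypothetical model)."""
    name: str
    process_hours: float  # value-added work time
    wait_hours: float     # queueing, handoffs, approvals (non-value-added)

    @property
    def lead_hours(self) -> float:
        # Lead time for a stage is everything between request and delivery.
        return self.process_hours + self.wait_hours


def flow_efficiency(stages: list[Stage]) -> float:
    """Share of total lead time that is spent on value-added work."""
    lead = sum(s.lead_hours for s in stages)
    process = sum(s.process_hours for s in stages)
    return process / lead if lead else 0.0


# Made-up numbers for illustration only.
stream = [
    Stage("Provision", process_hours=2, wait_hours=78),
    Stage("Code", process_hours=40, wait_hours=8),
    Stage("Release", process_hours=4, wait_hours=20),
    Stage("Day 2 Ops", process_hours=4, wait_hours=4),
]

total_lead = sum(s.lead_hours for s in stream)
efficiency = flow_efficiency(stream)
worst = max(stream, key=lambda s: s.wait_hours)

print(f"lead time: {total_lead}h, flow efficiency: {efficiency:.0%}")
print(f"largest wait: {worst.name} ({worst.wait_hours}h)")
```

The stage with the largest wait is the first candidate for waste removal; the closer flow efficiency gets to 1, the closer lead time is to process time.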

Service Level Indicators
An SLI is a quantitative measure of some aspect of the level of service that is provided

Service Level Objectives
An SLO is a target value, or range of values, for a service level that is measured by an SLI

Service Level Agreements
SLAs are a contract with your users that includes consequences of meeting (or missing) the SLOs they contain

Continuous Experimentation
SLIs and SLOs are crucial elements in the control loops used to identify systemic value:
1. Monitor and measure the system’s SLIs.
2. Compare the SLIs to the SLOs, and decide whether or not action is needed.
3. If action is needed, figure out what needs to happen in order to meet the target.
4. Take that action.
5. Review SLOs.
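
One pass of this control loop can be sketched in a few lines. The `SLO` type, the `control_loop_step` function, and the SLI names and targets are all hypothetical, chosen only to show the compare-then-act shape; a real platform would feed in its own metrics and remediation.

```python
from typing import Callable, Dict, List, NamedTuple, Optional


class SLO(NamedTuple):
    """Acceptable range for one SLI; None means unbounded on that side."""
    lower: Optional[float] = None
    upper: Optional[float] = None

    def is_met(self, value: float) -> bool:
        if self.lower is not None and value < self.lower:
            return False
        if self.upper is not None and value > self.upper:
            return False
        return True


def control_loop_step(
    slis: Dict[str, float],
    slos: Dict[str, SLO],
    act: Callable[[str, float, SLO], None],
) -> List[str]:
    """Compare each measured SLI to its SLO and act on every miss."""
    missed = []
    for name, slo in slos.items():
        value = slis[name]
        if not slo.is_met(value):
            act(name, value, slo)  # "take that action"
            missed.append(name)
    return missed


# Hypothetical SLI names and targets, for illustration only.
slos = {
    "request_latency_ms": SLO(upper=100.0),
    "availability": SLO(lower=0.999),
}
measured = {"request_latency_ms": 140.0, "availability": 0.9995}

actions = []
missed = control_loop_step(
    measured, slos, lambda name, value, slo: actions.append((name, value))
)
print(missed)
```

Reviewing the SLOs, the last step of the loop, then amounts to editing the `slos` mapping between runs.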

Waste (muda): Waiting or Delays
Golden signal: Latency
Our Service Level Agreement will be “Real Time Readiness of the Platform.”
Our Service Level Indicators and Objectives:
• Cell Rep Time Synch < 5m
• BBS Time to Run LRP Convergence > 10m
• Auctioneer App Instances Placement Failures > 0.5
• Auctioneer Task Placement Failures > 0.5

Waste (muda): Defects
Golden signal: Errors
Our SLA will be “Proactive Security Mitigation.”
Our Service Level Indicators and Objectives:
• Number of Authn Errors > 10 attempts
• Number of Failed Logons > 4 attempts
• Number of Forbidden SSH Sessions > 2

Waste (muda): Over-production or Extra Features
Golden signal: Saturation
Our SLA will be “Proactive Scaling of the Platform.”
Our SLIs and SLOs:
• Unhealthy Cells = 0
• Remaining Memory - Cell Memory Chunks Available > 4
• Remaining Memory - Overall Memory Available > 4

Waste (muda): Transportation or Handoffs
Golden signal: Traffic
Our SLA will be “Proactive Scaling of our Apps.”
Our SLIs and SLOs:
• Router Throughput > 10000 rps
• # of Requests per Application Instance > 1000 rps
• # of Requests per Application Function > 100 rps
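
A traffic threshold like the per-instance figure above can drive a scaling decision. The sketch below assumes the 1000 rps value is a ceiling that should trigger scaling when exceeded; the slide does not say so explicitly, and `desired_instances` is a hypothetical helper, not part of Cloud Foundry.

```python
import math


def desired_instances(
    total_rps: float,
    max_rps_per_instance: float = 1000.0,  # assumed ceiling, taken from the slide
    min_instances: int = 1,
) -> int:
    """Smallest instance count that keeps per-instance load at or under the ceiling."""
    if max_rps_per_instance <= 0:
        raise ValueError("max_rps_per_instance must be positive")
    return max(min_instances, math.ceil(total_rps / max_rps_per_instance))


# Illustrative traffic levels only.
for rps in (0, 1000, 4500):
    print(rps, "rps ->", desired_instances(rps), "instance(s)")
```

Comparing the result with the current instance count tells you whether to scale out, scale in, or leave the app alone.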

What If?
Company Objective 1 - Release Money: Acquire + Retain Customers
- Key Result 1 - 40% reduction in OPEX
- Key Result 2 - 20% more efficient use of CAPEX for new customer acquisition
Company Objective 2 - Capture and Retain New Market Share
- Key Result 1 - 3 new revenue generating products / quarter
- Key Result 2 - 10% lower churn in new user base vs existing product churn

Effective CAPEX Expenditures
ROIs TRUE VALUE

Fundamentally changes product development - What if customer chose their own Adventure?
Streaming
(ABooks/Movies)
Physical Media
EPurchase
(Physical Prod)
Automated Rec Engine
Manual Rec Engine
No targeted ads
Not Tailored Sales
Tailored Sales
Targeted Ads
Automated Rec Engine
Manual Rec Engine

Complex Assumption Validation via Distributed Trace
(Diagram: an A/B test splits traffic between Assumption 1 (50%), Assumption 2 (50%) and Assumption 3 (0%); each assumption fans out into blue/green (B/G) deployments with traffic shares ranging from 25% to 100%.)

Transforming How The World Builds Software
© Copyright 2017 Pivotal Software, Inc. All rights Reserved.

Beyond the Buzzwords

2. WHAT IF

3. What if….

4. Cloud Computing Containers Agile DevOps CI/CD Platform Operations Microservices Cloud Native

5. Cloud Native, CI/CD, Cloud, Agile, Microservices, Platform Operations, Containers. What is the VALUE?

6. WHY QUANTIFY VALUE?

8. WHAT IS VALUE?

9. verb Estimate the monetary worth of (something): Hard ROI • Removing Spend • Hardware / Middleware / OS Reduction • Automation What is Value noun The importance, worth, or usefulness of something • Faster Time to Market • Innovation • Delighted Customers

10. Effective Use of CAPEX Eliminating technical debt earlier, validated features, continuous product evolution reflecting changing user base ● Data driven decisions ● Higher customer spend ratio per investment dollar ● Lower overall subscription churn ● Less “restarts” more evolution Continuous Experimentation Reducing the risk of building the wrong thing while nimbly changing direction ● Distributed Tracing/Shared Context (Fast Feedback) ● Identify & test assumptions ● Direct feedback to Design/CFO/CEO ● Lower CAPEX per hypothesis Cloud Native Enablement Cloud Native Org PRACTICES PRACTICES Waste Reduction Leveraging a Platform with cloud-ready workloads to remove delivery constraints ● Paired Programming ● CI/CD and better QA/TDD ● Rel-Eng Intelligence ● Automated Resilient Ops Operations + App Transformation PRACTICES Cloud Native ROI Continuum

11. WASTE REDUCTION

12. Code Deploy Prod Support Work Flow Value Stream Mapping Request Delivery

13. Muda Type I Non-value added activity, necessary for end customer Muda Type II Non-value added activity, unnecessary for end customer What is muda - 無駄 Any process that consumes more resources than needed

14. DevOps Principles: Networking Admin, Security Auditor, QA, Perf Test, Storage Admin, App Architect, Project Manager, Sys Admin, IaaS Admin

15. DevOps with Cloud Foundry

16. Stability ● Blue/Green / Canaries ● Resilience ● Self Healing Speed ● Env Setup ● Release / Day 2 Automation Scalability ● Dynamic Routing ● On Demand / Auto Elasticity Security ● Rotate ● Repair ● Repave Savings ● Resource Consolidation ● Software Reduction ● Automation

17. Request Delivery Lead Time Value Added Process Time Non-Value Added Activities (TYPE 2) Lead Time

18. PROVISION ENV VM/OS Middleware CODE Develop Func Test RELEASE CI/CD Test/Stage DAY 2 OPS Monitor Patch Scale Request Delivery Lead Time Value Added Process Time Request Delivery Lead Time Value Added Process Time Request Delivery Lead Time Value Added Process Time Request Delivery Lead Time Value Added Process Time

19.

20. ~30 to 50 people just patching VMs

21. Stability Speed Scalability

22. MEASURING VALUE

23. Dickerson's Hierarchy of Reliability

24. An SLI is a quantitative measure of some aspect of the level of service that is provided Service Level Indicators

25. An SLO is a target value, or range of values, for a service level that is measured by an SLI Service Level Objectives

26. SLAs are a contract with your users that includes consequences of meeting (or missing) the SLOs they contain Service Level Agreements

27. VALUE TEST DRIVEN OPERATIONS

28. SLIs and SLOs are crucial elements in the control loops used to identify systemic value: Monitor and measure the system’s SLIs. Compare the SLIs to the SLOs, and decide whether or not action is needed. If action is needed, figure out what needs to happen in order to meet the target. Take that action. Review SLO’s. Continuous Experimentation

29. Software Value Map

30. VALUE AVOID WASTE

31. Our Service Level Agreement will be “Real Time Readiness of the Platform.” Our Service Level Indicators and Objectives: Cell Rep Time Synch < 5m BBS Time to Run LRP Convergence > 10m Auctioneer App Instances Placement Failures > 0.5 Auctioneer Task Placement Failures > 0.5 Waiting or Delays Latency

32. Our SLA will be “Proactive Security Mitigation.” Our Service Level Indicators and Objectives: Number of Authn Errors > 10 attempts Number of Failed Logons > 4 attempts Number of Forbidden SSH Sessions > 2 Defects Errors

33. Our SLA will be “Proactive Scaling of the Platform.” Our SLIs and SLOs: Unhealthy Cells = 0 Remaining Memory - Cell Memory Chunks Available > 4 Remaining Memory - Overall Memory Available > 4 Over-production or Extra Features Saturation

34. Transportation or Handoffs Traffic Our SLA will be “Proactive Scaling of our Apps.” Our SLI and SLO’s: Router Throughput > 10000 rps # of Request per Application Instance > 1000 rps # of Request per Application Function > 100 rps

35. Continuous Experimentation Cycles

36. Cloud Native ROI Continuum (repeat of slide 10)

37. FUTURE VALUE

38.

39. Company Objective 1 - Release Money: Acquire + Retain Customers - Key Result 1 - 40% Redux in OPEX - Key Result 2 - 20% more efficient use of CAPEX for new customer acquisition Company Objective 2 - Capture and Retain New Market Share - Key Result 1 - 3 new revenue generating products / quarter - Key Result 2 - 10% lower churn in new user base vs existing product churn What If?

40. CONTINUOUS EXPERIMENTATION

41.

42.

43. Effective CAPEX Expenditures ROIs TRUE VALUE

44.

45.

46. CHOOSE YOUR OWN ADVENTURE

47. Fundamentally changes product development - What if customer chose their own Adventure? Streaming (ABooks/Movies) Physical Media EPurchase (Physical Prod) Automated Rec Engine Manual Rec Engine No targeted ads Not Tailored Sales Tailored Sales Targeted Ads Automated Rec Engine Manual Rec Engine

48. Complex Assumption Validation via Distributed Trace Assumption 1 - 50% B/G - 100% B/G - 100% B/G - 50% B/G - 25% B/G - 50 % A/B Test B/G - 40% Assumption 3 - 0% Assumption 2 - 50% B/G - 60% B/G - 50% B/G - 25% B/G - 25% B/G - 50 % B/G - 33 % B/G - 33 % B/G - 33 %

49.

50. SUMMARY TAKE AWAYS

51. Cloud Native ROI Continuum (repeat of slide 10)

Editor's Notes

Good Morning, ….
THE essence of this talk is a “what if” {CLICK}
WHAT IF {CLICK} For everything you build... {CLICK} You can measure the VALUE. And by value we don’t mean indicative value (such as simply believing this feature is important) {CLICK} we mean quantitative value: the feature has produced this much revenue. {CLICK} This talk will unpack how, by using Cloud Foundry and BUILDING a continuous feedback loop to measure the correct set of Service Level Indicators, you can start to model your software delivery practices around MEASURABLE value. {CLICK}
To back up a little: the title of this talk, “buzzword bingo”, came about because my colleagues and I have spent a considerable portion of our careers explaining buzzwords to people. By buzzwords, for anyone who’s not heard my previous buzzword talk, I don’t just mean the literal definition - I mean actually taking the time to understand the value intended behind these various trends in IT. For example, Agile is more than sprints and stand-ups; it’s about getting features into the hands of end users quickly. DevOps is an actual culture and not just a tool chain. And to that point, as some of you know, I wrote a book on Cloud Foundry and how CF supports these various trends. {CLICK}
Once you have all these buzzwords down: You’ve chosen your IaaS layer and built a hardened CF environment. You’ve defined your container strategy and leveraged your app architecture, probably involving some level of microservices. You’ve structured your teams to adopt a DevOps culture with Agile delivery, underpinned by a centralised platform operations team. You use pipelines to deploy EVERYTHING. At this point you are probably feeling pretty good about yourself, as you’ve obtained Cloud Native Jedi status. {CLICK} But getting here comes at a cost, both in terms of time and resource. And at this point folks often start to question the value. {CLICK} Now the first question to ask here is… {CLICK}
WHY do we care - why quantify the value? This is a question you should ask yourselves daily in whatever you do. Why should people care about the value you deliver? If you don’t know the answer, or at least can’t find that answer out, maybe you should stop doing what you are doing. {CLICK}
You’ve built out your Cloud Native ecosystem. {CLICK} As a platform operator and platform champion you are really proud of this environment, so you start to tell other LOBs about it, and then many, many developers like it and start to use it. {CLICK} At this point the senior execs tend to take notice, and they start to question whether this is a strategy that is good for a select LOB or whether they should roll it out to all the other LOBs. To make that decision they often want to know “the value”. {CLICK} So how do you start to measure and quantify the value behind this decision?
Before we jump into measuring and quantifying the value behind your Cloud Native decision, we need to spend a bit of time really unpacking what we mean by value. {CLICK}
When we talk about value it means different things to different people, so let’s break it down: {CLICK} There is indicative value - the noun - which is harder to measure and quantify, and because it can be intangible, it may be harder to realise monetary gain as a result. {CLICK} The noun is where most Cloud Foundry talks spend their time - these amazing bold statements that what could be done in 10 months can now be done in 10 weeks. {CLICK} Then there is the verb - the doing - the realisation of value. {CLICK} The verb is the Hard ROI - for example, the removal of fixed costs.
For this talk we are going to look at the progression of realising value, which is really the Cloud Native ROI Continuum. SEAN: It starts with waste reduction (both OPEX and CAPEX reduction).
So we start our value journey with waste reduction. This is the most fundamental task to tackle {CLICK}
The best way to quantify waste reduction is through an activity called value stream mapping. Value stream mapping is a lean-management method for analyzing: the current state and then designing a future state. {CLICK} A value stream is defined as the flow of work from a request to the DELIVERY. {CLICK}
There is an ethos in LEAN around waste reduction. VSM is primarily concerned with reducing as much waste as possible. It stems from the word muda (“moo-da”). Example: platform upgrades (for a CVE) / packaging a release.
LEAN theory started off in manufacturing and later became popular within DevOps communities. And it’s easy to see why: each value stream typically crosses multiple functions within an organisation. {CLICK} By aligning to a DevOps culture you can begin to reduce handoffs and eliminate waste.
So the adoption of DevOps is one way of reducing waste. But couple DevOps with using CF in the correct way and you achieve a significant amount of waste reduction. That’s because CF offers a unified platform strategy: a single point of convergence that everyone rallies behind to obtain a distinct set of benefits. {CLICK}
You get increased speed through reducing waste in setting up new environments. I have a question for the audience: without using CF, how long does it take to provision a VM? Ask an operator and typically they’ll say something in the order of minutes to hours. Ask a developer and they will typically tell you a matter of weeks. The worst timeline for provisioning a VM, I kid you not, is 18 months. With Cloud Foundry your environment is already set up - no need to raise a ticket and get a VM and middleware set up. This dynamic provisioning of environments reduces operations work and saves considerable time. {CLICK} For stability, BOSH and CF are self-healing, and further, you can easily leverage deployment patterns such as blue/green and feature flags. This saves operational resource. {CLICK} You can auto-scale the platform and apps based on set criteria, and you remove the need for manual tasks like route configuration for every new app. {CLICK} And finally, but arguably most importantly, the security posture gets significantly strengthened. So what that means for reducing waste, through a mix of increased automation and self-service, is that Cloud Foundry provides {CLICK} better resource consolidation, resulting in HW and SW reduction and more efficient use of operators’ time and skills.
It assesses the events required to take a feature from inception to production. So what else can be done? Remove all non-value-add activities so that lead time = process time.
The best way to understand where problems start is by performing an activity called value stream mapping. Every organization has many value streams, defined as the flow of work from a customer request to the fulfillment of that request. Each value stream will cross multiple functions within an organization Value stream mapping is a lean-management method for analyzing the current state and designing a future state for the series of events that take a product or service from its beginning through to the customer.
Every organisation has many value streams, Each value stream will cross multiple functions within an organisation And it’s important to map out the entire flow from request to delivery for every Value stream that feeds into software delivery.
We’ve done this for some of our customers. When you actually sit down and map out the effort that goes into something simple like creating a VM, the results can be staggering.
Introduce yourself. Now you’ve become fully cloud native - what about realizing the value? Going Cloud Foundry is not for free - it takes time, effort and change - all things people often associate with pain. So what is the value?
Buzzwords Google SRE Golden Signals? Resilient operations Bottom three roll up to test-driven operations Add scaling events to muda
Example : Number of people in the room Most services consider request latency—how long it takes to return a response to a request—as a key SLI. Other common SLIs include the error rate, often expressed as a fraction of all requests received, and system throughput, typically measured in requests per second. The measurements are often aggregated: i.e., raw data is collected over a measurement window and then turned into a rate, average, or percentile. Ideally, the SLI directly measures a service level of interest, but sometimes only a proxy is available because the desired measure may be hard to obtain or interpret. For example, client-side latency is often the more user-relevant metric, but it might only be possible to measure latency at the server. Another kind of SLI important to SREs is availability, or the fraction of the time that a service is usable. It is often defined in terms of the fraction of well-formed requests that succeed, sometimes called yield. (Durability—the likelihood that data will be retained over a long period of time—is equally important for data storage systems.) Although 100% availability is impossible, near-100% availability is often readily achievable, and the industry commonly expresses high-availability values in terms of the number of “nines” in the availability percentage. For example, availabilities of 99% and 99.999% can be referred to as “2 nines” and “5 nines” availability, respectively, and the current published target for Google Compute Engine availability is “three and a half nines”—99.95% availability.
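
The availability arithmetic in the note above can be checked with a short calculation. This is a sketch under simple assumptions: availability is taken as the fraction of successful requests (yield), "nines" is computed as a negative base-10 logarithm, and the function names are made up for the example.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years for simplicity


def availability(successful: int, total: int) -> float:
    """Availability expressed as the fraction of requests that succeed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successful / total


def nines(avail: float) -> float:
    """Number of nines: roughly 2 for 99%, roughly 5 for 99.999%."""
    if not 0 <= avail < 1:
        raise ValueError("availability must be in [0, 1)")
    return -math.log10(1 - avail)


def downtime_minutes_per_year(avail: float) -> float:
    """Unavailability budget implied by an availability target."""
    return (1 - avail) * MINUTES_PER_YEAR


for target in (0.99, 0.9995, 0.99999):
    print(
        f"{target:.3%}: {nines(target):.2f} nines, "
        f"{downtime_minutes_per_year(target):.1f} min/year"
    )
```

On this logarithmic scale 99.95% comes out at about 3.3 rather than 3.5; the "three and a half nines" label counts the trailing 5 as half a nine.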
Example: Duncan thought we could get 40. Stretch goal was 50 A natural structure for SLOs is thus SLI ≤ target, or lower bound ≤ SLI ≤ upper bound. For example, we might decide that we will return Shakespeare search results “quickly,” adopting an SLO that our average search request latency should be less than 100 milliseconds. Alert on these. Choosing an appropriate SLO is complex. To begin with, you don’t always get to choose its value! For incoming HTTP requests from the outside world to your service, the queries per second (QPS) metric is essentially determined by the desires of your users, and you can’t really set an SLO for that. On the other hand, you can say that you want the average latency per request to be under 100 milliseconds, and setting such a goal could in turn motivate you to write your frontend with low-latency behaviors of various kinds or to buy certain kinds of low-latency equipment. (100 milliseconds is obviously an arbitrary value, but in general lower latency numbers are good. There are excellent reasons to believe that fast is better than slow, and that user-experienced latency above certain values actually drives people away— see “Speed Matters” [Bru09] for more details.) Again, this is more subtle than it might at first appear, in that those two SLIs—QPS and latency—might be connected behind the scenes: higher QPS often leads to larger latencies, and it’s common for services to have a performance cliff beyond some load threshold. Choosing and publishing SLOs to users sets expectations about how a service will perform. This strategy can reduce unfounded complaints to service owners about, for example, the service being slow. Without an explicit SLO, users often develop their own beliefs about desired performance, which may be unrelated to the beliefs held by the people designing and operating the service. 
This dynamic can lead to both over-reliance on the service, when users incorrectly believe that a service will be more available than it actually is (as happened with Chubby: see “The Global Chubby Planned Outage”), and under-reliance, when prospective users believe a system is flakier and less reliable than it actually is.
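The "lower bound ≤ SLI ≤ upper bound" structure can be sketched as a small check that also returns the list of SLOs to alert on. This is an illustration under stated assumptions: the `SLO` class, `violations` helper, and the measured values are hypothetical; only the 100 millisecond latency target comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SLO:
    name: str
    lower: Optional[float] = None  # None means no lower bound
    upper: Optional[float] = None  # None means no upper bound

    def is_met(self, sli_value: float) -> bool:
        """True when lower <= SLI <= upper for the bounds that are set."""
        if self.lower is not None and sli_value < self.lower:
            return False
        if self.upper is not None and sli_value > self.upper:
            return False
        return True


def violations(slos, measurements):
    """Names of SLOs whose measured SLI is out of bounds (candidates for alerts)."""
    return [s.name for s in slos if not s.is_met(measurements[s.name])]


slos = [
    SLO("avg_search_latency_ms", upper=100.0),  # target from the text
    SLO("availability", lower=0.999),           # hypothetical target
]
measured = {"avg_search_latency_ms": 130.0, "availability": 0.9995}  # hypothetical
to_alert = violations(slos, measured)
```

QPS is deliberately absent from the list: as the notes say, it is set by user demand, so it is measured but not given a target.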
Example: if we don’t get 50 people signed up, Sean will trim his beard.
An SLA attaches consequences to meeting or missing the SLOs it contains. The consequences are most easily recognized when they are financial (a rebate or a penalty), but they can take other forms. An easy way to tell the difference between an SLO and an SLA is to ask “what happens if the SLOs aren’t met?”: if there is no explicit consequence, then you are almost certainly looking at an SLO.
So we start our value journey with waste reduction. This is the most fundamental task to tackle {CLICK}
Find your biggest waste, eliminate it. Hypothesize on your next one. Validate through metrics. Fix it. Re-evaluate your objectives periodically.
Inspired by the balanced scorecard. We’ll map the five measurable value components to our SLAs, SLIs, and SLOs: “VC1.1.1.1: Functionality”, “VC1.1.1.2: Reliability”, “VC1.1.1.3: Usability”, “VC1.1.1.4: Maintainability” and “VC1.1.1.5: Portability”. The hierarchy runs value perspective (VP), value aspect (VA), sub-value aspect (SVA), value component (VC). An excerpt from the Software Value Map (Khurum et al., 2012).
Now you’ve become fully cloud native. What about realizing the value? Adopting Cloud Foundry is not free: it takes time, effort, and change, all things people often associate with pain. So what is the value?
Example of value lost through waste: can’t push my bug fix for an app.
People waiting for information:
• Waiting for complete requirements from the customer
• Waiting for months for project approval
• Waiting for resources to be assigned
• Waiting for the whole system to be done before one can get the key features that he/she really needs
Information waiting for people: detailed requirements specification created upfront.
Latency in your systems: try some distributed tracing to find your bottlenecks.
Cycle time: whenever goods are not in transport or being processed, they are waiting. In traditional processes, a large part of an individual product's life is spent waiting to be worked on.
Metric: time to deliver platform updates; downtime; lag between updates and upgrades of platforms (stemcells, buildpacks, versions):
o Cell Rep Time Sync < 5m
o BBS Time to Run LRP Convergence > 10m
o Auctioneer App Instances Placement Failures > 0.5
o Auctioneer Task Placement Failures > 0.5
o Cloud Controller and Diego in Sync
o Router Server Error
o Router Error: 502 Bad Gateway
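To make the waiting waste above measurable, one option is to record how long a work item sat in a queue versus how long it was actually worked on at each step. The sketch below assumes a hypothetical `Stage` record and invented hours for a bug fix; none of these names or numbers come from the talk.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    queued_h: float  # hours the item waited before work started in this stage
    worked_h: float  # hours of hands-on work in this stage


def cycle_time(stages):
    """Total elapsed hours from request to delivery."""
    return sum(s.queued_h + s.worked_h for s in stages)


def waiting_share(stages):
    """Fraction of the cycle time spent waiting rather than being worked on."""
    total = cycle_time(stages)
    return sum(s.queued_h for s in stages) / total if total else 0.0


def biggest_wait(stages):
    """The stage with the longest queue time: the first waste to attack."""
    return max(stages, key=lambda s: s.queued_h)


# Hypothetical bug-fix flow.
bug_fix = [
    Stage("code", queued_h=2, worked_h=6),
    Stage("approval", queued_h=40, worked_h=1),
    Stage("deploy", queued_h=8, worked_h=1),
]
worst = biggest_wait(bug_fix)
```

Feeding the same calculation with span timings from distributed tracing would apply the idea to request latency instead of delivery work.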
Example: $300k AWS bill.
• Leaving testing towards the end
• Not finding defects as early as possible
• Lack of disciplined reviews, tests, verification
Whenever defects occur, extra costs are incurred reworking the part, rescheduling production, etc. This results in labor costs and more time in "work-in-progress". Defects in practice can sometimes double the cost of a single product. This should not be passed on to the consumer and should be taken as a loss.
Metric: failed BOSH deployments; failed application starts; failed stagings; application errors; routes not found; failing smoke and acceptance tests.
If the SLO is Proactive Security Mitigation, the SLIs are:
o Number of authn errors > _attempts
o Number of failed logons > _attempts
o Number of forbidden ssh sessions > _attempts
Example: Black Friday crash - Target.
Overproduction occurs when more product is produced than is required at that time by your customers. Tie this to capacity planning in the Dickerson hierarchy.
• Functionality that is not required by the customer
• Features for which markets are not ready
• Creation of unnecessary data and information
Do you need 18 foundations or five availability zones?
One common practice that leads to this muda is the production of large batches, as consumer needs often change over the long times large batches require. Overproduction is considered the worst muda [8] because it hides and/or generates all the others. Overproduction leads to excess inventory, which then requires the expenditure of resources on storage space and preservation, activities that do not benefit the customer.
Metric: cell capacity used
o Router Latency > 50_ms
o VM Health
o VM Memory Used > 80%
o VM CPU utilization > 85%
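A rough sketch of how the capacity metrics above could flag both sides of the problem: cells that are saturated and cells that look overproduced. The 80% memory and 85% CPU limits come from the notes; applying them to Diego cells, the `Cell` record, the sample data, and especially the 30% idle floor are assumptions made for illustration.

```python
from dataclasses import dataclass

MEMORY_ALERT_PCT = 80.0  # from the notes: VM Memory Used > 80%
CPU_ALERT_PCT = 85.0     # from the notes: VM CPU utilization > 85%
IDLE_FLOOR_PCT = 30.0    # hypothetical: below this, capacity looks overproduced


@dataclass
class Cell:
    name: str
    memory_used_pct: float
    cpu_used_pct: float


def classify(cell):
    """Label one cell as saturated, over-provisioned, or ok."""
    if cell.memory_used_pct > MEMORY_ALERT_PCT or cell.cpu_used_pct > CPU_ALERT_PCT:
        return "saturated"
    if cell.memory_used_pct < IDLE_FLOOR_PCT and cell.cpu_used_pct < IDLE_FLOOR_PCT:
        return "over-provisioned"
    return "ok"


def capacity_used_pct(cells):
    """Average memory utilisation across cells, as a simple 'capacity used' figure."""
    return sum(c.memory_used_pct for c in cells) / len(cells)


# Hypothetical foundation with three cells.
cells = [
    Cell("cell-0", memory_used_pct=86, cpu_used_pct=40),
    Cell("cell-1", memory_used_pct=12, cpu_used_pct=8),
    Cell("cell-2", memory_used_pct=55, cpu_used_pct=60),
]
report = {c.name: classify(c) for c in cells}
```

Tracking the over-provisioned label over time is one way to question whether every foundation or availability zone is really needed.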
Example: self-DDoS through a retry storm without a circuit breaker or exponential backoff.
• Difficulty transferring tacit knowledge (for example, design decisions and rationale)
• Incompatible information types (drawings vs. digital descriptions)
• Incompatible software systems or tools
• Lack of availability, knowledge, or training in conversion and linking systems
• Lack of access
• Information silos
Each time a product is moved it stands the risk of being damaged, lost, delayed, etc., as well as being a cost for no added value. Transportation does not make any transformation to the product that the consumer is willing to pay for.
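The retry-storm example can be illustrated with a small sketch that combines exponential backoff (with full jitter) and a simple circuit breaker. This is not code from the talk or from any particular library: the class and function names, thresholds, and delays are all hypothetical, and a production system would more likely rely on an existing resilience library.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without trying."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open, not calling downstream")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # a success closes the circuit again
        return result


def backoff_delays(retries, base_s=0.1, cap_s=5.0, rng=random.random):
    """Full jitter: delay n is rng() * min(cap_s, base_s * 2**n)."""
    return [rng() * min(cap_s, base_s * 2 ** n) for n in range(retries)]


def call_with_retries(fn, breaker, retries=4, sleep=time.sleep):
    """Try once, then retry with backoff; stop at once if the breaker opens."""
    last_error = None
    for delay in [0.0] + backoff_delays(retries):
        if delay > 0:
            sleep(delay)
        try:
            return breaker.call(fn)
        except CircuitOpenError:
            raise  # do not keep hammering a failing dependency
        except Exception as err:
            last_error = err
    raise last_error
```

The jitter spreads retries from many clients over time, and the breaker caps how many failing calls any one client sends, which together keep a downstream outage from turning into a self-inflicted flood.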
I’ve bounded my discussion of value stream mapping to platform operations. Keith is going to talk about how this fits into your product development value stream.
For this talk we are going to look at the progression of realising value, which is really the Cloud Native ROI Continuum. SEAN: It starts with waste reduction (both OPEX and CAPEX reduction).
OPEX reduction is an excellent opening line, but it isn’t the true promise of the conversation. We need to move the conversation forward from the reduction in software capitalization to the effective use of CAPEX.
This is achieved by making hypothesis market testing a first-class citizen in the feedback loops to CFO/CEO decision making. That means observing users dealing with a problem, forming assumptions, continuously validating those assumptions, and estimating the value of a feature that users recognize will solve the problem and are willing to pay or subscribe for. Features that don't resonate get sunset sooner, which reduces technical debt earlier while increasing stickiness through continued budget for development and evolution. The savings from pruning technical debt early are quickly reallocated to new hypotheses.
Product development can remain objective and more focused on what the user is telling them rather than guessing product directions based on unvalidated assumptions. This allows them to turn what the instrumentation has discovered into actionable insights.
It liberates the product and the company, enabling them to evolve with customers’ changing values and preferences. Even after product features have been delivered, the granularity of microservices design promises the clean modularity needed to keep collaborating and learning with your evolving user base. Circumstances change, so products can remain relevant long after the initial benefits.
A/B and blue/green (B/G) testing tell us yes or no, but not why. In conjunction with microservices and distributed tracing, they can be used as primitives to build up complex, objectively instrumented hypotheses that let us verify what we believed going into each release: which behavior was validated and which was invalidated.
Marketing shifts from telling users how to use the product to responding to how users want to use it, creating a stickier user base and attracting new subscribers. Jobs Theory alludes to this in its postulation that users don’t simply use products; they hire products to complete a job for them. The job is the progress that a person is trying to make in a particular circumstance.
Sean’s live demo
New World of Customer Understanding
• Instrumenting highest bounce rate services with gamification to gather motivation and intent
• Clearly understand abandonment points in your funnel
• Incentivize exit survey behaviors
• Clearly understand what is most valuable to your product
• B/G deployments help isolate the attraction and understand how to expand the attraction to other areas in the product
• UI/UX heat maps combined with A/B with navigation and design layout variations
• Is it a UI/UX problem or usability problem?
• A/B with entire workflows
• Understand which threshold signifies the conversion tipping point
• Measure impact through the entire funnel, not just the application as a whole
All of this is now measurable thanks to the granularity of cloud native/microservices architecture and the fact that platforms like PCF offload the technical complexity of managing it. This type of agility to respond to a market is the payoff behind all of this buzz.


Editor's Notes