
Estimation is dead - long live sizing, by John Coleman, 24 Nov 2022


Estimation is dead - long live sizing, presented by John Coleman on 24 Nov 2022 to Agile Azerbaijan in person and to Pozitive Technologies online.

As per https://www.infoq.com/articles/sizing-forecasting-scrum/


  1. Estimation is dead - long live sizing
     John Coleman @JohnColemanIRL
     https://linktr.ee/johncolemanxagility
     https://www.infoq.com/articles/sizing-forecasting-scrum/
  2. "Estimate" is no longer in the Scrum Guide
     Product Backlog: "The Product Backlog is an emergent, ordered list of what is needed to improve the product. It is the single source of work undertaken by the Scrum Team. Product Backlog items that can be Done by the Scrum Team within one Sprint are deemed ready for selection in a Sprint Planning event. They usually acquire this degree of transparency after refining activities. Product Backlog refinement is the act of breaking down and further defining Product Backlog items into smaller more precise items. This is an ongoing activity to add details, such as a description, order, and size. Attributes often vary with the domain of work. The Developers who will be doing the work are responsible for the sizing. The Product Owner may influence the Developers by helping them understand and select trade-offs."
  3. "Forecast" in the Scrum Guide
     The Sprint: "Various practices exist to forecast progress, like burn-downs, burn-ups, or cumulative flows. While proven useful, these do not replace the importance of empiricism. In complex environments, what will happen is unknown. Only what has already happened may be used for forward-looking decision making."
     Sprint Planning: "Selecting how much can be completed within a Sprint may be challenging. However, the more the Developers know about their past performance, their upcoming capacity, and their Definition of Done, the more confident they will be in their Sprint forecasts."
  4. So is estimation dead? First, let's look at sizing.
  5. Sizing caveats
     - The people who do the work do the sizing, no one else.
     - Complex work is uncomparable - when dealing with complexity, know that these techniques are almost always inaccurate.
     - If we don't "clean up the kitchen" as a habit, the accumulating mess will lead to work taking longer than before.
     - The most popular sizing techniques are based either on data or on educated guesses.
  6. Sizing options
     - Relative estimation
     - Flow metrics or counting valuable items
     - Rightsizing
     - #NoEstimates
  7. Flow metrics - Kanban Guide for Scrum Teams (a minimal code sketch follows below)
     - Throughput: the number of Product Backlog items finished per unit of time
     - Cycle Time: end date minus start date, plus one
     - Work Item Age: the elapsed calendar time between when a work item started and the current time; this applies only to items still in progress
     - Work in Progress (WIP): the number of work items started but not finished
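For readers who want to see the arithmetic, here is a minimal sketch of the four metrics, assuming each work item records a start date and, once finished, an end date. The WorkItem class and field names are illustrative, not taken from the deck or the Kanban Guide.

```python
# Minimal sketch of the four flow metrics from the slide above.
# WorkItem and its field names are illustrative assumptions.
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class WorkItem:
    name: str
    started: date
    finished: Optional[date] = None  # None means still in progress


def throughput(items: list[WorkItem], start: date, end: date) -> int:
    """Number of items finished within the period (items per unit of time)."""
    return sum(1 for i in items if i.finished and start <= i.finished <= end)


def cycle_time_days(item: WorkItem) -> int:
    """End date minus start date, plus one."""
    return (item.finished - item.started).days + 1


def work_item_age_days(item: WorkItem, today: date) -> int:
    """Elapsed calendar days since the item started; only meaningful while unfinished."""
    return (today - item.started).days + 1  # mirrors the cycle-time "+1" convention


def wip(items: list[WorkItem]) -> int:
    """Number of items started but not yet finished."""
    return sum(1 for i in items if i.finished is None)
```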
  8. Flow metrics
  9. Relative estimation (1 of 2)
     - Time reference: comparing current work items to the time it took to complete historical reference items
     - Assigning numeric values: for example, story points based on the Fibonacci sequence, often assigned collaboratively with playing cards (planning poker)
     - T-shirt sizing: assigning s, s/m, m, m/l, xl, xxl, xxxl, xxxxl to Product Backlog items instead of a numeric value
     - Wall estimation: assigning numeric values by collaboratively placing and moving cards on a wall, also referred to as magic estimation or silent estimation
  10. Relative estimation (2 of 2)
     - It comes in different flavors.
     - If you estimate, the best that can happen is that the estimates are correct.
     - Estimates are prone to the "flaw of averages" (Sam Savage). Is 50:50 an excellent way to set expectations? (See the sketch below.)
     - The average of independent blind assessments can be near enough to the truth (credit to Dave Snowden) - but how often are estimates truly blind in Scrum Teams?
     - If you don't estimate at all, you don't waste time; hopefully, you will discover/deliver outcomes sooner.
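To make the "flaw of averages" point concrete, here is a toy simulation that is not from the deck; the distribution and all numbers are invented purely for illustration. The pattern it shows holds for any right-skewed durations: a plan built from per-item averages is overrun roughly half the time, whereas a percentile communicates the odds.

```python
# Toy illustration of the "flaw of averages" with invented, right-skewed durations.
import random

random.seed(1)
N_ITEMS = 20
RUNS = 10_000


def item_duration_days() -> float:
    # Right-skewed: usually quick, occasionally much slower (invented distribution).
    return random.lognormvariate(1.0, 0.8)


# Plan the way averages do: number of items times the average item duration.
average_item = sum(item_duration_days() for _ in range(RUNS)) / RUNS
plan_from_averages = N_ITEMS * average_item

# Simulate the whole project many times and look at the distribution instead.
totals = sorted(sum(item_duration_days() for _ in range(N_ITEMS)) for _ in range(RUNS))
chance_late = sum(1 for t in totals if t > plan_from_averages) / RUNS

print(f"plan from averages: {plan_from_averages:.0f} days")
print(f"chance of overrunning that plan: {chance_late:.0%}")
print(f"85th percentile of simulated totals: {totals[int(0.85 * RUNS)]:.0f} days")
```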
  11. Rightsizing
     - How much time could you save by caring more about whether the team can complete an item within the Sprint and less about making that item infinitely smaller?
     - Think of the reduced cognitive load on the Product Owner resulting from fewer PBIs.
     - Counting the number of (valuable, right-sized) PBIs delivered to Done per Sprint is valuable for Sprint Planning and for forecasting goals.
     - If throughput is sporadic or irregular, we have more significant problems than forecasting; we have a "plumbing problem".
     - Using average throughput also pursues the "flaw of averages"; Monte Carlo probabilistic forecasting is preferable.
  12. One interpretation of #NoEstimates
     - Strive for an even distribution of "ballpark" item sizes throughout a backlog.
     - Count running tested stories or running tested features to demonstrate progress in output terms.
     - Focus on simplifying the what for the why - a focus on desired outcomes.
     - Right-sizing: identify small-enough items for intake.
     - Slicing items into 24-hour timeboxes encourages the creation of experiments that validate assumptions/hypotheses towards a goal; discover to deliver.
     - Use rolling-wave forecasts to communicate uncertainty.
  13. One interpretation of #NoEstimates
     - Counting the number of (valuable, right-sized) PBIs delivered to Done per Sprint is valuable for Sprint Planning and for forecasting goals.
     - A "rolling wave forecast" based on throughput, with variance limits, is preferable.
  14. Time reference
     Potential downsides
     - Requires suitable reference items from the past
     - Prone to abuse by people with a focus on people utilization
     - Unsuitable for probabilistic forecasting
     Potential upsides
     - Speaks in the customer's language
     - Easy to pick reference items from the past
     - Waiting time is included in our memory of how long work takes
     - Simple to do
  15. Story points
     Potential upsides
     - Useful to avoid bringing "elephants" into work in progress
     - Could be used to limit work in progress
     - Easy to pick reference items from the past
     - Simple to do
     - Developers like the conversation it triggers
     - Often paired with t-shirt sizing or wall estimation
     - Could be combined with probabilistic forecasting, but should it be?
     Potential downsides
     - The creator of story points regrets them
     - Only meaningful within the team
     - Story point inflation
     - BS story points
     - Often paired with planning poker (time consuming)
  16. T-shirt sizes
     Potential upsides
     - Useful to avoid bringing "elephants" into work in progress
     - Could be used to limit work in progress
     - Easy to pick reference items from the past
     - Developers like the conversation it triggers
     - Simple to do
     - Requires very little detail
     Potential downsides
     - Quite often converted to numbers, and those numbers get used to forecast when work might be done
  17. Wall / table estimation
     Potential upsides
     - Useful to avoid bringing "elephants" into work in progress
     - Could be used to limit work in progress
     - Easy to pick reference items from the past
     - Developers like the conversation it triggers
     - Simple to do
     - Requires very little detail
     - Typically guesstimates potential value as well as effort, priming ordering by value divided by size
     - Really quick
     Potential downsides
     - Quite often converted to numbers, and those numbers get used to forecast when work might be done
     - Often one and done - should be revisited regularly
  18. Guesstimating / counting the number or range of items to deliver a goal
     Potential upsides
     - Suitable for recurring probabilistic forecasting or rolling-wave forecasting, giving dates and uncertainty
     - Developers can "ballpark" the range
     - Useful for sizing a chunk of Product Backlog, e.g., "elephant"-sized items in the Product Backlog
     - Can be used across teams
     Potential downsides
     - People prefer relative sizing and almost "cannot let go"
     - Misunderstood as requiring all items to be of equal size
     - For non-software work, dissimilar Product Backlog items can make it like comparing apples with oranges
     - Prone to the use of averages
  19. Rightsizing
     Potential upsides
     - Simple
     - Less "analysis paralysis"
     - Supports recurring probabilistic forecasting
     Potential downsides
     - Items are right-sized only just in time or during Product Backlog refinement
     - Misunderstood as requiring all right-sized items to be of equal size
     - Disconnect in the Kanban community about using the item split rate to support probabilistic forecasting
     - If a team has no throughput on most days, probabilistic forecasting will be of low quality
  20. #NoEstimates
     Potential upsides
     - Split items as necessary, potentially into discovery items
     - Small batches are the goal
     - Forecasting using data - "running tested stories"
     - Accepts uncertainty and imperfect information
     - Useful for recurring forecasts
     - Low time investment
     - Seeks a mixture of item sizes
     Potential downsides
     - In the wrong hands, items get split into non-valuable items
     - People prefer to be wrong than uncertain
  21. So is estimation dead? Now, let's look at the disparity between sizing and how long work takes.
  22. "John, that's about ten minutes of work, but things are so crap around here, make that three days." Estimated effort has little to do with how long something takes.
  23. Variable quality with sizing an item
     Factors in how long things take:
     - The batch size - the level of effort actually needed
     - Waiting time
     - ...
     Sizing for the level of effort considers:
     - Complexity of the work
     - Riskiness of the work
     - Whether we did something similar before
     - Perception of the skill levels required to complete the work and the availability of those skills
     - Availability of tools and of skills in using those tools
     - If you're good, dependencies
  24. 100% resource utilization = 0% flow (Henrik Kniberg). Image courtesy of LeSS.works.
  25. If your forecasts are routinely correct, you're a freak of nature
     Forecasting is rarely perfect because:
     - Waiting time due to dependencies is a huge factor in how long work takes and is affected by many unpredictable events.
     - Even in straightforward work environments, people overestimate how efficiently their day will go.
     - In the pursuit of speed, people doing complex work often leave untidy and potentially embarrassing work behind them (accidental complication).
     - Complex work involves many unknown variables.
     - Lack of focus.
     - Changing priorities.
  26. Monte Carlo simulations model a future based on data and assumptions
  27. Monte Carlo simulations model a future based on data and assumptions
     - Forecasting, at its essence, is about risk management.
     - It answers the question: how much risk is contained in our current plans?
     - Lower quality forecasts also mean inadequate risk management.
     (A minimal Monte Carlo sketch follows below.)
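A minimal Monte Carlo sketch along these lines, not taken from the deck: resample historical weekly throughput against a guesstimated min/max range of remaining valuable items and read off completion dates at different likelihoods. The throughput history and item range below are invented sample data.

```python
# Minimal Monte Carlo "when might it be done?" sketch based on throughput data.
import random

random.seed(7)
weekly_throughput = [3, 5, 2, 6, 4, 3, 7, 4]   # items finished per week (invented history)
remaining_min, remaining_max = 25, 40          # guesstimated range of remaining valuable items
RUNS = 10_000


def weeks_to_finish() -> int:
    remaining = random.randint(remaining_min, remaining_max)
    weeks = 0
    while remaining > 0:
        remaining -= random.choice(weekly_throughput)  # resample past throughput
        weeks += 1
    return weeks


results = sorted(weeks_to_finish() for _ in range(RUNS))
for confidence in (0.50, 0.85, 0.95):
    print(f"{confidence:.0%} likelihood of finishing within {results[int(confidence * RUNS)]} weeks")
```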
  28. Options for managing expectations: estimation
     - Service Level Expectation based on an educated guess, e.g., 85% of right-sized items are done in 18 days or less
     - Individual item sizing - useful if you only need to focus on the next unstarted item
     - Guesstimate probabilistic item forecast - 90% guesstimate of the range of valuable items needed to deliver a goal, based on a guesstimated min/max range of valuable items
     - Probabilistic guesstimate story point forecast - 90% guesstimate of the range of story points needed to deliver a goal, based on a guesstimated min/max range
     - Story point range - 90% guesstimate of the range of story points needed to deliver a goal, using probabilistic forecasting based on a guesstimated min/max range
  29. Options for managing expectations: forecasting (a Service Level Expectation sketch appears after slide 30)
     - Service Level Expectation based on cycle time data, e.g., 85% of right-sized items are done in 18 days or less
     - Individual work item age - useful if you only need to focus on the next started but unfinished item
     - Data-driven probabilistic item forecast - 90% forecast of the range of valuable items needed to deliver a goal, based on throughput data for valuable items
     - Rolling wave forecast / throughput data range - 90% forecast of the range of items needed to deliver a goal, using throughput data
     - Throughput data average - best guess of the number of valuable items divided by average throughput (number of items done) per day/week/sprint/month...
     - Probabilistic story point forecast - 90% forecast of the range of story points needed to deliver a goal, using probabilistic forecasting based on story point data
     - Story point data average - best guess of the number of story points divided by the average number of points really done per day/week/sprint/month...
     - Counting subtasks - best guess of the number of non-valuable items divided by average throughput (number of non-valuable items done) per day/week/sprint/month...
  30. Options for managing expectations: the estimation options (slide 28) and the forecasting options (slide 29) shown side by side.
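The Service Level Expectation in both lists is simply a percentile of cycle times (or of an educated guess). Here is a small sketch of deriving an SLE from cycle time data; the cycle times are invented sample data.

```python
# Sketch of a Service Level Expectation (SLE) from cycle time data,
# e.g. "85% of right-sized items are done in N days or less".
def sle(cycle_times_days: list[int], confidence: float = 0.85) -> int:
    ordered = sorted(cycle_times_days)
    index = min(len(ordered) - 1, int(confidence * len(ordered)))
    return ordered[index]


history = [2, 3, 3, 4, 5, 5, 6, 7, 8, 9, 11, 12, 14, 16, 18, 21, 25]  # invented data
print(f"SLE: 85% of items finished in {sle(history)} days or less")
```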
  31. Better options
     Manage expectations about uncertainty, not dates:
     - We're using an empirical approach, operating one Sprint at a time.
     - The Sprint Goal is not even a guarantee.
     - The real answer is "we don't know, but let's start and learn quickly".
     - You might not even use Now?, Next??, Later???
     Being agile - don't manage expectations at all; let people go see:
     - Discover and deliver capabilities.
     - Review outcomes with the customers and end-users.
     - Learn what can be learned.
     - Act on what we have discovered.
  32. Key takeaways
     - Avoid: story points, counting non-valuable Product Backlog items, counting unDone work as Done, and the use of averages.
     - Consider: historical reference items, but beware of accidental complication.
     - Try: probabilistic forecasting based on counting valuable Product Backlog items delivered to Done.
     - Try: #NoEstimates and "rolling wave forecasts" of valuable Product Backlog items delivered to Done.
     - Promote: for complex work, managing expectations about uncertainty over managing expectations about dates.
  33. Techniques to challenge forecasts
     - Scenario planning
     - Threatcasting
     - Future Backwards
     - Ritual Dissent
     - Cynefin
     - Reality Tree
  34. Thank you
     John Coleman @JohnColemanIRL
     https://linktr.ee/johncolemanxagility
     https://www.infoq.com/articles/sizing-forecasting-scrum/
  35. Appendix
  36. About me
     - Agility chef, executive agility guide, product manager
     - #2 Agile Thinkers 360, Top 50 Agile Leaders (leadersHum)
     - Flight Levels Coach, ProKanban Professional Kanban Trainer, Scrum.org Professional Scrum Trainer, LeSS-Friendly Scrum Trainer
     - Author of Kanplexity™, underpinned by Cynefin®
     - Creator of Xagility™
     - Co-author of the Kanban Guide
     - Host of the Xagility™ and Agility Island podcasts
     - Organizer of the LeSS Baku Meetup group, which was active during COVID - an official Scrum.org community and an official LeSS meetup
  37. Ideal time
     Potential upsides
     - Time is what the customer wants
     - Simple to do
     Potential downsides
     - When was your last ideal day?
     - Does not include waiting time, the 90+% contributor to how long work takes
     - Doesn't help infer when the work might be done
     - Supports a people-utilization mindset
  38. Cost estimation
     Potential upsides
     - Time is what the customer wants
     - Simple to do
     - Estimating the number of Sprints could be useful, for example for commercial bids
     Potential downsides
     - Less useful for actionable Product Backlog items that would go into a Sprint
  39. Three-point (min, mid, max)
     Potential upsides
     - Reveals some of the uncertainty
     - Room for optimists and pessimists
     - Does not use averages
     - Can be used for the number of items, number of story points, ideal time, or reference items
     - Waiting time is included in our memory of how long work takes
     - Simple to do
     - Can be converted to story points
     Potential downsides
     - Average performance is often used against the min/max sizes for forecasting afterwards
     - Only for the team
     - Prone to inflation
     - Can be converted to story points ;)
     (A sampling sketch that keeps the ranges follows below.)
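One way to keep three-point estimates from collapsing into averages, sketched below and not from the deck, is to sample a triangular distribution per item and report percentiles of the simulated total. The per-item estimates (in days) are invented sample data.

```python
# Sketch: use three-point (min, mid, max) estimates as ranges, not averages.
import random

random.seed(3)
estimates = [(1, 2, 5), (2, 3, 8), (1, 1, 3), (3, 5, 13), (2, 4, 9)]  # (min, mid, max) in days
RUNS = 10_000

totals = sorted(
    sum(random.triangular(lo, hi, mid) for lo, mid, hi in estimates)  # triangular(low, high, mode)
    for _ in range(RUNS)
)
plan_from_averages = sum((lo + mid + hi) / 3 for lo, mid, hi in estimates)

print(f"plan from per-item averages: {plan_from_averages:.1f} days")
for confidence in (0.50, 0.85):
    print(f"{confidence:.0%} chance of finishing within {totals[int(confidence * RUNS)]:.1f} days")
```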
  40. Utilization of people. Credit: the LeSS Company - LeSS.works.
  41. Utilization of people. Credit: the LeSS Company - LeSS.works.
  42. Get stronger flow... without adding more people
     - Better to have slack than overwhelm, so people have time to help each other.
     - Split items into smaller but still valuable items when needed.
     - Show empathy within the workflow, but also upstream and downstream.
     - Look after aging: unblock, focus, finish or cancel, do ensemble work (a minimal aging-watchdog sketch follows below).
     - Don't forget to feed the system.
     - Lower aging => lower cycle times => after a time lag... more stable throughput... then higher throughput.
     - Prioritize within throughput; adjust for noise.
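"Looking after aging" can be as simple as comparing each in-progress item's age with the team's Service Level Expectation, as in this sketch (not from the deck); the item names, ages, and the 18-day SLE are invented.

```python
# Sketch of an aging watchdog: flag in-progress items approaching or beyond the SLE.
SLE_DAYS = 18
WARN_AT = 0.7  # start paying attention at 70% of the SLE

ages_in_days = {"checkout rewrite": 21, "search facets": 13, "audit export": 4}  # invented

for name, age in sorted(ages_in_days.items(), key=lambda kv: -kv[1]):
    if age > SLE_DAYS:
        print(f"{name}: {age}d old, beyond the {SLE_DAYS}d SLE - unblock, swarm, or cancel")
    elif age >= WARN_AT * SLE_DAYS:
        print(f"{name}: {age}d old, nearing the SLE - finish it before starting anything new")
```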
  43. Sizing is devalued by
     - Not attaching caveats to the start date, e.g., "nine weeks from the date we start"
     - Not recognizing the amount of work in progress and the progress (or not) of that work
     - The severity of impediments
     - Not ordering items higher up the Product Backlog according to delivery risk
     - A sub-optimal approach to handling dependencies
     - Confusing outputs with outcomes; a customer/end-user outcome is a change in customer/end-user behavior
     - Not engaging in discovery activities when the risk of not harvesting potential value is high, compounded by assuming that every item moves from discovery to delivery
     - Delusions of accuracy and the pursuit of ever more accuracy
  44. Other sub-optimal sizing trends
     - Size per skill - typically caused by a focus on resource efficiency over flow efficiency
     - Size inflation; in extreme cases, I refer to it as bingo
     - Not taking quality seriously - typically caused by pressure for more "velocity"
     - Not taking the customer seriously
     - Size normalization across teams
     - Counting complete but fake Product Backlog items - items that don't deliver value - as throughput
     - Not focusing, not finishing
     - Delusions of predictability for work that is uncomparable with work from the past
     - Lack of discovery to find the items we maybe should not build; if we run low-cost experiments, we might stumble upon better ideas
  45. Community opinions on Monte Carlo simulations
     Communities are not aligned on this approach:
     - One project is only executed once.
     - While probabilities may help inform decisions, the problem is that they don't make the decision any easier.
     - Estimation is often used as a proxy for a decision (should we do this project or not?).
     - The reasons for using estimates differ from those for probabilistic forecasting.
     - I have seen many probabilistic forecasts based on guesstimates and a lack of history, yet they were not far off in the end.
  46. Often there is another question behind "when will it be done?", such as:
     - How can I transfer worry to someone else?
     - What progress is being made?
     - What risks remain?
     - When will we get some return on this investment?
     - What trade-offs can we tolerate regarding which work can discover/deliver the potential value, e.g., the 80:20 rule?
     - What trade-offs can we tolerate in terms of reducing some or all of effectiveness, efficiency, and predictability, e.g., running some experiments?
     - What progress trade-offs can we tolerate in terms of required "dead work" to avoid execution bias, such as laboratory setup?
     - How much investment will go into acquiring skills, e.g., education or apprenticeship?
  47. Waiting time
     Reduced by
     - Working together
     - Leaving slack so people can help each other
     - Better visualization of how the work flows
     - Active management of work in progress
     - Rigor in flow review and improvement
     - Starting only when we have the capacity to start, alignment with our dependency partners, and alignment upstream and downstream
     Increased by
     - High utilization of people
     - Pushing work into the system before there is capacity for everyone who works on the item, including final review
     - Lower quality of in-progress queue management, e.g., re-prioritizing in-progress items based on potential value
     - Lower quality of dependency management / elimination
     - Management of the level of constrained-resource or shared-resource queues
  48. A meta-question of "what does winning the game mean?" is well worth considering.
     - Is the team being given a game it can win? And if the team can win, what are the odds? Probabilistic forecasts can help, e.g., Monte Carlo simulations.
     - Despite the hazards, people fear that stakeholders will make up arbitrary, fixed, undoable dates in a vacuum.
     - Sometimes teams want to attain a ballpark date range to get ahead of stakeholder expectations.
     - Interestingly, most of us can accept a weather forecast that gets updated regularly based on the latest information.
