Future Friday 201909

  1. WELCOME TO AGILE NXT FUTURE FRIDAY #agilenxt #futurefriday
  2. MEASURE THE IMPACT OF COACHING – USING METHODS FROM DATA SCIENCE Pieter Rijken
  3. Sports: coaches laid off in soccer 2010-2014 [bar chart comparing short-term and long-term results] Source: https://www.tussendelinies.nl/het-schokeffect-van-de-trainerswissel
  4. Apparently the coach was wrong! .....or not?? .....how can one tell??
  5. Douglas W. Hubbard Hard questions 1. How do you know if it [coaching] works? 2. Can someone [in the org] tell if it doesn’t work? 3. What is the impact if it doesn’t work?
  6. What is the impact if it doesn’t work? Known as iatrogenesis, “brought forth by the healer”. Image: By Power.corrupts - Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=4769885
  7. Any parallel with Agile coaching? ...or interventions in general? ...or ‘improvement’ actions?
  8. We need to compare against a baseline ...for example no coaching ...or no interventions
  9. Compared to the baseline, the effect of coaching can be 1. Worse 2. Similar 3. Better Perhaps coaching leads to worse results and no one notices! Or.....coaching works and the organization concludes it doesn’t!
  10. Virginia Satir Change Model
  11. Demand for Agile coaches since 2006 Source: https://www.itjobswatch.co.uk/jobs/uk/agile%20coach.do
  12. We need to know when... ...coaching / interventions work ...when they do not work ...who can tell us?
  13. State of Agile Report 2017 (VersionOne) • 65% have been using Agile practices for 3+ years • 84% at or below the ‘still maturing’ level • Agile practices are enabling greater adaptability to market conditions: only 4% • 98% report success in Agile projects • 74% indicate more than half of their Agile projects are successful • No mention of coaching effectiveness
  14. Organizational Behavior Management: behavior → actions → business results. Coaching works on behavior; business results are what we measure!
  15. What’s important to your customer? • Crisp Scrum check list • Kanban Litmus test Source: Essential Kanban Condensed Source: https://www.crisp.se/wp-content/uploads/2012/05/Scrum-checklist.pdf
  16. Services Black Box [Way of Working] • Number of customer requests (stories, features, products, releases, ...) fulfilled • Number of projects, marketing campaigns, a/b tests, ... Delivery Rate
  17. Delivery rate • important to the customer • as seen by the customer • very simple to measure • not too many data points needed to get started • look at the team / department / organization as a black box
  18. Incoming emails, Jan – Jul 2019, counted per week: 6 3 1 0 8 10 5 6 5 7 10 4 3 8 9 8 5 2 1 6 5 4 4 5 7 6 4 2 4 5 [bar chart: RECEIVED EMAILS JAN-JUL 2019, count vs. week number] • Pattern or totally random? • Average = 5.0 emails per week
  19. Pattern or totally random? • It seems reasonable to assume: • Independent (no conversations) • Constant average rate • Known as Poisson distribution
  20. Examples that are Poisson distributed • Number of emails received per day (or per week) • Number of cars passing a certain point • Number of trains passing a station • Number of goals in a soccer match • Number of phone calls per day • Number of customer requests completed per week • Very relaxed conditions!
  21. Let’s try!! Hmmmm....
  22. Uncertainty expressed with bars
  23. Summary • 153 emails in the period January – July 2019 • Grouped per week this gives 30 data points • In a histogram we are left with 10 data points • Average rate between 4.8 and 5.2 • Despite the variation, the rate is determined reasonably precisely (within 10%)
  24. What to watch out for • Assumptions of the model • independent • constant delivery rate • Bias (unconsciously) built into the model We will use this later on!
  25. Team Mars SCRUM • ...uses the Scrum framework with 2-week sprints • ...delivers user stories • ...part of a department of 6 (scrum) teams • ...consists of specialists with limited collaboration • ...does retrospectives every sprint Delivery Rate
  26. Application to team Mars Frequency diagram
  27. Observations • Data consistent with a single Poisson distribution • Stable for almost 1 year • Great for forecasting based on delivery rate of user stories!! • Despite the team doing retrospectives, the data shows no statistically significant improvements as seen from the outside
  28. Emails revisited: August added. Weekly counts for the four August weeks: 1 1 10 10 [bar chart: received emails per week, now covering weeks 1-34]
  29. Let’s try again!! Mathematicians have an indicator (statistic) for ‘how good’ the fit is: chi-squared per degree of freedom, which ideally equals ~1. It is 0.35 for the January – July data (too good to be true) and 2.8 for the January – August data (not so good). (A worked sketch follows the slide list.)
  30. What to look for in data [same chart as the previous slide]: (a) a change in delivery rate, or (b) a violation of independence
  31. Method • Use the delivery rate data • Assume a priori a model (e.g. a Poisson distribution) to correctly describe the data • Discover at which points in time this assumption is incorrect
  32. Team Jupiter Towards SCRUM • ...uses Scrum framework with 2-week sprints • ...delivers tasks • ...part of a larger tribe • ...consists of specialists • ...does retrospectives every sprint Delivery Rate Tasks Epics
  33. Histogram: the data does not look like a Poisson distribution! Indeed: chi-squared > 750K. [Binning]
  34. Histogram: periods of stability • January – May [rate of 3.5] • July – September [rate of 3.3] • October – December [rate of 1.2] Transition periods in May – July and December – January: start of a new organization structure; impactful changes after a ‘chaotic’ period
  35. Split in periods [histograms: full period Jul – May, and sub-periods Jul – Sep, Oct – Dec, Jan – May]
  36. Observations • Periods of change are recognized • Helps to choose sequential subsets of data without trending • In hindsight, transitions between periods correspond to major change events for the team • Periods of stability (no significant change) are long: on the order of months
  37. How long for change to be noticeable?
  38. Takeaways • Supports coaches (and scrum masters) in learning which interventions are effective and which are not • Suitability of delivery rate as an indicator of significant changes • Simple to use....you probably already have the data! • Be patient! • Beware of time delays in interventions • The tool used for this presentation is available online as a Jupyter Notebook: https://hub.docker.com/r/pietertje/chi2fit
  39. Recipe to get started tomorrow • Find a success criterion that matters to your customer • Decide how to measure it • Get data & model it • Make a hypothesis (what results do you expect and when do you expect them) • Use statistical tools/tests to determine whether a change is significant! • Test the hypothesis and learn from it!
  40. THANK YOU! WHAT IS YOUR NEXT STEP IN AGILITY? powered by
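As a companion to the Poisson and chi-squared slides above, here is a minimal Python sketch (not the author's chi2fit notebook; the helper name and binning choice are mine) of judging a Poisson fit by chi-squared per degree of freedom. It uses the weekly e-mail counts shown on the slides; the exact values depend on binning and fitting details, so they will not match the 0.35 and 2.8 from the slides, but the jump when August is added shows up the same way.

```python
import numpy as np
from scipy import stats

# Weekly e-mail counts from the slides: Jan-Jul 2019, plus the four August weeks.
jan_jul = np.array([6, 3, 1, 0, 8, 10, 5, 6, 5, 7, 10, 4, 3, 8, 9,
                    8, 5, 2, 1, 6, 5, 4, 4, 5, 7, 6, 4, 2, 4, 5])
august = np.array([1, 1, 10, 10])

def chi2_per_dof(counts, max_k=12):
    """Chi-squared per degree of freedom of a Poisson fit to weekly counts."""
    lam = counts.mean()                           # maximum-likelihood estimate of the rate
    observed = np.bincount(counts, minlength=max_k + 1)[:max_k + 1]
    expected = stats.poisson.pmf(np.arange(max_k + 1), lam) * len(counts)
    keep = expected > 0.5                         # drop bins with tiny expected counts
    chi2 = np.sum((observed[keep] - expected[keep]) ** 2 / expected[keep])
    dof = keep.sum() - 1 - 1                      # bins minus 1, minus 1 fitted parameter
    return chi2 / dof

print("Jan-Jul :", round(chi2_per_dof(jan_jul), 2))
print("Jan-Aug :", round(chi2_per_dof(np.concatenate([jan_jul, august])), 2))
```

For the January – July data the statistic comes out close to 1, and it becomes several times larger once August is added: the signal of slide 30, meaning either the delivery rate changed or the independence assumption was violated.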

Editor's notes

  1. Hand-out: link to the tool. Overview: how you do this. Contents: the most important slides with the main points, ‘screenshots’ of the calculations so you can do it yourself, rules of thumb and guidelines, links to tools, and links for further background.
  2. Agile is all about focusing on creating value for the customer in a sustainable way. Actions that lead to business results and happier customers are a consequence of the behaviour of people. Agile coaching supports this by providing insights to people and the organization so they can choose what behaviour to change and how. This new behaviour will lead to improved business results and satisfied customers, or to a more sustainable way, for the organisation, to achieve those business results. How effective is the coaching, and does it ultimately lead to improved business results? In this session Pieter demonstrates one way of linking team actions to the observed change in results as seen by the customer, using data and methods taken from data science.
  3. 213 soccer coaches were laid off between 2010 and 2014. In the short term, comparing the results of the 5 games before and after the change of coach, the team on average scored 3 points more. However, comparing the 20 games before and after, teams score, on average, 4 more points.
  4. Apparently the soccer clubs concluded that the coach was to blame for the bad results. The long-term results show that this picture is too naïve.
  5. In his book ‘The failure of risk management’ Douglas W. Hubbard asks 3 questions. Let’s apply these same questions in the context of coaching/intervention methods.
  6. Harm (deaths in childbed) done with good intentions (the introduction of pathological anatomy) in a hospital in Vienna during 1824-1848.
  7. Is there a parallel with Agile coaching or interventions in general? Are they all harmless, or could they do damage? Possible ‘harm’ caused by interventions (and agile coaching): delayed change initiatives causing loss of revenue, a ‘longer’ J-curve, loss of trust among the people in the organization, talent moving to other organizations.
  8. To know the effect of coaching/interventions we need to compare to a baseline. One such baseline might be: doing nothing.
  9. Douglas Hubbard mentions 5 possible outcomes; for simplicity we use 3.
  10. J-curve => wrong/bad interventions, or too many interventions at once, cause delays in reaching the break-even point
  11. Demand for Agile coaching in the UK has increased dramatically since 2012-2014. So we had better know whether it works and leads to improvements that make organizations fitter for their purpose.
  12. 65% have used Agile practices for 3+ years ==> still only 4% business agility. 84% are still maturing their practices. Yet 98% report success in agile projects. No mention of coaching effectiveness. WE NEED A GOOD MEASURE OF SUCCESS.
  13. In OBM (very short intro): behaviour leads to actions, which lead to business results. The latter we can measure. Coaching and interventions (including retrospective actions, for example) result in changed behaviour. Therefore, coaching leads to a change in business results and is measurable.
  14. Let’s look at the service a team / department / organization offers to its customers, and at what is important to your customer. For the purpose of this presentation we will stick to ‘delivery rate’, but any other relevant performance indicator (or combination of indicators) can be used. Preference is for variables that are important to your customer; this also makes the approach less sensitive to gaming.
  15. Mail per week (week range => count; monthly total in brackets):
      Jan: 1-7 => 6, 8-14 => 3, 15-21 => 1, 22-28 => 0 [10]
      Feb: 29-4 => 8, 5-11 => 10, 12-18 => 5, 19-25 => 6 [29]
      Mar: 26-4 => 5, 5-11 => 7, 12-18 => 10, 19-25 => 4 [26]
      Apr: 26-1 => 3, 2-8 => 8, 9-15 => 9, 16-22 => 8, 23-29 => 5 [33]
      May: 30-6 => 2, 7-13 => 1, 14-20 => 6, 21-27 => 5 [14]
      Jun: 28-3 => 4, 4-10 => 4, 11-17 => 5, 18-24 => 7 [20]
      Jul: 25-1 => 6, 2-8 => 4, 9-15 => 2, 16-22 => 4, 23-29 => 5 [21]
      Aug: 30-5 => 1, 6-12 => 1, 13-19 => 10, 20-26 => 10 [22]
      Sep: 27-2 => 6, 3-9 => 2+
      Rate = (98 + 77)/8 = 175/8 ≈ 22 per month = 5.5 per week = 0.8 per calendar day = 1.1 per working day
      [rough ASCII tally of the weekly counts]
  16. These are all just as likely as the distribution of the emails we actually measured. These variations, all corresponding to histograms that are consistent with originating from the same underlying probability distribution, are expressed as error bars. See next slide.
  17. [Left picture] The curve passes through the error bars. [Right picture] The second thing to notice is the relatively small range of possible Poisson distributions. Although we have a small data set (10 data points in the histogram), we have added information by assuming (a) independence and (b) a constant delivery rate. This restricts the possible Poisson distributions that are consistent with the data.
  18. Data taken from the 2-weekly Sprint Reviews: the number of user stories accepted by the product owner. The red numbers in the blue area above correspond to the number of backlog items completed and accepted by the product owner. The entire data set (as presented in the histogram) is described by a Poisson distribution. This implies that over the entire period October 2016 – September 2017 the delivery rate has not changed significantly.
  19. Data for the month August has been added.
  20. For the goodness of fit we use the chi-squared statistic. The data including the month of August is not well described by a Poisson distribution: compared to the January – July data set, the chi-squared statistic jumps from 0.35 to 2.8 when August is included. A sign that the delivery rate has probably changed!
  21. Binning is an important tool for data analysis before fitting the data to the model.
  22. Splitting the histogram for the entire time period (top histogram), we obtain 3 histograms that each resemble a Poisson distribution more closely than the top one does. (A rough sketch of this splitting step follows these notes.)
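The notes above describe splitting the delivery-rate series into stable periods. Below is a rough sketch, again not the chi2fit implementation, of one way to locate a candidate split point: under equal Poisson rates, the total count of the first period, conditional on the overall total, is binomially distributed, which gives a standard two-sample rate test. The weekly series here is synthetic (rates of roughly 3.5 and 1.2, echoing the team Jupiter numbers), purely for illustration.

```python
import numpy as np
from scipy import stats

def same_rate_pvalue(counts_a, counts_b):
    """p-value for 'both periods have the same Poisson rate' (conditional binomial test)."""
    k = int(counts_a.sum())
    n = int(counts_a.sum() + counts_b.sum())
    frac = len(counts_a) / (len(counts_a) + len(counts_b))   # share of weeks in period A
    return stats.binomtest(k, n, frac).pvalue

def best_split(counts, min_weeks=6):
    """Scan candidate split weeks and return the one with the smallest p-value."""
    candidates = [(w, same_rate_pvalue(counts[:w], counts[w:]))
                  for w in range(min_weeks, len(counts) - min_weeks + 1)]
    return min(candidates, key=lambda wp: wp[1])

# Synthetic weekly delivery counts (NOT the real team data): ~3.5/week, then ~1.2/week.
rng = np.random.default_rng(1)
weekly = np.concatenate([rng.poisson(3.5, 20), rng.poisson(1.2, 13)])

week, p = best_split(weekly)
print(f"most likely rate change after week {week} (p = {p:.3g})")
```

Scanning many split points inflates significance, so treat the p-value as indicative only; in practice one would test splits suggested by known change events (reorganisations, way-of-working changes) and confirm each resulting period with a Poisson fit as in the earlier sketch.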