Learning From Incidents at Autotrader


Incident Reviews for a Learning Organisation. We all aspire to have a culture of learning and continuous improvement in our teams and organisations, but learning and improving when things go wrong is far from easy. When dealing with the fallout from failure (incident reviews, incident reports, investigations, etc.), the way in which we respond is crucial to improving the safety and performance of our organisations. Andy will talk about how Major Incident Reviews are run in IT Operations at Auto Trader. He’ll discuss what works well for them and will bring together practical advice from industry experts for creating a culture of safety and learning. Andy will also cover what mistakes they’ve made, what to avoid and the factors that can prevent learning.



1. Learning from Incidents at Auto Trader @4ndyHumphrey

2. Learning from Failure at Auto Trader @4ndyHumphrey

3. Learning from Incidents
• What is a Learning Organisation?
• What is the Reality?
• What are my Choices?
• Incident Reviews - things to Avoid
• Incident Reviews - things to Encourage
• What about holding people to Account?
• A bit on Our process

4. Our Customers
• Private Car Sellers: 30,000
• Trade Car Dealers: 15,000
Our People
• Auto Trader Staff: 850
• Product & Tech Teams: 275

5. Our Technology Platform
• 1.2 billion page views per month
• 70 million peak page views per day
• 15 million unique visitors per month
• Supported by 100 live applications

6. Further Reading up front
Links:
• John Allspaw - The Infinite Hows
• Steve Shorrock - If It Weren’t for the People
• EuroControl - Systems Thinking for Safety
• Lindsay Holmwood - Blame-Language-Sharing
• Sidney Dekker - Just Culture
Books:
• Black Box Thinking – Matthew Syed
• Field Guide to Understanding Human Error – Sidney Dekker
• Beyond Blame – Dave Zwieback
• Engineering a Safer World – Nancy Leveson
People:
• Steven Shorrock
• Erik Hollnagel
• Sidney Dekker
• Matthew Syed
• John Allspaw
• Lindsay Holmwood
• Dave Zwieback
• Nancy Leveson

7. What is a Learning Organisation?

8. The Loom

9. A Learning Organisation

10. Why should I want to learn?
• Moral Responsibility
• Job Satisfaction
• Economic Imperative

11. What’s the reality?

12. Blame management

13. Blame - Fundamental Attribution Error

14. Blame - Justice

15. Blame - Hindsight

16. Blame – Bad Apple Theory

17. Blame – Ignoring context Jonathan Caramanus/Green Renaissance/wwf.org.uk

18. Blame - It’s Easy

19. What are my choices?

20. Things will always go wrong https://www.youtube.com/watch?v=EvegBo4TUdQ

21. You can blame people…

22. Or say it’s a one off…

23. Or you can look at the context…

24. …Learn and make changes

25. But it is a choice:
“Blame is the enemy of safety…” (Nancy Leveson)
“Whenever there is fear, you will get wrong figures.” (W. Edwards Deming)

26. Incident Reviews: Things to avoid

27. Culture of fear

28. Top down

29. Asking Why?

30. Questioning styles: Dilts Model
• Environment: Contexts – WHERE?
• Capabilities: Methods, Approaches – HOW?
• Behavior: Skills and Actions – WHAT?
• Values and Beliefs: Motivation and permission – WHY?
• Identity: Sense of Self, Role – WHO?

31. Don’t go too Deep! (Dilts Model)
• Environment: Contexts – WHERE?
• Capabilities: Methods, Approaches – HOW?
• Behavior: Skills and Actions – WHAT?
• Values and Beliefs: What is important/true – WHY?
• Identity: Sense of Self – WHO?

32. Single Root Cause

33. Points scoring

34. Incident Reviews: How to encourage learning

35. Priming

36. Keep an open mind

37. Explore how events unfolded

38. Incident Review Prompts (from The Field Guide To Understanding Human Error, by Sidney Dekker)
At each juncture in the sequence of events (if that is how you want to structure this part of the accident story), you want to get to know:
• Which cues were observed (what did he or she notice/see or did not notice what he or she had expected to notice?)
• What knowledge was used to deal with the situation? Did participants have any experience with similar situations that was useful in dealing with this one?
• What expectations did participants have about how things were going to develop, and what options did they think they have to influence the course of events?
• How did other influences (operational or organizational) help determine how they interpreted the situation and how they would act?
Here are some questions Gary Klein and his researchers typically ask to find out how the situation looked to people on the inside at each of the critical junctures:
Cues:
• What were you seeing?
• What were you focusing on?
• What were you expecting to happen?
Interpretation:
• If you had to describe the situation to your colleague at that point, what would you have told?
Errors:
• What mistakes (for example in interpretation) were likely at this point?
Previous experience/knowledge:
• Were you reminded of any previous experience?
• Did this situation fit a standard scenario?
• Were you trained to deal with this situation?
• Were there any rules that applied clearly here?
• Did any other sources of knowledge suggest what to do?
Goals:
• What were you trying to achieve?
• Were there multiple goals at the same time?
• Was there time pressure or other limitations on what you could do?
Taking action:
• How did you judge you could influence the course of events?
• Did you discuss or mentally imagine a number of options or did you know straight away what to do?
Outcome:
• Did the outcome fit your expectation?
• Did you have to update your assessment of the situation?
Communications:
• What communication medium(s) did you prefer to use (phone, chat, email, video conf, etc.)?
• Did you make use of more than one communication channel at once?
Help:
• Did you ask anyone for help?
• What signal brought you to ask for support or assistance?
• Were you able to contact the people you needed to contact?
Debriefings need not follow such a scripted set of questions, of course, as the relevance of questions depends on the event. Also, the questions can come across to participants as too conceptual to make any sense. You may need to reformulate them in the language of the domain.

39. Timelines
1. Factual timeline entries can be filled in prior to the Review Meeting
• 14:00 Alert received from Site Confidence
• 15:15 Incident communication sent
• 16:00 Incident closure comms sent

40. Timelines
1. Factual timeline entries can be filled in prior to the Review Meeting
• 14:00 Alert received from Site Confidence
• 15:15 Incident communication sent
• 16:00 Incident closure comms sent
2. As a group, overlay the basic timeline with key decisions and junctures
• 13:10 Slow server performance observed by Bill
• 14:20 Bill spoke to John about SC issues and decided to recover DB
• 15:50 John finished DB recovery
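
The two-step timeline on this slide can be sketched in a few lines of Python. This is only an illustration, not Auto Trader's tooling: the `Entry` type, the `overlay` function and the `factual`/`overlay` labels are hypothetical names chosen here, and the entries are the example ones from the slide.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Entry:
    at: time    # time of day the event happened
    text: str   # what happened
    kind: str   # "factual" (filled in before the review) or "overlay" (added by the group)


def overlay(factual, group_entries):
    """Merge both sets of entries into one timeline ordered by time of day."""
    return sorted([*factual, *group_entries], key=lambda e: e.at)


# Step 1: factual entries, filled in prior to the review meeting.
factual = [
    Entry(time(14, 0), "Alert received from Site Confidence", "factual"),
    Entry(time(15, 15), "Incident communication sent", "factual"),
    Entry(time(16, 0), "Incident closure comms sent", "factual"),
]

# Step 2: key decisions and junctures added by the group during the review.
group_entries = [
    Entry(time(13, 10), "Slow server performance observed by Bill", "overlay"),
    Entry(time(14, 20), "Bill spoke to John about SC issues and decided to recover DB", "overlay"),
    Entry(time(15, 50), "John finished DB recovery", "overlay"),
]

merged = overlay(factual, group_entries)
for entry in merged:
    print(f"{entry.at:%H:%M} [{entry.kind}] {entry.text}")
```

Keeping the `kind` label on each entry preserves the distinction between what was known from records and what the group reconstructed together, which is the point of doing the overlay as a second step.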

41. One conversation

42. Actions

43. Impartial facilitator

44. Investigate what went well

45. Practice – make it habit

46. What about holding people to account?

47. Accountability

48. Our process:

49. Our Process
• Major Incidents
• High Severity Incidents
• Failed Releases (all)
• Failed Changes (Large)

50. Priming – Timeline - Actions

51. Why we are here:
• We understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand.
• We are here to learn and find solutions to improve our ways of working.

52. Things that help us learn
• Open Minded
• Go back in time
• No single ‘Root Cause’
• How not Why

53. Things that stop us learning:
• Blaming people
• Human Error
• Arse Covering
• Points scoring
• ‘Trying Harder’
• Talking over people

54. After the review:
• Incident details recorded
• Actions (owners, dates) recorded
• Owned by Service Management Team

55. Further Reading Again (same links, people and books as slide 6)

56. Questions?

Editor's notes

Private Sellers: us selling our cars. Trade Car Dealers (15,000): independent dealers, franchise dealers, car supermarkets.
Availability at 99.99%, supporting products for Consumers, Private Sellers and Trade Retailers. Supporting access across multiple platforms. Supporting Commercial and International Autotrader sites, e.g. Dealer Websites: automotive leader for dealer websites, with just under 5000 dealers’ sites hosted.
Peter Senge, The Fifth Discipline (1990). Learning and transformation are central functions of the organisation: always changing, never in a steady state. A Learning Organisation is a term given to a company that facilitates the learning of its members and continuously transforms itself. A learning organisation is a place where people are continually discovering how they create their reality. “The loss of the stable state means that our society and all of its institutions are in continuous processes of transformation. We cannot expect new stable states that will endure for our own lifetimes. We must learn to understand, guide, influence and manage these transformations. We must make the capacity for undertaking them integral to ourselves and to our institutions. We must, in other words, become adept at learning. We must become able not only to transform our institutions, in response to changing situations and requirements; we must invent and develop institutions which are ‘learning systems’, that is to say, systems capable of bringing about their own continuing transformation.” (Schon 1973: 28) http://infed.org/mobi/the-learning-organization/
A story from Toyota’s origins when it used to build automatic looms. Upon hearing that the plans for one of the looms had been stolen, Kiichiro Toyoda is said to have remarked: Certainly the thieves may be able to follow the design plans and produce a loom. But we are modifying and improving our looms every day. So by the time the thieves have produced a loom from the plans they stole, we will have already advanced well beyond that point. And because they do not have the expertise gained from the failures it took to produce the original, they will waste a great deal more time than us as they move to improve their loom. We need not be concerned about what happened. We need only continue as always, making our improvements. The long-term value of an enterprise is not captured by the value of its products and intellectual property but rather by its ability to continuously increase the value it provides to customers—and to create new customers—through Learning and innovation. (Lean Enterprise p18)
And we all do this within our organisations, right? ITIL – continuous improvement. Deming Cycle – PDCA. OODA. DMAIC – Six Sigma / Lean process improvement.
Our attitudes, culture and behavior prevent learning
WHY DO WE DO IT??? Fundamental Attribution Error: how do we explain the behavior of others? It turns out we are biased. We tend to explain the behavior of others as due to their personality, and our own behavior as a result of context. We need to overcome this bias to learn from incidents and other kinds of failure. Image: http://www.ffxiah.com/forum/topic/26676/fundamental-attribution-error
WHY DO WE DO IT??? We assume that: all accidents or incidents require a human mistake; the severity of the accident is proportional to the size of the mistake; punishment acts as a deterrent to prevent issues happening in the future. Need for retributive justice: punishment, deterrent. We often diminish the need for restorative justice: preventing the issue happening again.
WHY DO WE DO IT??? Hindsight bias (the knew-it-all-along effect) is the inclination, after an event has occurred, to see the event as having been predictable, despite there having been little or no objective basis for predicting it. Narrative written after the fact. Does not make sense. Example: England football manager Fabio Capello, from Black Box Thinking – Matthew Syed. He was in English football from 2008 to 2012. He introduced a strict regime of diet, rules around lateness, and bans for family members from training and tournaments. He was pretty successful and lots of commentators put this down to
Retributive vs. Restorative Justice. This table illustrates the differences in the approach to justice between Retributive Justice and Restorative Justice. As you will see, Restorative Justice is much more community centric and focuses on making the victim whole.
Retributive Justice | Restorative Justice
Crime is an act against the state, a violation of a law, an abstract idea | Crime is an act against another person and the community
The criminal justice system controls crime | Crime control lies primarily in the community
Offender accountability defined as taking punishment | Accountability defined as assuming responsibility and taking action to repair harm
Crime is an individual act with individual responsibility | Crime has both individual and social dimensions of responsibility
Punishment is effective: threats of punishment deter crime; punishment changes behavior | Punishment alone is not effective in changing behavior and is disruptive to community harmony and good relationships
Victims are peripheral to the process | Victims are central to the process of resolving a crime
The offender is defined by deficits | The offender is defined by capacity to make reparation
Focus on establishing blame or guilt, on the past (did he/she do it?) | Focus on the problem solving, on liabilities/obligations, on the future (what should be done?)
Emphasis on adversarial relationship | Emphasis on dialogue and negotiation
Imposition of pain to punish and deter/prevent | Restitution as a means of restoring both parties; goal of reconciliation/restoration
Community on sideline, represented abstractly by state | Community as facilitator in restorative process
Response focused on offender’s past behavior | Response focused on harmful consequences of offender’s behavior; emphasis is on the future
Dependence upon proxy professionals | Direct involvement by participants
WHY DO WE DO IT??? Bad Apple Theory: Complacency. We assume that systems and procedures are safe and reliable. It’s only a few ‘bad apples’. Steve Shorrock (http://radar.oreilly.com/2014/11/if-it-werent-for-the-people.html): Our view is often that the system is basically safe, so long as the human works as imagined. When things go wrong, we have a seemingly innate human tendency to blame the person at the sharp end. We don’t seem to think of that someone – pilot, controller, train driver or surgeon – as a human being who goes to work to ensure things go right in a messy, complex, demanding and uncertain environment.
Work as imagined vs work as done. We don’t understand: trade-offs, competing pressures, conflicting incentives, procedures adapted for the real world.
Blame is easy. It removes accountability from the organisation. We don’t need to consider organizational changes or system changes (difficult things). It removes the need for self-criticism. It’s cheap. It’s quick.
Miss Universe 2015. Steve Harvey, a veteran TV presenter in America, announced the winner as Miss Colombia and not Miss Philippines.
It’s just one of those things that happen Human Error https://www.linkedin.com/pulse/how-bad-design-wrecked-steve-harveys-universe-eric-Thomas
Lights, Sounds What was on the card? What was on the teleprompter?
Blame impact: fewer issues reported; a culture of fear; less responsibility taken (safety is someone else’s responsibility to implement); the wrong data (incorrect accounts); dishonesty and distortion; denying error, diminishing the impact.
Often learnt behavior from leaders. Will prevent learning. John Allspaw – Blameless Postmortem – Web Operations. We need to find ways to allow practitioners to tell their stories: without fear that there will be retribution; in a supportive atmosphere where failure is not stigmatized; where we regularly talk about (celebrate) our mistakes and take ownership of improving things. This cycle of name/blame/shame can be looked at like this:
1. Engineer takes action and contributes to a failure or incident.
2. Engineer is punished, shamed, blamed, or retrained.
3. Reduced trust between engineers on the ground (the “sharp end”) and management (the “blunt end”) looking for someone to scapegoat.
4. Engineers become silent on details about actions/situations/observations, resulting in “Cover-Your-Ass” engineering (from fear of punishment).
5. Management becomes less aware and informed on how work is being performed day to day, and engineers become less educated on lurking or latent conditions for failure due to silence mentioned in #4, above.
6. Errors more likely, latent conditions can’t be identified due to #5, above.
7. Repeat from step 1.
Need a wide range of review attendees taking actions. Trust between engineers taking action (sharp end) and managers (blunt end). Especially important to share these actions across teams, departments and disciplines. Things to avoid: managers taking no action, or all the actions!
Refer to John Allspaw – The Infinite Hows
Dilts Model – logical levels – levels of learning and change. Useful as a coaching aim. How the language you use can affect the impact and depth to which you get a response. Asking Who and Why really probes deep through these logical levels. John Allspaw – The Infinite Hows. 1. A new release disabled a feature for some customers. WHY? Because a particular server failed. 2. Environment – Contexts. Behavior – Skills and Actions. Capabilities – Methods, Approaches, Strategies. Values and Beliefs – What is important and true. Identity – Your sense of self.
‘Why’ asks people to justify their actions. Leads to ‘Who’.
No single root cause in any incident involving complex systems (that is, all our incidents)
Cherry picking of data to prove pre-existing ideas about what happened. WHAT YOU FIND IS WHAT YOU LOOK FOR Points scoring: It’s easy to use examples of when things go wrong to prove a point or win battles with others. This is generally cherry picking of information and unhelpful to us as an organisation. If unchallenged it will lead to more defensiveness, hiding/manipulation of data etc.
Good psychological effect. States what’s expected. Frames the conversation. Example from Matthew Syed again: priming experiment and walking the corridor. Good example: Agile Prime Directive. "Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand."
Open Minded: Everyone is expected to come to an incident review keen to learn new information and listen to the experiences and stories of their colleagues. It’s not acceptable to bring your pre-formulated, rigid ideas of what happened/causes/solutions. Explore differences of opinion. Listen to people’s stories of how events unfolded.
Focus on going back in time. Understand the nature of the events AS THEY UNFOLDED over time. Are we talking about ordinary routine work? A special event? Something never seen before? Consider the predictability of the event at the time: what was known at the time.
Go back in time. What information was available to you at this point? What cues, what alerts, what information was available? What other pressures did you have? Time pressure, multiple focuses.
Actual vs Ideal: The timeline should probably be created by the people attending the incident review, but we’ve amended this so that a ‘factual’ bare bones of the timeline is pre-populated by the Incident owner prior to the meeting to save time. The Duty Manager can collate this data from Chat, Logs, Emails etc. When adding information as a group about decisions made, explore the differences between people’s perceptions of what happened. Be careful not to get trapped into a ‘single root cause’ and listen to as many contributing factors as possible.
Ensure everyone gets a chance to speak and be listened to by everyone. Need to keep the whole room to one conversation.
Actions: shared, visible, completed. Don’t always have to have an action! It might be that understanding how colleagues dealt with the incident and learning more about normal working of your organisation is enough.
Are you the right person to run the incident review? Are you seen as impartial? I’ve done this! Give example of not doing this right. Are you independent enough to be fair and impartial, and to be seen by others as such?
Celebrate the things that went well (the timeline shows how well the response unfolded, and can report on how well people worked together). Not just a pat on the back. Analyse the things that worked well: good decisions that prevented more downtime, how people adapted what they knew to a new situation. How can the good patterns be replicated or enhanced even further? If you truly understand what went well and HOW it went well, you can reproduce this in more situations.
Retrospectives / Reviews: If you only review the most serious of incidents, you will not get the atmosphere right and people will not be used to it. People will be defensive.
So this is all great, but what about when people need to be held responsible for their actions? Good PDF: http://www.saa.com.sg/saaWeb2011/export/sites/saa/en/Publication/downloads/JustCulture_ReportingtheLine_Accountability.pdf Negligence (turning up to work drunk). Malicious damage (intentionally trying to harm people or the organisation). Incompetence (making stupid mistakes, not following clear procedures). Gross misconduct. For example, as defined in a nursing malpractice situation, negligence means the following: "The doing of something which a reasonably prudent person would not do, or the failure to do something which a reasonably prudent person would do, under circumstances similar to those shown by the evidence." http://ccn.aacnjournals.org/content/23/5/72.full http://www.saa.com.sg/saaWeb2011/export/sites/saa/en/Publication/downloads/JustCulture_ReportingtheLine_Accountability.pdf Accountability is often interpreted as blaming practitioners for mistakes. This creates a conflict between learning and accountability. This paper proposes three simultaneous directions to achieve a Just Culture: a) not using incident reports as evidence for disciplinary action, b) deciding and getting broad support for who gets to decide what is acceptable and unacceptable behaviour, and c) switching from blame and backward-looking accountability to forward-looking accountability.
We have a poor view of accountability. BLAME does not equal accountability. Blame has a massive cost. Blame limits accountability. We need to encourage forward-looking accountability: a) not using incident reports as evidence for disciplinary action, b) deciding and getting broad support for who gets to decide what is acceptable and unacceptable behaviour, and c) switching from blame and backward-looking accountability to forward-looking accountability. Cost: The fear of blame, sanction and punishment, however, is known to change the behaviour of practitioners. They might be induced to hide, downplay or redefine incidents, rather than reporting and sharing them (Merry & Smith, 2001), creating a culture of ‘risk secrecy.’ The possibility of disciplinary action (or worse, prosecution) creates a conflict between accountability and learning. Blame is known as the enemy of safety (Leveson, 2011). Forward-looking accountability needs an environment which encourages sharing accounts and takes away the idea of blame.
All Major Incidents and High Severity Incidents have a review. All failed large changes have a review (including things that should have been large). All failed Releases have a review. We use a similar format for Team / Project retrospectives, certainly in atmosphere.
Timeline written on the wall. Paper prompts at the top are the ‘priming’ bit and are read out at the start of the meeting. They are also emailed with the Incident Review invite.
State what we are here for: State our approach: A note from Martin Fowler on PRIMING http://martinfowler.com/bliki/PrimingPrimeDirective.html
Open Minded: Everyone is expected to come to an incident review keen to learn new information and listen to the experiences and stories of their colleagues. It’s not acceptable to bring your pre-formulated, rigid ideas of what happened/causes/solutions.
Go back in time: It’s critical we avoid using HINDSIGHT. We need to understand what information was available at the time when decisions were made and actions were performed. The best way to do this is to put yourself back in time, into the context of how things unfolded.
No single ‘Root Cause’: In complex systems (of people interacting with technology) there is never a single root cause. We often have many contributing factors to how events unfold, and lots of those contributing factors are present all the time, even when things go right. Stopping at one root cause will miss all this information.
How not Why: Questions that start WHY (or even worse WHO) tend to force people to justify actions, to attribute and apportion blame. WHY focuses the inquiry on people, which is not what we want. We want to gather information about how events unfolded: asking HOW things appeared, changed, worked, WHAT happened next, WHAT was expected. Questions around HOW, WHAT, WHEN are much more effective for this.
Blaming people: It’s a popular belief that we have basically safe systems and if you just sorted out the behaviour of a few ‘bad apples’ things would be OK. That’s not the way to improve safety and is generally a cop-out. Please see Agile Prime Directive for what we do believe.
Human Error: As above, we are all on the same side trying to do the best job we can. Everyone has variable performance and we need systems/processes etc. that are better able to accommodate and expect that.
Arse Covering: Hiding information that could help us improve as an organisation would be a terrible symptom of something that is wrong with our company culture. We need to make every effort to remove fear of judgement/consequences etc. from Incident Reviews. We need everyone to be open and honest.
Points scoring: It’s easy to use examples of when things go wrong to prove a point or win battles with others. This is generally cherry picking of information and unhelpful to us as an organisation. If unchallenged it will lead to more defensiveness, hiding/manipulation of data etc.
‘Trying Harder’: We will never take an action to ‘be more careful’ or ‘try harder’ not to break things. We all try pretty damn hard already and that is never the solution we are looking for.
Talking over people: We can only have one conversation at a time if we are to get a shared understanding of what’s happened and what we can do about it.
