Mistakes were made
                             Selena Deckelmann
                         selena@primeradiant.com
                         Twitter/IRC: @selenamarie
Failure
LCA 2012
“Prevention”
         “Risk management”
          “Risk mitigation”
           “MTBF, MTTR”
        “Success Engineering”
Plan for the worst.
        Minimize risk.
        Fail.
        Recover, gracefully.
“We don’t need a risk
      management plan,” he
      emphatically stated, “because this
      project can’t be allowed to fail.”
      - Jim Highsmith,
      http://jimhighsmith.com/2012/01/09/can-do-thinking-makes-risk-management-impossible/
Failure is an option.
SCIENCE
Dr. Jerker Denrell 
"I think getting two accidents
        of this type at the same time
            is a freak occurrence."
             -David Cunliffe, NZ Communications Minister
“Further damage was incurred
            on Tuesday afternoon and our
            engineers returned to repair
            the damage,” said Virgin Media.
Plan for when things fail.
Tales of failure to...
                      Document
                      Test
                      Verify
                      Imagine
                      Implement
Failure to document.
Moving Day




                    Thanks, David Prior!
Prevent documentation failures.

                      • Write documentation.
                      • Update documentation.
                      • Make documenting a step in your written process.
                      • Assign a fixed amount of time to that step.
Documentation tools

                      • Graphic designers. (Pretty wikis. Pretty docs. (Sphinx?) Diagrams.)
                      • Timelines.
                      • Bug tracking.
                      • Ordered todo lists.
Failure to test.
“My first day posing as a sysadmin
        (~1990, no previous training....) I
        deleted all zero length files on a Sun
        workstation.”
Prevent testing failures.

                      • Verify success criteria.
                      • Write tests.
                      • Test with a buddy.
                      • Have a plan.
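
One way to act on these points, sketched in Python: a cleanup in the spirit of the zero-length-file story that lists its targets first and deletes only when asked. The function names, the `dry_run` flag, and the file names in `demo` are hypothetical and not from the talk; this is a minimal illustration, not a recommended tool.

```python
import os
import tempfile


def find_zero_length_files(root):
    """Return sorted paths of regular, non-symlink files under root that are 0 bytes."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # ignore symlinks and anything that is not a regular file
            if os.path.getsize(path) == 0:
                matches.append(path)
    return sorted(matches)


def delete_files(paths, dry_run=True):
    """Remove paths only when dry_run is False; return the list either way."""
    for path in paths:
        if not dry_run:
            os.remove(path)
    return list(paths)


def demo():
    """Exercise the dry run and the real run in a throwaway directory."""
    with tempfile.TemporaryDirectory() as root:
        empty = os.path.join(root, "lockfile")   # hypothetical empty file
        full = os.path.join(root, "notes.txt")   # hypothetical non-empty file
        open(empty, "w").close()
        with open(full, "w") as handle:
            handle.write("keep me")

        candidates = find_zero_length_files(root)
        delete_files(candidates, dry_run=True)    # nothing is removed yet
        survived_dry_run = os.path.exists(empty)
        delete_files(candidates, dry_run=False)   # now the empty file goes
        return (len(candidates), survived_dry_run,
                os.path.exists(empty), os.path.exists(full))
```

Reviewing the dry-run list with a buddy before the real run is the "verify success criteria" step: the list should contain exactly the files you expect to lose.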
Testing tools

                      • Your favorite test framework
                      • Repeatable shell scripts
                      • Staging environments
Failure to verify.
“What does ‘-d’ actually do?”
Prevent verification failures.

                      • Have a plan for things going wrong.
                      • Have a staging environment.
                      • Test your rollback plan, not just your implementation plan.
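
A rough Python sketch of what testing the rollback plan could look like in a staging environment. Everything here is assumed for illustration: the in-memory dictionary stands in for staging state, and `max_connections` is a made-up setting, not something from the talk.

```python
def run_change(state, apply, verify, rollback):
    """Apply a change, then roll it back if verification fails."""
    apply(state)
    if verify(state):
        return "applied"
    rollback(state)
    return "rolled_back"


def rehearse_rollback(make_state, apply, rollback):
    """Apply and then roll back on a fresh copy; report whether the state was restored."""
    before = make_state()
    after = make_state()
    apply(after)
    rollback(after)
    return before == after


# Hypothetical staging state and change, used only to exercise the helpers.
def make_state():
    return {"max_connections": 100}


def apply_change(state):
    state["max_connections"] = 500


def rollback_change(state):
    state["max_connections"] = 100


def broken_rollback(state):
    # A rollback that was written but never tested: it drops the setting
    # instead of restoring the old value.
    state.pop("max_connections")
```

`rehearse_rollback(make_state, apply_change, broken_rollback)` returns `False`, which is the kind of surprise better found in staging than in the middle of an outage.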
Verification tools


                      • Staging environments
                      • Your buddy
Failure to imagine.
For my group the
          bottom line was
        "don't trust anyone".

                     Thanks, Maggie!
Recover from failures to imagine.

                      • Share your stories of failure.
                      • Talk with people who are different from you.
                      • Act out implementation scenarios.
Failure to implement.
Re-implement.


                      • Learn from mistakes.
Reflection.
        (or, the Post-Mortem)
Before

                      • Plan to do a post-mortem.
                      • Document the plan with numbered steps and a timeline.
                      • Test the plan and the rollback plan.
                      • Identify a “point of no return”.
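
As an illustration of a plan with numbered steps, a timeline, and a marked point of no return, here is a small Python sketch. The step descriptions, durations, and start time are invented for the example; only the shape of the plan reflects the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Step:
    description: str
    minutes: int                      # fixed time budget for the step
    point_of_no_return: bool = False  # True if rollback is impractical afterwards


def build_timeline(start, steps):
    """Return (number, start_time, step) tuples, numbering steps from 1."""
    timeline = []
    current = start
    for number, step in enumerate(steps, start=1):
        timeline.append((number, current, step))
        current += timedelta(minutes=step.minutes)
    return timeline


def format_timeline(timeline):
    """Render each step as 'N. HH:MM description', flagging the point of no return."""
    lines = []
    for number, when, step in timeline:
        marker = "  <-- point of no return" if step.point_of_no_return else ""
        lines.append(f"{number}. {when:%H:%M} {step.description}{marker}")
    return lines


# Hypothetical maintenance plan.
PLAN = [
    Step("Announce maintenance window in chatroom", 5),
    Step("Take backup and verify it restores", 30),
    Step("Apply schema change", 15, point_of_no_return=True),
    Step("Run verification queries", 10),
    Step("Update documentation", 15),
]

if __name__ == "__main__":
    start = datetime(2012, 1, 16, 22, 0)  # hypothetical start time
    for line in format_timeline(build_timeline(start, PLAN)):
        print(line)
```

Giving each step a fixed number of minutes also gives the designated time-keeper something concrete to check against during the change.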
During

                      • Screen sharing: UNIX screen, VNC, etc.
                      • Chatroom: IRC, AIM, Campfire (scrollback!)
                      • Voice: Campfire, Skype, VOIP, POTS call line
                      • Headsets!
                      • Designated time-keeper.
After

                      • Documentation updates
                      • Post-mortem to identify areas of success and areas for improvement.
                      • Limit improvements to 1-2 things.
Plan for the worst.
        Minimize risk.
        Fail.
        Recover, gracefully.
Thanks!
Photo credits


                      • Flickr: sheepguardingllama

Mistakes were made - LCA 2012
