Predicting Recurring Crash Stacks
Hyunmin Seo and Sunghun Kim
The Hong Kong University of Science and Technology

September 7th, 2012
Automated Software Engineering 2012
Essen, Germany

Recurring Crashes

Bug Report 528311

3.6b3  ->  Patch  ->  3.6b4

Crash point in 3.6b3: nsXULTreeAccessible::GetTreeItemAccessible
Crash point in 3.6b4: nsXULTreeAccessible::GetTreeItemAccessible

Bad Fixes

• Bad fixes comprise as much as 9% of all bugs (Gu et al., ICSE 2010)
• 14.8%–24.4% of fixes for post-release bugs are incorrect (Yin et al., FSE 2011)

Motivation

• How often do bad fixes occur?
• How can we help prevent them?

Crash Reporting System (CRS)

Mozilla CRS

[Diagram: a release sends crash reports (CR) to the CRS server. A bug report (#50001) leads to a patch file, which ships in the next release.]

How often do bad fixes occur?

Data sources: Crash Reporting System and Bug Reporting System

• 19 sub-versions of Firefox 3.6
• 70 bug reports
• 79 crash points

Have all crashes disappeared after fixes?

[Diagram: crash reports before the fix, Bug Report #50001 and its patch, and an open question (?) about crash reports after the fix.]

Recurring Crash Examples

BUGID    CRASH POINT                                  RELEASE: # OF CRASH REPORTS
538722   nsHtml5ElementName::initializeStatics        3.6.8: 677   3.6.9: 0     3.6.10: 0
554544   nsTextFrame::Reflow                          3.6.6: 773   3.6.7: 186   3.6.8: 497
528311   nsXULTreeAccessible::GetTreeItemAccessible   3.6b3: 70    3.6b4: 168   3.6b5: 0

48.1% (38/79) of crash points recurred after the fix.

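The recurrence check behind the table can be sketched as follows. This is only an illustration, not the authors' tooling: it assumes that the first release listed for each bug is the last one before the fix and that every later release contains the patch. The function name `recurs_after_fix` is hypothetical.

```python
# Crash-report counts per release, copied from the table above.
# Assumption: the first release listed is the last one before the fix,
# and every later release already contains the patch.
CRASH_COUNTS = {
    "538722": [("3.6.8", 677), ("3.6.9", 0), ("3.6.10", 0)],
    "554544": [("3.6.6", 773), ("3.6.7", 186), ("3.6.8", 497)],
    "528311": [("3.6b3", 70), ("3.6b4", 168), ("3.6b5", 0)],
}


def recurs_after_fix(counts):
    """Return True if any post-fix release still reports the crash."""
    post_fix = counts[1:]  # skip the pre-fix release
    return any(n > 0 for _, n in post_fix)


recurring = {bug: recurs_after_fix(c) for bug, c in CRASH_COUNTS.items()}
print(recurring)
# {'538722': False, '554544': True, '528311': True}

# Share of recurring crash points reported on the slide: 38 of 79.
recurring_share = f"{38 / 79:.1%}"
print(recurring_share)  # 48.1%
```
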
Crash Paths

• The same crash point but different crash paths
• The fix may miss some paths

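One way to picture "same crash point, different crash paths" is to bucket crash reports by crash point and then split each bucket by full call path. The sketch below assumes that a sub-group is the set of reports sharing an identical stack trace, which the slides do not state. The report ids and the function `group_by_path` are hypothetical; the frame names are taken from the case study later in the deck.

```python
from collections import defaultdict


def group_by_path(reports):
    """Bucket reports by crash point, then sub-group them by full call path."""
    buckets = defaultdict(lambda: defaultdict(list))
    for report_id, stack in reports:
        crash_point = stack[-1]  # innermost frame, where the crash occurred
        buckets[crash_point][tuple(stack)].append(report_id)
    return buckets


# Hypothetical crash reports: (id, stack listed from outermost to innermost frame).
reports = [
    ("r1", ["nsAccessibleWrap::Next",
            "nsXULTreeAccessible::GetChildAt",
            "nsXULTreeAccessible::GetTreeItemAccessible"]),
    ("r2", ["nsAccessibleWrap::Next",
            "nsXULTreeAccessible::GetChildAt",
            "nsXULTreeAccessible::GetTreeItemAccessible"]),
    ("r3", ["nsRootAccessible::HandleEvent",
            "nsRootAccessible::HandleEventWithTarget",
            "nsXULTreeAccessible::GetTreeItemAccessible"]),
]

buckets = group_by_path(reports)
sub_groups = buckets["nsXULTreeAccessible::GetTreeItemAccessible"]
for path, ids in sub_groups.items():
    print(len(ids), "report(s) via", " -> ".join(path))
```

A fix placed in `nsXULTreeAccessible::GetChildAt` would only sit on the path of the first sub-group, which is how a fix can miss some paths.
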
[Figure: number of crash reports per sub-group (#1 to #5) for bug 528311 (crash point nsXULTreeAccessible::GetTreeItemAccessible), shown for 3.6b3 (y-axis 0 to 35) and 3.6b4 (y-axis 0 to 175). Crash reports per release: 3.6b3: 70, 3.6b4: 168, 3.6b5: 0.]

Comment in Bug Report

“I don’t know how this bit (crash trace) got lost from the patch I ended up checking in, but it’s pretty essential...”
(A comment in bug report #523528)

Incomplete Fixes

• We call these incomplete fixes
• “Incomplete” in terms of fix locations
• How can we help prevent them?

Approach Overview

[Diagram: Bug Report #50001 and its patch file; crash stacks are classified as Covered or Missing.]

Idea behind Classification

A fix has no effect if it is not executed.

[Diagram: a crash stack with the fix location marked ✓.]

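Stack Expansion

[Diagram: a crash stack (frames A, B, C, D, E) is expanded at levels L-1, L-2 and L-3 with further functions (G, H, I, J). The CFG of A runs from Entry through an if to Path 1 (Block1, which calls Y()) or Path 2 (Block2, which calls G(), B() and X()), then to Exit. F marks the fix location (✓).]

Classification of an expanded crash stack S against the fix location F:
• Covered: S includes F
• Missing: otherwise
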
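The expand-then-classify idea can be sketched as below. This is a simplification under stated assumptions, not the authors' implementation: it expands the stack over a plain call graph instead of analysing CFG paths, and the call graph `CALLEES`, the placement of F, and the functions `expand_stack` and `classify` are all hypothetical.

```python
def expand_stack(stack, callees, level):
    """Return the stack's functions plus everything reachable within `level` calls."""
    covered = set(stack)
    frontier = set(stack)
    for _ in range(level):
        next_frontier = set()
        for fn in frontier:
            for callee in callees.get(fn, ()):
                if callee not in covered:
                    next_frontier.add(callee)
        if not next_frontier:
            break  # nothing new to add at deeper levels
        covered |= next_frontier
        frontier = next_frontier
    return covered


def classify(stack, callees, fix_locations, level):
    """'covered' if the expanded stack includes a fix location, else 'missing'."""
    expanded = expand_stack(stack, callees, level)
    return "covered" if expanded & set(fix_locations) else "missing"


# Hypothetical call graph: the fix location F is three calls away from frame A.
CALLEES = {
    "A": ["G", "B"],
    "B": ["C"],
    "C": ["D"],
    "D": ["E"],
    "G": ["H"],
    "H": ["F"],
}
crash_stack = ["A", "B", "C", "D", "E"]

for level in (0, 1, 2, 3):
    print(f"L-{level}:", classify(crash_stack, CALLEES, ["F"], level))
# L-0: missing
# L-1: missing
# L-2: missing
# L-3: covered
```

In this toy setup a deeper expansion level lets more stacks count as covered. The slides evaluate that trade-off through precision and recall per expansion level.
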
Experimental Design

RQ1 - How good is the classification?
RQ2 - How can this help developers?

Subjects

Name                          Description
Subject                       19 releases of Firefox 3.6
Release date                  Oct 2009 ~ Mar 2011
Programming language          C / C++
LOC                           3.2M ~ 4.4M

Name                          Value
# of crash buckets            33
# of total sub-groups         1159
# of recurring sub-groups     354

RQ1 - Prediction Result

Level    Prediction    Actual
L-4      292           167

Precision    0.57
Recall       0.49
F-measure    0.53

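As a quick consistency check, the reported F-measure matches the harmonic mean of the reported precision and recall. The second calculation rests on an assumption the slide does not spell out: that "Prediction 292 / Actual 167" means 292 sub-groups were predicted to recur and 167 of those predictions were correct.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f = f_measure(0.57, 0.49)
print(round(f, 2))  # 0.53

# Assumption: 292 sub-groups predicted to recur, 167 of them correctly.
precision_from_counts = 167 / 292
print(round(precision_from_counts, 2))  # 0.57
```
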
RQ1 - Expansion Levels

[Figure: precision, recall and F-measure (y-axis "Value", 0 to 0.9) plotted against expansion level (x-axis: L-0, L-1, L-2, L-3, L-4, L-5, L-7, L-10, L-∞).]

RQ2 - Case Study

trace a:
  IEnumConnectionPoints_RemoteNext_Thunk
  IEnumOleUndoUnits_Next_Stub
  nsAccessibleWrap::Next
  nsXULTreeAccessible::GetChildAt               ✓ First Fix (#528311) in 3.6b4
  nsXULTreeAccessible::GetTreeItemAccessible    crash point, ✓ Second Fix (#528311) in 3.6b5

trace b:
  nsRootAccessible::HandleEvent
  nsRootAccessible::HandleEventWithTarget
  nsXULTreeAccessible::GetTreeItemAccessible    crash point, ✓ Second Fix (#528311) in 3.6b5

First fix:
    NS_ENSURE_ARG_POINTER(aChild);
    *aChild = nsnull;

  + if (IsDefunct())
  +    return NS_ERROR_FAILURE;

    PRInt32 childCount = 0;

Second fix:
    *aAccessible = nsnull;

  - if (aRow < 0)
  + if (aRow < 0 || IsDefunct())
      return;

    PRInt32 rowCount = 0;

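The case study can be replayed with a simple membership check: does the function changed by each fix appear on each trace? The sketch assumes, as the figure suggests, that both traces end at the crash point. The helper `fix_on_trace` is a hypothetical name, and no stack expansion is applied here.

```python
# Frames copied from the two traces above, outermost first.
TRACE_A = [
    "IEnumConnectionPoints_RemoteNext_Thunk",
    "IEnumOleUndoUnits_Next_Stub",
    "nsAccessibleWrap::Next",
    "nsXULTreeAccessible::GetChildAt",
    "nsXULTreeAccessible::GetTreeItemAccessible",
]
TRACE_B = [
    "nsRootAccessible::HandleEvent",
    "nsRootAccessible::HandleEventWithTarget",
    "nsXULTreeAccessible::GetTreeItemAccessible",
]

FIRST_FIX = "nsXULTreeAccessible::GetChildAt"              # patched in 3.6b4
SECOND_FIX = "nsXULTreeAccessible::GetTreeItemAccessible"  # patched in 3.6b5


def fix_on_trace(fix_function, trace):
    """Return True if the patched function is one of the frames on the trace."""
    return fix_function in trace


for name, trace in (("trace a", TRACE_A), ("trace b", TRACE_B)):
    print(name, fix_on_trace(FIRST_FIX, trace), fix_on_trace(SECOND_FIX, trace))
# trace a True True
# trace b False True
```

The first fix is absent from trace b, which is consistent with the crash still being reported in 3.6b4. The second fix sits at the crash point shared by both traces.
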
RQ2 - Developer Feedback

Firefox developer emails and mailing lists
21 responses: 3 very useful, 7 useful, 10 requested more information, 1 not useful

“It should be an interesting feature and useful like any automation tool. It should make the engineering work easier and keep users less annoyed.”

“The first patch fixed the known steps but missed the fact that other routes led to the same state inconsistency. ... If you have a system that automates that process it would indeed be helpful.”

Threats to validity

• The subject is open source software
• Collected crash data might be biased
• Oracle data set is incomplete



Discussion – Future Work

Crash stack:
  nsJARInputThunk::EnsureJarStream
  nsZipReaderCache::GetZip
  nsJAR::Open
  nsZipArchive::OpenArchive
  nsZipArchive::BuildFileList        crash point ✓

Fix:
    //-- Read the central directory headers
    buf = startp + centralOffset;
  + if (endp - buf < sizeof(PRUint32))
  +     return NS_ERROR_FILE_CORRUPTED;
    PRUint32 sig = xtolong(buf); // crash point
    while (sig == CENTRALSIG) {

Related Work

• Crash bucketing (Dang et al., ICSE 2012)
• Post-mortem crash analysis (Manevich et al., FSE 2004)
• Bug fix verification (Gu et al., ICSE 2010)

Conclusions

• 48% of fixed crashes in Firefox recurred.
• We present an approach to predict recurring crashes.
• RQ1 - How good is the classification?
  • Our approach yields reasonable accuracy: 0.57 precision and 0.49 recall.
• RQ2 - How can this help developers?
  • Our case studies and developers’ feedback show the idea is useful.

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 

Predicting Recurring Crash Stacks (ASE 2012)

  • 1. Predicting Recurring Crash Stacks. Hyunmin Seo and Sunghun Kim, The Hong Kong University of Science and Technology. September 7th, 2012, Automated Software Engineering 2012, Essen, Germany
  • 2. Recurring Crashes [diagram: 3.6b3 crashes at crash point nsXULTreeAccessible::GetTreeItemAccessible; bug report 528311 and its patch follow; 3.6b4 crashes at the same crash point, nsXULTreeAccessible::GetTreeItemAccessible]
  • 3. Bad Fixes
      • Bad fixes comprise as much as 9% of all bugs (Gu et al., ICSE 2010)
      • 14.8%∼24.4% of fixes for post-release bugs are incorrect (Yin et al., FSE 2011)
  • 4. Motivation
      • How often do bad fixes occur?
      • How can we help to prevent them?
  • 6. Mozilla CRS [diagram: users of a release send crash reports (CR) to the CRS server]
  • 8. Mozilla CRS [diagram: bug report #50001 and its patch file go into the next release]
  • 9. How often do bad fixes occur? Data from the crash reporting system and the bug reporting system: 19 sub-versions of Firefox 3.6, 70 bug reports, 79 crash points
  • 10. Have all crashes disappeared after fixes? [diagram: crash reports before the fix, the patch from bug report #50001, and an open question mark over the crash reports after the fix]
  • 11. Recurring Crash Examples (version: number of crash reports)
      BUGID    CRASH POINT                                   ver1          ver2          ver3
      538722   nsHtml5ElementName::initializeStatics         3.6.8: 677    3.6.9: 0      3.6.10: 0
      554544   nsTextFrame::Reflow                           3.6.6: 773    3.6.7: 186    3.6.8: 497
      528311   nsXULTreeAccessible::GetTreeItemAccessible    3.6b3: 70     3.6b4: 168    3.6b5: 0
      48.1% (38/79)
  • 12. Crash Paths
      • The same crash point but different crash paths
      • The fix may miss some paths
  • 13. Bug 528311, crash point nsXULTreeAccessible::GetTreeItemAccessible: 70 crash reports in 3.6b3, 168 in 3.6b4, 0 in 3.6b5 [two bar charts: number of crash reports per sub-group (#1 to #5), one chart for 3.6b3 and one for 3.6b4]
  • 14. Comment in Bug Report: “I don’t know how this bit (crash trace) got lost from the patch I ended up checking in, but it’s pretty essential...” (a comment in bug report #523528)
  • 15. Incomplete Fixes
      • We call these incomplete fixes
      • “incomplete” in terms of fix locations
      • How can we help to prevent this?
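  • 16. Approach Overview [diagram: the patch file from bug report #50001 is compared with crash reports, which are labeled Covered or Missing]
  • 17. Idea behind Classification: a fix has no effect if it is not executed [diagram: an execution path passing through the fix location is marked ✓]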
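  • 18. Stack Expansion [diagram: a crash stack is expanded at levels L-1, L-2, and L-3 using the CFG of each function (for example, the CFG of A runs from Entry through an if with Path 1 and Path 2, Block1 and Block2, to Exit); the result is labeled Covered ( if S F ) or Missing ( otherwise )]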
  • 19. Experimental Design
      RQ1 - How good is the classification?
      RQ2 - How can this help developers?
  • 20. Subjects
      Name                       Description
      Subject                    19 releases of Firefox 3.6
      Release Date               Oct 2009 ~ Mar 2011
      Programming Language       C / C++
      LOC                        3.2M ~ 4.4M

      Name                       Value
      # of crash buckets         33
      # of total sub-groups      1159
      # of recurring sub-groups  354
  • 21. Experimental Design
      RQ1 - How good is the classification?
      RQ2 - How can this help developers?
  • 22. RQ1 - Prediction Result
      Level   Prediction   Actual   Precision   Recall   F-measure
      L4      292          167      0.57        0.49     0.53
  • 23. RQ1 - Expansion Levels [line chart: precision, recall, and F-measure (value axis, 0 to 0.9) against expansion level (L-0, L-1, L-2, L-3, L-4, L-5, L-7, L-10, L-∞)]
  • 24. Experimental Design
      RQ1 - How good is the classification?
      RQ2 - How can this help developers?
  • 25. RQ2 - Case Study
      trace a: IEnumConnectionPoints_RemoteNext_Thunk → IEnumOleUndoUnits_Next_Stub → nsAccessibleWrap::Next → nsXULTreeAccessible::GetChildAt (✓) → nsXULTreeAccessible::GetTreeItemAccessible (✓, crash point)
      trace b: nsRootAccessible::HandleEvent → nsRootAccessible::HandleEventWithTarget → nsXULTreeAccessible::GetTreeItemAccessible (✓, crash point)
      First Fix (#528311) in 3.6b4:
            NS_ENSURE_ARG_POINTER(aChild);
            *aChild = nsnull;

          + if (IsDefunct())
          +   return NS_ERROR_FAILURE;

            PRInt32 childCount = 0;
      Second Fix (#528311) in 3.6b5:
            *aAccessible = nsnull;

          - if (aRow < 0)
          + if (aRow < 0 || IsDefunct ())
              return;

            PRInt32 rowCount = 0;
  • 26. RQ2 - Developer Feedback
      Firefox developer emails and mailing lists
      21 responses: 3 very useful, 7 useful, 10 requested more information, 1 not useful
      “It should be an interesting feature and useful like any automation tool. It should make the engineering work easier and keep users less annoyed.”
      “The first patch fixed the known steps but missed the fact that other routes led to the same state inconsistency. ... If you have a system that automates that process it would indeed be helpful.”
  • 27. Threats to validity
      • The subject is open source software
      • Collected crash data might be biased
      • Oracle data set is incomplete
  • 28. Discussion – Future Work
      Crash stack: nsJARInputThunk::EnsureJarStream → nsZipReaderCache::GetZip → nsJAR::Open → nsZipArchive::OpenArchive → nsZipArchive::BuildFileList (✓, crash point)
            //-- Read the central directory headers
            buf = startp + centralOffset;
          + if (endp - buf < sizeof(PRUint32))
          +   return NS_ERROR_FILE_CORRUPTED;
            PRUint32 sig = xtolong(buf); // crash point
            while (sig == CENTRALSIG) {
  • 29. Related Work
      • Crash bucketing (Dang et al., ICSE 2012)
      • Post-mortem crash analysis (Manevich et al., FSE 2004)
      • Bug fix verification (Gu et al., ICSE 2010)
  • 30. Conclusions
      • 48% of fixed crashes in Firefox recurred.
      • We present an approach to predict recurring crashes
      • RQ1 - How good is the classification?
          • Our approach yields reasonable accuracy: 0.57 precision and 0.49 recall
      • RQ2 - How can this help developers?
          • Our case studies and developers’ feedback show the idea is useful

Editor's notes

  1. We were interested in recurring crashes, that is, software that crashes again even after a bug fix. Here’s an example. Firefox crashes at this location. This is the name of the function where Firefox crashed, also called the crash point. A developer decided to fix this crash. He filed a bug report and made a patch. This patch was included in the next release. However, Firefox crashed again at the same location.
  2. This problem is known as a bad fix. There is a bug and I make a fix, but the fix itself is buggy or does not completely remove the original bug. Gu et al. investigated the bug databases of the Ant, AspectJ, and Rhino projects and found bad fixes. Yin et al. investigated bug fixes in 4 large operating systems and found 14...
  3. We also found similar bad-fix problems in crash bug fixes. So we wanted to know how often bad fixes occur among crash bug fixes, and whether there is any way we can help prevent them. Our work is motivated by these questions.
  4. To see how often bad fixes occur, we investigated a crash reporting system. A crash reporting system is an automated system designed to help developers fix crashes. When software crashes, a window pops up and asks whether you want to send a crash report. Microsoft, Apple, and Mozilla each have their own crash reporting system.
  5. Let me first briefly explain the Mozilla crash reporting system (CRS). When software is released, people download it and use it. Some of them experience crashes. The client part of the CRS then generates a crash report with important information about the crash, such as the crash location, software version, OS, hardware information, and stack traces. It then sends the generated crash report to a server.
  6. The server receives many crash reports, so it groups similar crash reports together. This process is called bucketing. Mozilla groups crash reports that have the same crash point. Developers then investigate the crash buckets.
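The bucketing step in note 6 can be sketched in a few lines of Python. This is a minimal illustration, not Mozilla's implementation: the report layout (an `id` plus a `stack` list whose first frame is the crash point), the report contents, and the `caller_*` frame names are assumptions made for the example.

```python
from collections import defaultdict


def bucket_by_crash_point(reports):
    """Group crash reports whose stacks share the same crash point."""
    buckets = defaultdict(list)
    for report in reports:
        # Assumption: frame 0 of the stack is the function where the crash happened.
        crash_point = report["stack"][0]
        buckets[crash_point].append(report["id"])
    return dict(buckets)


# Hypothetical reports; only the two crash-point names come from the slides.
reports = [
    {"id": 1, "stack": ["nsTextFrame::Reflow", "caller_a"]},
    {"id": 2, "stack": ["nsXULTreeAccessible::GetTreeItemAccessible", "caller_b"]},
    {"id": 3, "stack": ["nsTextFrame::Reflow", "caller_c"]},
]

buckets = bucket_by_crash_point(reports)
# Print the largest buckets first.
for crash_point, ids in sorted(buckets.items(), key=lambda kv: -len(kv[1])):
    print(crash_point, ids)
```

Reports 1 and 3 land in the same bucket even though their callers differ, which is the property the later notes rely on.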
  7. Usually they focus on the most frequent crashes first. When a developer decides to fix a crash bucket, he files a bug report and makes a patch, and the patch is included in the next release.
  8. To see how often bad fixes occur, we investigated the Mozilla crash reporting system and bug reporting system for 19 sub-versions of Firefox 3.6. We found 70 bug reports that fixed 79 crash points.
  9. Then we checked whether all the crashes were gone. We identified two versions: the one before the patch was released and the one after. Then we counted the number of crash reports in both versions. This way you can see whether the crashes are gone.
  10. Here are a few examples. Firefox crashed here. We found 677 crash reports in this version. Then a developer fixed this crash, and after the fix we couldn’t find any more crash reports. This is what we expect. It’s a good fix. However, these two are bad-fix examples. We could still find a large number of crash reports after the bug fixes. Overall, among the 79 crash points, we found that more than 48% recurred.
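The before/after comparison in notes 9 and 10 can be written out as a short check. The counts below are the ver1 and ver2 values from the three examples on slide 11; treating any crash report after the fix as a recurrence is an assumption of this sketch, since the notes do not state the exact threshold.

```python
# (bug id, crash point, crash reports before the fix, crash reports after the fix)
examples = [
    ("538722", "nsHtml5ElementName::initializeStatics", 677, 0),
    ("554544", "nsTextFrame::Reflow", 773, 186),
    ("528311", "nsXULTreeAccessible::GetTreeItemAccessible", 70, 168),
]


def is_recurring(reports_after_fix):
    # Assumption: a single crash report after the fix counts as a recurrence.
    return reports_after_fix > 0


recurring_bugs = [bug for bug, _, _, after in examples if is_recurring(after)]
print(recurring_bugs)  # ['554544', '528311']

# Overall rate reported on the slide: 38 recurring crash points out of 79.
overall_rate = 38 / 79
print(f"{overall_rate:.1%}")  # 48.1%
```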
  11. So is there any way we can help prevent this? We investigated the crash reports and bug fixes further and found that the crash reports in a crash bucket have the same crash point, but their crash paths can be different. If the developer misses one path in his fix, the same crash can recur along the missed path.
  12. Let’s see an example. We found 70 crash reports before the bug was fixed. We grouped these crash reports according to their crash stacks and counted the number of crash reports for each stack. This is the result. There are 5 unique crash stacks, and the bars show the number of crash reports. Then we did the same thing for this version. Interestingly, all the crash stacks are gone except the second one. It seems the second path was missed by the first fix. If we look at the history of this bug report, after realizing that the crash was not gone, the developer reopened the closed bug report and made another patch. After the new patch was released, the crash was gone.
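The sub-grouping described in note 12 amounts to counting reports per unique crash stack and checking which stacks are still present after the fix. The sketch below assumes a stack is a list of frame names; the stacks themselves are placeholders, not data from the study.

```python
from collections import Counter


def subgroup_counts(stacks):
    """Count crash reports per unique crash stack within one bucket."""
    return Counter(tuple(stack) for stack in stacks)


def surviving_subgroups(before_fix, after_fix):
    """Return the crash stacks seen before the fix that still occur after it."""
    before = subgroup_counts(before_fix)
    after = subgroup_counts(after_fix)
    return {stack: after[stack] for stack in before if after[stack] > 0}


# Hypothetical stacks sharing one crash point; frame names are placeholders.
before_fix = [
    ["crash_point", "path_1"],
    ["crash_point", "path_2"],
    ["crash_point", "path_2"],
    ["crash_point", "path_3"],
]
after_fix = [["crash_point", "path_2"]] * 5

print(surviving_subgroups(before_fix, after_fix))
```

A stack that survives the fix, like `path_2` here, is the kind of missed path the approach tries to flag before release.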
  13. Here’s more evidence that developers do miss some paths. This is a comment in another bug report. The situation is similar. The developer realized that the crash was not gone, so he reopened the bug report, made another patch, and left this comment. So he definitely missed a crash trace.
  14. We call these incomplete fixes. The fixes are incomplete in terms of fix locations. How can we help with this? What if we could find those missed paths automatically? That would help developers, right?
  15. So this is the overview of our approach to finding those paths. When a developer makes a patch, we compare the patch with the crash reports and classify them as covered or missing. We then present the missed crash paths to the developer so that he can fix them as well. The process divides into preprocessing and classification, and here I’ll only explain classification.
  16. The idea behind the classification is this. Let’s say Firefox started from here, followed this path, and crashed here. Now assume a developer changed the code here. In the next release, this is what happens on the missed paths: the fix is never executed, so it has no effect. So by comparing the fix location with the execution path, we can find the missed paths.
  17. But what we have in crash reports is a crash stack, not an execution trace. So we use the crash stack. This is a crash stack. Each circle is a function, or stack frame. A called B, B called C, and it crashed here.
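Notes 16 and 17 together suggest a simple rule: a crash stack is covered if the fix location lies on it, and missing otherwise. The sketch below applies that rule at function granularity, which is a simplification assumed for illustration; the two traces and the first fix location are taken from the case study on slide 25 as far as they can be read from the transcript.

```python
def classify(stack, fix_locations):
    """Label a crash stack "covered" if any frame is a fixed function, else "missing"."""
    return "covered" if set(stack) & set(fix_locations) else "missing"


# Frames listed from the crash point outward (assumed ordering).
trace_a = [
    "nsXULTreeAccessible::GetTreeItemAccessible",
    "nsXULTreeAccessible::GetChildAt",
    "nsAccessibleWrap::Next",
    "IEnumOleUndoUnits_Next_Stub",
    "IEnumConnectionPoints_RemoteNext_Thunk",
]
trace_b = [
    "nsXULTreeAccessible::GetTreeItemAccessible",
    "nsRootAccessible::HandleEventWithTarget",
    "nsRootAccessible::HandleEvent",
]

# The first fix for bug 528311 changed GetChildAt.
first_fix = {"nsXULTreeAccessible::GetChildAt"}

print(classify(trace_a, first_fix))  # covered
print(classify(trace_b, first_fix))  # missing
```

The second fix changed `GetTreeItemAccessible` itself, which is on both traces, so under this rule both traces would be covered once it is applied.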
  18. With this approach, we designed an experiment. We had two research questions. The first one is: how good is... So what is the precision of our classification? We had a problem here. How do we know whether our prediction is correct or not? We don’t have an oracle. So instead, what we did is this. If a crash path is really missed by a patch, it may recur in the next release. So we predicted that the crash paths classified as missing would recur, and we checked the prediction against the real crash reports in the next release. We calculated precision and recall. The next question is: how can this help developers? For this one, we present case studies and developers’ feedback. The first research question we had is “How good is our classification?” How do we know whether the classification result is correct or not? To check this, we predicted that the crash traces we classified as “missing” would recur after the bug fixes. We evaluated our prediction against the real crash reports in the Mozilla CRS. We present precision, recall, and F-measure for this prediction. The next research question is “How can this help developers?” We present a few case studies with developers’ feedback for this question.
  19. This is the subject we used in our experiment. And this is the number of unique crash stacks in our experiment. We classify each of them as covered or missing. And this is the number of crash paths among them that actually recurred.
  20. So, the first question: how good is the classification?
  21. At expansion level 4, our approach predicted 292 crash paths to recur, and among them 167 crash paths actually recurred. So the precision is 0.57. The precision and recall vary with the expansion level.
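The arithmetic behind note 21 is short enough to check directly. Precision follows from the two counts in the note; the F-measure below is the harmonic mean of the precision and recall values as reported on slide 22. The recall is taken as reported because the notes do not give its denominator.

```python
predicted_to_recur = 292  # crash paths classified as missing at level 4
actually_recurred = 167   # of those, the ones seen again in the next release

precision = actually_recurred / predicted_to_recur
print(round(precision, 2))  # 0.57

# F-measure from the reported (rounded) precision and recall on slide 22.
reported_precision = 0.57
reported_recall = 0.49
f_measure = (2 * reported_precision * reported_recall
             / (reported_precision + reported_recall))
print(round(f_measure, 2))  # 0.53
```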
  22. This shows the results at different expansion levels. L-0 means no expansion, and L-∞ means the stack is expanded as much as possible. As the expansion level goes higher, the stack becomes larger and gets closer to an over-approximation of the original execution trace. You can see that the precision goes up while the recall goes down. The accuracy is also highly affected by the crash reports we collected in the next release. That set becomes our oracle data set. There are many reasons why this set is incomplete. We only collected crash reports during a limited period of time. Users may not have submitted crash reports. All of these affect our prediction results. But our approach shows reasonable accuracy.
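To make the expansion levels in note 22 concrete, here is a rough sketch that grows a crash stack by adding functions reachable from its frames within a given number of calls. It uses a plain call graph rather than the per-function CFGs shown on slide 18, so it is a simplification and not the authors' algorithm; the graph, the function names, and the `level=None` convention for L-∞ are all hypothetical.

```python
def expand_stack(stack, call_graph, level):
    """Return the stack frames plus functions reachable within `level` calls.

    level=0 leaves the stack unchanged; level=None expands until nothing new
    is found (the L-infinity case).
    """
    expanded = set(stack)
    frontier = set(stack)
    depth = 0
    while frontier and (level is None or depth < level):
        next_frontier = set()
        for function in frontier:
            for callee in call_graph.get(function, ()):
                if callee not in expanded:
                    expanded.add(callee)
                    next_frontier.add(callee)
        frontier = next_frontier
        depth += 1
    return expanded


# Hypothetical call graph: A calls B and G, G calls H, H calls I, B calls C.
call_graph = {"A": ["B", "G"], "G": ["H"], "H": ["I"], "B": ["C"]}
stack = ["C", "B", "A"]  # crash point first
fix_location = "H"       # hypothetical fixed function, not on the stack itself

for level in (0, 1, 2, None):
    covered = fix_location in expand_stack(stack, call_graph, level)
    print(level, "covered" if covered else "missing")
```

In this toy graph the stack flips from missing to covered at level 2, so fewer stacks are predicted to recur as the level rises, which is consistent with the precision and recall trend described in the note.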
  23. OK, the results show reasonable precision and recall. We can predict with either high precision or high recall. Now, how can this help developers?
  24. Here is a case study. There are two stack traces. The first one is covered, and the second one is missing. Our approach correctly predicts that the second one will recur. Now look at the two fixes. This is the first fix, and this is the second fix. The two fixes are very similar. Both call IsDefunct and return. So if the developer had known about this missed path when he made the first fix, he could easily have fixed it too. You can find more case studies in the paper.
  25. We also asked developers whether this approach is useful. We briefly explained our approach with a few case studies and sent an email to the Firefox developers who fixed the crashes used in our experiment. We received 21 responses: 3 said it is very useful, 7 said it is useful, and 10 requested more information, for example asking whether we could run the experiment on recent Firefox crash reports.
  26. We only used Mozilla Firefox, and only sub-versions of Firefox 3.6 over a limited period.
  27. In this work we focused only on incomplete fixes. But there is another type of bad fix: incorrect fixes. To handle this crash, the developer made a fix at this location, so our approach predicts that it will not recur. However, this crash recurred because the fix was wrong. Previously, buf was pointing to an invalid memory area, so the developer added this code to check the validity of the variable. But this code was insufficient, and Firefox went through it and crashed again. Later the developer added more code. To handle this case we need a more rigorous verification technique, which is our future work. Our approach cannot currently find such incorrect fixes.
  28. There is related work on crashes. It is possible for Firefox to crash at two different locations that have the same root cause. In this case it is better to put those crash reports into the same bucket. Crash bucketing algorithms address this issue. Our approach can be more accurate if we have a better bucketing algorithm. There is also work that tries to find the root cause of a crash by reasoning backward from the crash point. Our work is different. Once a developer has made a fix, we try to verify it. Gu et al. also tried to verify bug fixes, by generating more inputs that can trigger the same bug. We couldn’t use the same approach for crashes because reproducing crashes is very challenging. Instead, we used crash stacks to verify bug fixes.