Dealing with Noise in Bug Prediction

Sunghun Kim, Hongyu Zhang, Rongxin Wu and Liang Gong
The Hong Kong University of Science & Technology
Tsinghua University

Where are the bugs?

  • Complex files! [Menzies et al.]
  • Modified files! [Nagappan et al.]
  • Nearby other bugs! [Zimmermann et al.]
  • Previously fixed files [Hassan et al.]

Prediction model

  training instances (features + labels) → Learner
  new instance (?) → Learner → Prediction

Training on software evolution is key

  • Software features can be used to predict bugs
  • Defect labels obtained from software evolution
  • Supervised learning algorithms


  Data sources: Version Archive and Bug Database

Change classification

  [Figure: timeline of changes; X marks a bug-introducing (“bad”) change]
  BUILD A LEARNER from the historical changes, then PREDICT QUALITY of each new change.

  Kim, Whitehead Jr., Zhang: Classifying Software Changes: Clean or Buggy? (TSE 2008)

Training Classifiers

Historical changes as feature vectors:
             0   1   0   1   0   1   0   1   …   0   1
             0   0   0   1   0   1   0   1   …   0   0
             0   1   1   1   0   1   1   1   …   0   0
             0   1   0   3   0   0   0   1   …   0   1
             0   1   0   1   0   1   0   1   …   0   0

      § Machine learning techniques
         • Bayesian Network, SVM
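
The slide names Bayesian Network and SVM as learners. As a rough illustration of training on 0/1-style feature vectors like the rows above, here is a minimal sketch that uses a Bernoulli naive Bayes classifier instead. This is a much simpler stand-in, not the learner used in the study; the toy vectors, labels, and function names are assumptions made for illustration.

```python
import math

def train_naive_bayes(rows, labels):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing.

    rows: equal-length lists of feature values (any value > 0 counts as present)
    labels: 0 (clean) or 1 (buggy) for each row
    """
    n_features = len(rows[0])
    model = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        prior = (len(members) + 1) / (len(rows) + 2)
        # P(feature present | class), smoothed so it is never exactly 0 or 1.
        present = [
            (sum(1 for r in members if r[j] > 0) + 1) / (len(members) + 2)
            for j in range(n_features)
        ]
        model[cls] = (prior, present)
    return model

def predict(model, row):
    """Return 1 (buggy) or 0 (clean), whichever has the higher log-posterior."""
    scores = {}
    for cls, (prior, present) in model.items():
        score = math.log(prior)
        for p, x in zip(present, row):
            score += math.log(p if x > 0 else 1.0 - p)
        scores[cls] = score
    return max(scores, key=scores.get)

# Hypothetical historical changes: feature 0 happens to mark the buggy ones.
history = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
buggy = [1, 1, 0, 0]
model = train_naive_bayes(history, buggy)
print(predict(model, [1, 0, 0]), predict(model, [0, 1, 1]))
```

A real change-classification setup would have thousands of features per change, so a library implementation would normally replace this hand-rolled learner.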
Source Repository                    Bug Database
  all commits C                        all bugs B
  bug fixes Cf                         fixed bugs Bf
  linked fixes Cfl                     linked fixed bugs Bfl

  Linked via log messages: Cfl and Bfl
  Related, but not linked: the remaining bug fixes in Cf and fixed bugs in Bf
  Noise!

  Bird et al. “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE2009

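
To make "linked via log messages" concrete, the following sketch scans commit log messages for bug IDs and derives the linked fixes (Cfl), the linked fixed bugs (Bfl), and the fixed bugs left without a link. The regular expression, the commit IDs, and the messages are hypothetical; real projects follow many different conventions for mentioning bugs, and this is not the linking heuristic used by Bird et al.

```python
import re

# Assumed convention: "bug 123", "fixes #123", "#123". Purely illustrative.
BUG_ID = re.compile(r"(?:bug|fix(?:es|ed)?|#)\s*#?(\d+)", re.IGNORECASE)

def link_commits(commit_logs, fixed_bugs):
    """Return (linked fixes Cfl, linked fixed bugs Bfl, unlinked fixed bugs).

    commit_logs: dict mapping commit id -> log message
    fixed_bugs: set of bug ids marked as fixed in the bug database (Bf)
    """
    linked_fixes = set()
    linked_bugs = set()
    for commit, message in commit_logs.items():
        # Keep only the mentioned ids that really are fixed bugs.
        ids = {int(m) for m in BUG_ID.findall(message)} & fixed_bugs
        if ids:
            linked_fixes.add(commit)
            linked_bugs |= ids
    return linked_fixes, linked_bugs, fixed_bugs - linked_bugs

commit_logs = {
    "c1": "Fix bug 101 in parser",
    "c2": "Fixes #202: off-by-one in pager",
    "c3": "Handle null input",   # really fixes bug 303, but never says so
    "c4": "Update docs",
}
linked_fixes, linked_bugs, unlinked = link_commits(commit_logs, {101, 202, 303})
print(sorted(linked_fixes), sorted(linked_bugs), sorted(unlinked))
```

Commit c3 is the "related, but not linked" case: it would be labeled as a non-fix in the training data, which is the kind of noise the talk is concerned with.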
Effect of training on superbiased data (Severity)

  [Figure: Bug Recall (0% to 100%) for three models: Trained on all bugs, Trained on biased data1, Trained on biased data2]

  Bias in bug severity affects BugCache

  Bird et al. “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE2009

Are defect prediction
models learned from
noisy data reliable?

Study questions

• Q1: How resistant is a defect prediction model to noise?
• Q2: How much noise could be detected/removed?
• Q3: Could we remove noise to improve defect prediction performance?

Study approach

  Training → Bayes Net → Testing

Making noisy training instances

  1 Removing buggy labels: False negative noise
  2 Adding buggy labels: False positive noise

  [Figure: Training and Testing sets]
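
The two kinds of label noise can be sketched as random label flips on the training set. This is a minimal illustration, not the study's procedure: the function name, the seed, and the choice to express each rate as a fraction of the buggy (respectively clean) instances are assumptions.

```python
import random

def add_label_noise(labels, fn_rate, fp_rate, seed=0):
    """Return a noisy copy of 0/1 labels (1 = buggy, 0 = clean).

    fn_rate: fraction of buggy labels flipped to clean (false negative noise)
    fp_rate: fraction of clean labels flipped to buggy (false positive noise)
    """
    rng = random.Random(seed)
    noisy = list(labels)
    buggy_idx = [i for i, y in enumerate(labels) if y == 1]
    clean_idx = [i for i, y in enumerate(labels) if y == 0]
    # Removing buggy labels.
    for i in rng.sample(buggy_idx, round(fn_rate * len(buggy_idx))):
        noisy[i] = 0
    # Adding buggy labels.
    for i in rng.sample(clean_idx, round(fp_rate * len(clean_idx))):
        noisy[i] = 1
    return noisy

labels = [1] * 30 + [0] * 70          # hypothetical training labels
noisy = add_label_noise(labels, fn_rate=0.2, fp_rate=0.1)
flipped_to_clean = sum(1 for a, b in zip(labels, noisy) if a == 1 and b == 0)
flipped_to_buggy = sum(1 for a, b in zip(labels, noisy) if a == 0 and b == 1)
print(flipped_to_clean, flipped_to_buggy)
```

Only the training labels are corrupted in this sketch; the testing labels would be left untouched so that the measured performance reflects the true labels.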
Prediction models

  • change classification: the change between Rev n and Rev n+1 is classified as buggy or clean
  • file-level defect prediction: each file is classified as buggy or clean

Performance evaluation
§ 4 possible outcomes from prediction models
  § Classifying a buggy change as buggy ($n_{b \to b}$)
  § Classifying a buggy change as clean ($n_{b \to c}$)
  § Classifying a clean change as clean ($n_{c \to c}$)
  § Classifying a clean change as buggy ($n_{c \to b}$)

§ $\text{Precision} = \frac{n_{b \to b}}{n_{b \to b} + n_{c \to b}}$, $\text{Recall} = \frac{n_{b \to b}}{n_{b \to b} + n_{b \to c}}$

§ $\text{F-measure} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$
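
The three measures follow directly from the four outcome counts. A small sketch, with hypothetical counts and a function name chosen for illustration:

```python
def evaluate(n_bb, n_bc, n_cc, n_cb):
    """Precision, recall, and F-measure for the buggy class.

    n_bb: buggy classified as buggy    n_bc: buggy classified as clean
    n_cc: clean classified as clean    n_cb: clean classified as buggy
    (n_cc is accepted for completeness; none of the three measures uses it.)
    """
    precision = n_bb / (n_bb + n_cb) if (n_bb + n_cb) else 0.0
    recall = n_bb / (n_bb + n_bc) if (n_bb + n_bc) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical outcome counts for 100 changes.
p, r, f = evaluate(n_bb=40, n_bc=10, n_cc=30, n_cb=20)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.667 0.8 0.727
```

Note that the count of clean changes classified as clean does not enter any of the three formulas, which is why these measures focus on how well the buggy class is found.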
Subjects

change classification
  subject         # instances   % buggy   # features
  Columba         1,800         29.4%     17,411
  Eclipse (JDT)   659           10.1%     16,192
  Scarab          1,090         50.6%     5,710

file-level defect prediction
  subject         # instances   % buggy   # features
  SWT             1,485         44%       18
  Debug           1,065         24.7%     18

Experimental results: Columba

  [Figure: Buggy F-measure (y-axis, 0 to 1) vs. training set false negative (FN) & false positive (FP) rate (x-axis, 0 to 0.6); series: Columba, Dummy for Columba]

  Dummy baseline:
  1. Random guess (50% buggy, 50% clean)
  2. Columba’s defect rate is about 30%
  3. Precision = 0.3 and Recall = 0.5
  4. F-measure = 0.375 = (2*0.5*0.3)/(0.3+0.5)
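
The random-guess baseline on the Columba slide (defect rate about 30%, precision 0.3, recall 0.5) can be reproduced with a few lines. The function name and its default guess rate are assumptions for illustration.

```python
def random_guess_f_measure(defect_rate, guess_buggy_rate=0.5):
    """Expected buggy F-measure of a classifier that guesses at random."""
    precision = defect_rate       # items flagged at random are buggy at the base rate
    recall = guess_buggy_rate     # that share of the buggy items gets flagged
    return 2 * precision * recall / (precision + recall)

print(round(random_guess_f_measure(0.3), 3))  # 0.375
```

The same function gives a dummy baseline for any subject once its defect rate is known, which is what makes the "Dummy for ..." curves a useful floor for the plotted models.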
Eclipse (JDT)

  [Figure: Buggy F-measure (y-axis, 0 to 1) vs. training set false negative (FN) & false positive (FP) rate (x-axis, 0 to 0.6); series: Eclipse, Dummy for Eclipse]

Scarab

  [Figure: Buggy F-measure (y-axis, 0 to 1) vs. training set false negative (FN) & false positive (FP) rate (x-axis, 0 to 0.6); series: Scarab, Dummy for Scarab]

Eclipse (Debug)

  [Figure: Buggy F-measure (y-axis, 0 to 1) vs. training set false negative (FN) & false positive (FP) rate (x-axis, 0 to 0.6); series: Debug, Dummy for Debug]

Eclipse (SWT)

  [Figure: Buggy F-measure (y-axis, 0 to 1) vs. training set false negative (FN) & false positive (FP) rate (x-axis, 0 to 0.6); series: SWT, Dummy for SWT]

Q1: How resistant is a defect prediction model to noise?

  [Figure: Buggy F-measure (y-axis, 0 to 1) vs. training set false negative (FN) & false positive (FP) rate (x-axis, 0 to 0.6); series: SWT, Debug, Columba, Eclipse, Scarab]

  Annotation on the plot: 20~30%

Study questions

• Q2: How much noise could be detected/removed?

Detecting noise

  1 Removing buggy labels: False negative noise
  2 Adding buggy labels: False positive noise

  [Figure: the Original training set combined with False negative noise and False positive noise]

Detecting noise

  [Figure: Original training set with False positive noise and False negative noise; which instances are noise and which are clean?]

  Excerpt from the paper: "[...] However, it is very hard to get a golden set. In our approach, we carefully select high quality datasets and assume them to be the golden sets. We then add FPs and FNs intentionally to create a noise set. To add FPs and FNs, we randomly select instances in a golden set and artificially change their labels from buggy to clean or from clean to buggy, inspired by experiments in [4]."

  Figure 4. Creating biased training set
  "To make FN data sets (for RQ1), we randomly select n% buggy [...]"

  Closest list noise identification (CLNI) algorithm
  Figure 9. The pseudo-code of the CLNI [pseudo-code not legible in this transcript; only the final line, "return Aj", survives]
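
The CLNI pseudo-code itself is not recoverable from this transcript, so the following is not the authors' algorithm. It is a simplified nearest-neighbour noise filter in the spirit of a "closest list" approach: an instance is suspected when most of its closest neighbours carry the opposite label, and the pass is repeated until the suspected set stops changing. The parameter names `top_n`, `delta`, and `max_rounds`, their defaults, and the stopping rule are all assumptions.

```python
import math

def detect_noise(features, labels, top_n=3, delta=0.6, max_rounds=10):
    """Flag instances whose nearest neighbours mostly carry the other label.

    Returns the set of indices suspected to be mislabeled.
    """
    suspected = set()
    for _ in range(max_rounds):
        current = set()
        for i, (x, y) in enumerate(zip(features, labels)):
            # Rank the other instances by distance, skipping earlier suspects.
            neighbours = sorted(
                (math.dist(x, features[k]), k)
                for k in range(len(features))
                if k != i and k not in suspected
            )[:top_n]
            if not neighbours:
                continue
            differing = sum(1 for _, k in neighbours if labels[k] != y)
            if differing / len(neighbours) >= delta:
                current.add(i)
        if current == suspected:   # the noise set stopped changing
            break
        suspected = current
    return suspected

# Hypothetical data: two well-separated clusters, one flipped label (index 2).
features = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
print(detect_noise(features, labels))  # {2}
```

With feature vectors as large as those of the change-classification subjects, a plain Euclidean distance over all pairs would be costly, so a practical version would likely need feature selection or an index structure.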
Noise detection performance (noise level = 20%)

           Precision   Recall   F-measure
  Debug    0.681       0.871    0.764
  SWT      0.624       0.830    0.712

Noise detection performance

  [Figure: Precision, Recall, and F-measure of noise detection (y-axis, 0 to 1) vs. FP & FN noise level / noise rate (x-axis, 0.1 to 0.5)]

Study questions

• Q3: Could we remove noise to improve defect prediction performance?

Bug prediction using cleaned data

  [Figure: SWT F-measure (y-axis, 0 to 100) vs. noise level (x-axis, 0% to 45%); series: Noisy, Cleaned]

  Annotation on the plot: 76% F-measure with 45% noise

Study limitations

• All datasets are collected from open source
  projects
• The golden set used in this paper may not be
  perfect
• The noisy data simulations may not reflect
  the actual noise patterns in practice


Summary

• Prediction models (used in our experiments)
  are resistant to noise (up to 20~30%)
• Noise detection is promising
• Future work
  - Building oracle defect sets
  - Improving noise detection algorithms
  - Applying to more defect prediction models
    (regression, bugcache)


Contenu connexe

Plus de Sung Kim

REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...
REMI: Defect Prediction for Efficient API Testing (

ESEC/FSE 2015, Industria...Sung Kim
 
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Sung Kim
 
A Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesA Survey on Automatic Software Evolution Techniques
A Survey on Automatic Software Evolution TechniquesSung Kim
 
Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Crowd debugging (FSE 2015)
Crowd debugging (FSE 2015)Sung Kim
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSung Kim
 
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Sung Kim
 
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)
Automatically Generated Patches as Debugging Aids: A Human Study (FSE 2014)Sung Kim
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...Sung Kim
 
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)
CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)Sung Kim
 

Dealing with Noise in Defect Prediction

  • 1. Dealing with Noise in bug prediction Sunghun Kim, Hongyu Zhang, Rongxin Wu and Liang Gong The Hong Kong University of Science & Technology Tsinghua University
  • 2. Where are the bugs?
  • 3. Where are the bugs? Complex files! [Menzies et al.]
  • 4. Where are the bugs? Complex files! [Menzies et al.] Modified files! [Nagappan et al.]
  • 5. Where are the bugs? Complex files! [Menzies et al.] Modified files! [Nagappan et al.] Nearby other bugs! [Zimmermann et al.]
  • 6. Where are the bugs? Complex files! [Menzies et al.] Modified files! [Nagappan et al.] Nearby other bugs! [Zimmermann et al.] Previously fixed files [Hassan et al.]
  • 7. Prediction model: training instances (features + labels)
  • 8. Prediction model: training instances (features + labels), Learner
  • 9. Prediction model: training instances (features + labels), ?, Learner
  • 10. Prediction model: training instances (features + labels), ?, Learner
  • 11. Prediction model: training instances (features + labels), ?, Learner, Prediction
  • 12. Prediction model: training instances (features + labels), ?, Learner, Prediction
  • 13. Training on software evolution is key • Software features can be used to predict bugs • Defect labels obtained from software evolution • Supervised learning algorithms (Version Archive, Bug Database)
  • 14. Change classification. Kim, Whitehead Jr., Zhang: Classifying Software Changes: Clean or Buggy? (TSE 2008)
  • 15. Change classification: bug-introducing (“bad”) changes marked X
  • 16. Change classification: bug-introducing (“bad”) changes marked X; BUILD A LEARNER
  • 17. Change classification: bug-introducing (“bad”) changes marked X; BUILD A LEARNER; new change
  • 18. Change classification: bug-introducing (“bad”) changes marked X; BUILD A LEARNER; new change; PREDICT QUALITY
  • 19. Training Classifiers
      Historical changes (one feature vector per row):
        0 1 0 1 0 1 0 1 … 0 1
        0 0 0 1 0 1 0 1 … 0 0
        0 1 1 1 0 1 1 1 … 0 0
        0 1 0 3 0 0 0 1 … 0 1
        0 1 0 1 0 1 0 1 … 0 0
      § Machine learning techniques • Bayesian Network, SVM
  • 20. Source Repository: all commits C. Bug Database: all bugs B, fixed bugs Bf. Bird et al. “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 21. Source Repository: all commits C. Bug Database: all bugs B, fixed bugs Bf. Commits and fixed bugs are linked via log messages.
  • 22. Source Repository: all commits C. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages.
  • 23. Source Repository: all commits C, linked fixes Cfl. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages.
  • 24. Source Repository: all commits C, linked fixes Cfl, commits related but not linked. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages.
  • 25. Source Repository: all commits C, bug fixes Cf, linked fixes Cfl, commits related but not linked. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages.
  • 26. Source Repository: all commits C, bug fixes Cf, linked fixes Cfl, commits related but not linked. Bug Database: all bugs B, fixed bugs Bf, linked fixed bugs Bfl. Linked via log messages. Noise!
  • 27. Effect of training on superbiased data (Severity) [Bar chart: Bug Recall (0% to 100%) for models trained on all bugs, on biased data1, and on biased data2] Bird et al. “Fair and Balanced? Bias in Bug-Fix Datasets,” FSE 2009
  • 28. Effect of training on superbiased data (Severity) [Bar chart: Bug Recall (0% to 100%) for models trained on all bugs, on biased data1, and on biased data2]
  • 29. Effect of training on superbiased data (Severity) [Bar chart: Bug Recall (0% to 100%) for models trained on all bugs, on biased data1, and on biased data2]
  • 30. Effect of training on superbiased data (Severity) [Bar chart: Bug Recall (0% to 100%) for models trained on all bugs, on biased data1, and on biased data2] Bias in bug severity affects BugCache
  • 31. Are defect prediction models learned from noisy data reliable?
  • 32. Study questions • Q1: How resistant is a defect prediction model to noise? • Q2: How much noise can be detected and removed? • Q3: Can we remove noise to improve defect prediction performance?
  • 37. Study approach: Training, Bayes Net, Testing
  • 38. Making noisy training instances (Training, Testing)
  • 39. Making noisy training instances (Training, Testing): 1. Removing buggy labels: false negative noise
  • 40. Making noisy training instances (Training, Testing): 1. Removing buggy labels: false negative noise 2. Adding buggy labels: false positive noise
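The two noise types on this slide can be simulated by flipping labels in a clean training set. The following is a minimal sketch, not the authors' tooling: the function name `inject_label_noise`, the 0/1 label encoding (1 = buggy, 0 = clean), and the seeded random sampling are all assumptions made for illustration.

```python
import random


def inject_label_noise(labels, fn_rate, fp_rate, seed=0):
    """Return (noisy_labels, flipped_indices) for a list of 0/1 labels.

    Hypothetical helper: 1 means buggy, 0 means clean.
    fn_rate: fraction of buggy instances relabeled clean (false negative noise).
    fp_rate: fraction of clean instances relabeled buggy (false positive noise).
    """
    rng = random.Random(seed)
    buggy_idx = [i for i, y in enumerate(labels) if y == 1]
    clean_idx = [i for i, y in enumerate(labels) if y == 0]

    # Removing buggy labels creates false negatives.
    to_clean = rng.sample(buggy_idx, int(round(fn_rate * len(buggy_idx))))
    # Adding buggy labels creates false positives.
    to_buggy = rng.sample(clean_idx, int(round(fp_rate * len(clean_idx))))

    noisy = list(labels)  # leave the input untouched
    for i in to_clean:
        noisy[i] = 0
    for i in to_buggy:
        noisy[i] = 1
    return noisy, sorted(to_clean + to_buggy)


if __name__ == "__main__":
    golden = [1] * 10 + [0] * 10
    noisy, flipped = inject_label_noise(golden, fn_rate=0.2, fp_rate=0.3, seed=1)
    print(len(flipped), "labels flipped")
```

Only the training labels would be altered this way; the test set keeps its original labels so that the effect of noise on the learned model can be measured.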
  • 41. Prediction models: change classification (each change from Rev n to Rev n+1 is classified as buggy or clean)
  • 42. Prediction models: change classification (each change from Rev n to Rev n+1 is classified as buggy or clean); file-level defect prediction (each file is classified as buggy or clean)
  • 43. Performance evaluation § 4 possible outcomes from prediction models § Classifying a buggy change as buggy ($n_{b \to b}$) § Classifying a buggy change as clean ($n_{b \to c}$) § Classifying a clean change as clean ($n_{c \to c}$) § Classifying a clean change as buggy ($n_{c \to b}$) § Precision $= \frac{n_{b \to b}}{n_{b \to b} + n_{c \to b}}$, Recall $= \frac{n_{b \to b}}{n_{b \to b} + n_{b \to c}}$ § F-measure $= \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$
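The three measures on this slide follow directly from the four outcome counts. A minimal sketch, assuming the counts are available as plain integers; the function name and the convention of returning 0.0 when a denominator is zero are illustrative choices, not something the slides specify.

```python
def precision_recall_f(n_bb, n_bc, n_cc, n_cb):
    """Compute precision, recall and F-measure for the buggy class.

    n_bb: buggy classified as buggy   n_bc: buggy classified as clean
    n_cc: clean classified as clean   n_cb: clean classified as buggy
    n_cc is accepted for completeness but does not enter these measures.
    """
    predicted_buggy = n_bb + n_cb
    actual_buggy = n_bb + n_bc
    precision = n_bb / predicted_buggy if predicted_buggy else 0.0
    recall = n_bb / actual_buggy if actual_buggy else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    p, r, f = precision_recall_f(n_bb=30, n_bc=10, n_cc=50, n_cb=10)
    print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```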
  • 44. Subjects
      change classification
        subject         # instances   % buggy   # features
        Columba         1,800         29.4%     17,411
        Eclipse (JDT)   659           10.1%     16,192
        Scarab          1,090         50.6%     5,710
      file-level defect prediction
        subject         # instances   % buggy   # features
        SWT             1,485         44%       18
        Debug           1,065         24.7%     18
  • 45. Experimental results [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: Columba, Dummy for Columba]
  • 46. Columba [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: Columba, Dummy for Columba]
  • 47. Columba [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: Columba, Dummy for Columba] 1. Random guess (50% buggy, 50% clean) 2. Columba’s defect rate is about 30% 3. Precision = 0.3 and Recall = 0.5 4. F-measure = (2*0.5*0.3)/(0.3+0.5) = 0.375
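The dummy baseline arithmetic on this slide can be reproduced in a few lines. This sketch assumes, as the slide does, that a random guesser's precision equals the project's defect rate and its recall equals the fraction of instances it guesses buggy; the function name `dummy_f_measure` is hypothetical.

```python
def dummy_f_measure(defect_rate, guess_buggy_rate=0.5):
    """F-measure of a random guesser that labels a fixed share of instances buggy.

    Precision is the defect rate (a random pick is buggy that often);
    recall is the share of instances guessed buggy.
    """
    precision = defect_rate
    recall = guess_buggy_rate
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


if __name__ == "__main__":
    # Columba: defect rate of about 30%, 50/50 random guess.
    print(round(dummy_f_measure(0.3), 3))
```

A model trained on noisy data is only useful while its buggy F-measure stays above this baseline for the same project.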
  • 48. Columba [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: Columba, Dummy for Columba]
  • 49. Eclipse (JDT) [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: Eclipse, Dummy for Eclipse]
  • 50. Scarab [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: Scarab, Dummy for Scarab]
  • 51. Eclipse (Debug) [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: Debug, Dummy for Debug]
  • 52. Eclipse (SWT) [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: SWT, Dummy for SWT]
  • 53. Q1: How resistant is a defect prediction model to noise? [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: SWT, Debug, Columba, Eclipse, Scarab]
  • 54. Q1: How resistant is a defect prediction model to noise? [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: SWT, Debug, Columba, Eclipse, Scarab]
  • 55. Q1: How resistant is a defect prediction model to noise? [Plot: Buggy F-measure (0 to 1) vs. training set false negative (FN) & false positive (FP) rate (0 to 0.6); series: SWT, Debug, Columba, Eclipse, Scarab; annotation: 20~30%]
  • 56. Study questions • Q1: How resistant is a defect prediction model to noise? • Q2: How much noise can be detected and removed? • Q3: Can we remove noise to improve defect prediction performance?
  • 57. Detecting noise: original training set with 1. Removing buggy labels: false negative noise 2. Adding buggy labels: false positive noise
  • 58. Detecting noise: false negative noise and false positive noise in the original training set
  • 59. Detecting noise (original training set; clean vs. noise: false positive noise, false negative noise). Paper excerpt: “[…] However, it is very hard to get a golden set. In our approach, we carefully select high-quality datasets and assume them to be the golden sets. We then add FPs and FNs intentionally to create a noise set. To add FPs and FNs, we randomly select instances in a golden set and artificially change their labels from buggy to clean or from clean to buggy, inspired by experiments in [4].” Figure 4. Creating biased training set. “To make FN data sets (for RQ1), we randomly select n% buggy […]”
  • 60. Closest list noise identification. Figure 9. The pseudo-code of the CLNI algorithm [pseudo-code figure; only the final step, “return Aj”, is legible]
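The pseudo-code on this slide is not legible in the transcript, so the following is only a sketch of a nearest-neighbour noise detector in the spirit of a "closest list": an instance is flagged when most of its closest neighbours carry the opposite label, and the pass repeats until the flagged set stops changing. The function name `detect_noise`, the neighbourhood size `k`, the disagreement `threshold`, and the `stable` stopping ratio are all assumptions, not values from the source.

```python
import math


def detect_noise(features, labels, k=5, threshold=0.6, stable=0.99, max_iter=20):
    """Return the set of indices whose labels look noisy.

    Hypothetical sketch: for each instance, rank the other instances by
    Euclidean distance (skipping those flagged in the previous pass), and
    flag the instance if at least `threshold` of its k closest neighbours
    have a different label. Stop once successive flagged sets overlap by
    at least `stable`.
    """
    n = len(features)
    noise = set()
    for _ in range(max_iter):
        new_noise = set()
        for i in range(n):
            closest = sorted(
                (math.dist(features[i], features[j]), j)
                for j in range(n)
                if j != i and j not in noise
            )[:k]
            if not closest:
                continue
            disagree = sum(1 for _, j in closest if labels[j] != labels[i])
            if disagree / len(closest) >= threshold:
                new_noise.add(i)
        converged = (not new_noise) or (
            len(new_noise & noise) / len(new_noise) >= stable
        )
        noise = new_noise
        if converged:
            break
    return noise


if __name__ == "__main__":
    xs = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.2, 0.8),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5), (10.2, 10.8)]
    ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1]  # indices 4 and 10 are mislabeled
    print(sorted(detect_noise(xs, ys, k=3)))
```

Each pass compares every pair of instances, so this sketch is quadratic in the number of training instances and is meant for small illustrative datasets only.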
  • 61. Noise detection performance (noise level = 20%)
                Precision   Recall   F-measure
        Debug   0.681       0.871    0.764
        SWT     0.624       0.830    0.712
  • 62. Noise detection performance [Plot: Precision, Recall and F-measure (0 to 1) vs. FP & FN noise level (noise rate 0.1 to 0.5)]
  • 63. Study questions • Q1: How resistant is a defect prediction model to noise? • Q2: How much noise can be detected and removed? • Q3: Can we remove noise to improve defect prediction performance?
  • 64. Bug prediction using cleaned data [Chart: SWT F-measure (0 to 100) at noise levels 0%, 15%, 30%, 45%; series: Noisy, Cleaned]
  • 65. Bug prediction using cleaned data [Chart: SWT F-measure (0 to 100) at noise levels 0%, 15%, 30%, 45%; series: Noisy, Cleaned]
  • 66. Bug prediction using cleaned data [Chart: SWT F-measure (0 to 100) at noise levels 0%, 15%, 30%, 45%; series: Noisy, Cleaned] 76% F-measure with 45% noise
  • 67. Study limitations • All datasets are collected from open source projects • The golden set used in this paper may not be perfect • The noisy data simulations may not reflect the actual noise patterns in practice
  • 68. Summary • Prediction models (used in our experiments) are resistant to noise (up to 20~30%) • Noise detection is promising • Future work - Building oracle defect sets - Improving noise detection algorithms - Applying to more defect prediction models (regression, BugCache)