Bug Prediction & Analysis

      Marco D’Ambros
As users, we are used to bugs...
... and also as
    developers
But the perception in reverse engineering is different




       There are thousands of bugs
Prediction
Focus resources on bug-prone components
Theory: prove correlations with software metrics
Practice: rank components according to their bug-proneness
Release x -> Bug prediction
Classification: Class A will/won't have bugs
Ranking: Class A will have more bugs than class B
Correct?
[Diagram: evaluating a bug predictor]
Source: Svn / Cvs repository (check out)
Release x -> Bug prediction -> list of classes ranked by the prediction
Release x+1 -> Bug extraction (Bugzilla database) -> list of classes ranked by the number of actual bugs
Comparison of the two lists -> prediction performance
Per-class metrics shown in the example tables: loc, noc, wloc, nom, noa, fanin, fanout, mccabe, ona, pna
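[Diagram: linking bugs to classes]
Svn / Cvs repository -> check out -> System release -> Parsing -> FAMIX Class (with Attributes)
Svn / Cvs repository -> log -> Versioning system logs -> Parsing -> Commit comments
Bugzilla database -> Query -> Bug reports -> Parsing -> Bug
Class / File link: connects the versioning data to the FAMIX Class
Bug reference in the comment: connects a Commit to a Bug
Inferred link: connects a Bug to a FAMIX Class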
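The "bug reference in the comment" step can be sketched as follows. This is a minimal illustration, not the tooling used in the talk: the regular expression, the function names and the sample commits are all assumptions, and real projects follow varying commit-message conventions.

```python
import re

# Hypothetical pattern: matches e.g. "bug 12345", "Bug #12345", "issue: 12345".
BUG_REF = re.compile(r"(?:bug|fix(?:es|ed)?|issue)\s*[#:]?\s*(\d{3,7})", re.IGNORECASE)


def bug_ids_in_comment(comment):
    """Return the set of candidate bug ids mentioned in a commit comment."""
    return {int(m) for m in BUG_REF.findall(comment)}


def link_bugs_to_files(commits, known_bug_ids):
    """Map each file to the known bugs referenced by commits touching it.

    Candidate ids that are not in the bug database are discarded, since a
    number in a comment is not necessarily a bug id.
    """
    links = {}
    for comment, files in commits:
        ids = bug_ids_in_comment(comment) & known_bug_ids
        for path in files:
            links.setdefault(path, set()).update(ids)
    return links


# Made-up sample data: (commit comment, files touched by the commit).
commits = [
    ("Fixed bug 12345: NPE in parser", ["Parser.java"]),
    ("Refactoring, no functional change", ["Parser.java", "Lexer.java"]),
    ("fix for Bug #23456 and bug 99999", ["Lexer.java"]),
]
known = {12345, 23456}  # ids assumed to exist in the bug database
links = link_bugs_to_files(commits, known)
print(links)
```

Checking candidate ids against the bug database is the design choice that matters here: without it, any number in a comment would create a link.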
Classification: precision & recall
  Buggy classes vs. classes predicted as buggy
  TP: buggy and predicted as buggy
  FN: buggy but not predicted as buggy
  FP: predicted as buggy but not buggy
  Precision: how small FP is
  Recall: how small FN is
Ranking: Spearman correlation coefficient
  Observed ranking:  Class D, Class A, Class E, ...
  Predicted ranking: Class E, Class A, Class D, ...
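The two evaluation measures can be sketched in a few lines. This is an illustrative implementation under assumptions: the function names and the toy data are invented here, and Spearman's coefficient is computed as the Pearson correlation of average ranks.

```python
def precision_recall(buggy, predicted):
    """Precision and recall from two sets of class names."""
    tp = len(buggy & predicted)   # buggy and predicted as buggy
    fp = len(predicted - buggy)   # predicted as buggy, actually clean
    fn = len(buggy - predicted)   # buggy, but missed by the predictor
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Toy data (made up): class names and per-class bug counts.
p, r = precision_recall({"A", "D", "E"}, {"A", "E", "F"})
rho = spearman([5, 1, 9, 7], [4, 2, 6, 8])  # observed vs. predicted bugs
print(p, r, rho)
```

In the toy data, two of the three predicted classes are buggy and two of the three buggy classes are found, so precision and recall are both 2/3; the two rankings differ only by one swap, giving a Spearman coefficient of 0.8.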
Approaches are based on:


History                   Metrics
Predicting Defects
    for Eclipse

Thomas Zimmermann
   Rahul Premraj
   Andreas Zeller

  Saarland University
Experimental settings

Release    #Files   #Packages
2.0        6740     376
2.1        7900     433
3.0        6614     429

Pre-release defects: 6 months before release
Post-release defects: 6 months after release
Classification of classes
Using logistic regression models
max precision 0.68
max recall 0.38
Ranking classes
[Bar chart, scale 0 to 1.00]
McCabe complexity         0.401
Method LOC                0.405
Total LOC                 0.42
Linear regression model   0.416
Pre-release defects       0.907
Conclusion
Past defects are the predictor for future defects
Software metrics correlate with defects but are not usable in practice
Mining metrics to predict
  component failures

 Nachiappan Nagappan
     Thomas Ball
  Microsoft Research
      Andreas Zeller
    Saarland University
Experimental settings

Project                       Code size
Internet Explorer 6           511 KLOC
DirectX                       306 KLOC
Process messaging component   147 KLOC
NetMeeting                    109 KLOC
IIS Core                      37 KLOC

Granularity level: module (a binary file within Windows), i.e. a set of classes
Q1   Do complexity metrics correlate with defects?
[Bar chart, scale 0 to 1.00, projects A to E: maximum correlation; percentage of correlated metrics]
Q2   Is there a unique set of metrics that predicts
     defects in all projects?
Q3   Can we combine metrics to predict defects?
Multicollinearity of metrics -> Principal Component Analysis -> Linear/logistic regression model
[Bar chart, scale 0 to 1.00, projects A to E: Spearman/Pearson correlation; percentage of splits which correlate. Annotation on the chart: "Too few samples"]
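The pipeline on this slide (correlated metrics, then principal components, then a regression model) can be sketched as below. This is a simplified illustration, not the authors' procedure: it keeps only the first principal component, approximates it by power iteration instead of a full eigendecomposition, and fits an ordinary least-squares line. The metric values and defect counts are made up.

```python
def standardize(rows):
    """Center every column and scale it to unit sample variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) ** 0.5
            for j in range(m)]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]


def first_component(z, iterations=200):
    """Approximate the leading principal component by power iteration."""
    n, m = len(z), len(z[0])
    # Covariance matrix of the standardized metrics.
    cov = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(m)]
           for a in range(m)]
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# Made-up data: one row per module with (loc, mccabe, fanout).
metrics = [[100, 10, 3], [200, 22, 5], [300, 29, 8], [400, 41, 9], [500, 52, 12]]
defects = [1, 2, 4, 5, 7]

z = standardize(metrics)
component = first_component(z)
# Project every module onto the component: one uncorrelated predictor.
scores = [sum(c * x for c, x in zip(component, row)) for row in z]
slope, intercept = fit_line(scores, defects)
predicted = [slope * s + intercept for s in scores]
print(component, slope, intercept)
```

Because the three toy metrics grow together, regressing on them directly would suffer from multicollinearity; projecting them onto one component replaces them with a single combined predictor.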
Q4   Are predictors obtained from one project
     applicable to other projects?
Conclusion
Metrics can be used to predict defects
but
they must be validated on the history
Improving Defect Prediction
Using Temporal Features and
     Non Linear Models

    Abraham Bernstein
    Jayalath Ekanayake
      Martin Pinzger

    University of Zurich
Experimental settings

                           Plugin     #Years   #Files
                          updateui      7       757
                         updatecore     7       459
                           search      6.5      540
                           pdeui       6.5     1621
                          pdebuild      6       198
                          compare      6.5      315


                        Non linear models based on
                            21 historical metrics
                                      +
                                    LOC
Classification of files
Using decision tree learners

All files: A
Correctly classified files: CC
Accuracy = Size(CC) / Size(A)

Best predictor (7 metrics): accuracy 99.16%
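The accuracy measure above is the fraction of files whose predicted label matches the actual one. A minimal sketch, with made-up file names and labels:

```python
def accuracy(actual, predicted):
    """Size(CC) / Size(A): share of files whose label is predicted correctly."""
    correct = sum(1 for name in actual if predicted.get(name) == actual[name])
    return correct / len(actual)


# Toy data (made up): True means the file is buggy.
actual = {"a.java": True, "b.java": False, "c.java": False, "d.java": True}
predicted = {"a.java": True, "b.java": False, "c.java": True, "d.java": True}
score = accuracy(actual, predicted)
print(score)  # 3 of 4 files are classified correctly
```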
Ranking of files
Using the M5 tree regression algorithm

Spearman correlation [bar chart, scale 0 to 0.970]
Predictor based on 7 metrics        0.966
Zimmermann’s pre-release defects    0.907
Conclusion


       Defect prediction can be improved with:


 Historical information      Non-linear function
Predicting Faults Using the
Complexity of Code Changes



     Ahmed E. Hassan

    Queen’s University
Complexity = Entropy
The more changes are distributed, the higher the entropy

The intuition is that one change affecting one file only is simpler than one affecting many different files, as the developer who has to perform the change has to keep track of all of them. Hassan proposed to use Shannon Entropy, defined as

$$H_n(P) = -\sum_{k=1}^{n} p_k \log_2 p_k \qquad (1)$$

where $p_k$ is the probability that the file k changes during the considered time interval. Figure 4 shows an example with three files and three time intervals.

[Figure: changes to File A, File B and File C over time, in three intervals t1, t2, t3 of 2 weeks each]
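The entropy of one time interval can be computed directly from per-file change counts. In this sketch the probability of a file is estimated as its share of the changes in the interval, which is an assumption of the example, and the counts are made up for illustration.

```python
import math


def change_entropy(change_counts):
    """Shannon entropy (in bits) of how changes are spread across files."""
    total = sum(change_counts.values())
    # Files with no changes contribute nothing and are skipped.
    probabilities = [c / total for c in change_counts.values() if c > 0]
    return -sum(p * math.log2(p) for p in probabilities)


# Made-up counts for one interval: 4 changes spread over three files.
spread = change_entropy({"File A": 2, "File B": 1, "File C": 1})
# All 4 changes hit a single file: nothing to keep track of.
focused = change_entropy({"File A": 4, "File B": 0, "File C": 0})
print(spread, focused)
```

With the changes spread over three files the entropy is 1.5 bits; when every change hits the same file it is 0, matching the intuition that scattered changes are more complex.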
Example: $H_n(P) = -\left(\tfrac{2}{4} \log_2 \tfrac{2}{4} + \dots\right)$
   File A
 he considered time1interval. Figure 4 shows an example
                     4 time intervals.
with three files and three
   File B

   File C
 File A
                                                               Time
 File B         t1 (2 weeks)       t2 (2 weeks)      t3 (2 weeks)

 FileHn(P)
      C      = - 2 4 * log2 4 - 1 4 * log2 4
                          2              1
ntuition is that one change affecting one file only is simpler
   Complexity = Entropy
 han one affecting many different files, as the developer who
has to more changeschange has to keep trackthe entropy
   The perform the are distributed the higher of all them.
Hassan proposed to use Shannon Entropy defined as
                               Shannon Entropy
                                      n
                                      X
                        Hn (P ) = −         pk ∗ log2 pk               (1)
            4              2
                                      k=1

                      4
  where pk is the probability that the file k changes during
   File A
 he considered time1interval. Figure 4 shows an example
                     4 time intervals.
with three files and three
   File B
                1
   File C
 File A             4
                                                                Time
 File B         t1 (2 weeks)       t2 (2 weeks)       t3 (2 weeks)

 FileHn(P)
      C                                  1               1
             = - 2 4 * log2 4 - 1 4 * log2 4 - 1 4 * log 2 4
                          2
ntuition is that one change affecting one file only is simpler
   Complexity = Entropy
 han one affecting many different files, as the developer who
has to more changeschange has to keep trackthe entropy
   The perform the are distributed the higher of all them.
Hassan proposed to use Shannon Entropy defined as
                               Shannon Entropy
                                      n
                                      X
                        Hn (P ) = −         pk ∗ log2 pk               (1)
            4              2
                                      k=1

                      4
  where pk is the probability that the file k changes during
   File A
 he considered time1interval. Figure 4 shows an example
                     4 time intervals.
with three files and three
   File B
                1
   File C
 File A             4
                                                                Time
 File B         t1 (2 weeks)       t2 (2 weeks)       t3 (2 weeks)

 FileHn(P)
      C                                  1               1
             = - 2 4 * log2 4 - 1 4 * log2 4 - 1 4 * log 2 4 = 1
                          2
ntuition is that one change affecting one file only is simpler
   Complexity = Entropy
 han one affecting many different files, as the developer who
has to more changeschange has to keep trackthe entropy
   The perform the are distributed the higher of all them.
Hassan proposed to use Shannon Entropy defined as
                           Shannon Entropy
                                 n
                                 X
                 Hn (P ) = −           pk ∗ log2 pk               (1)
                                 k=1

  where pk is the probability that H > 1? k changes during
                  H=1              the file
   File A
 he considered time interval. Figure 4 shows an example
with three files and three time intervals.
   File B

   File C
 File A
                                                           Time
 File B     t1 (2 weeks)       t2 (2 weeks)      t3 (2 weeks)

 File C
ntuition is that one change affecting one file only is simpler
   Complexity = Entropy
 han one affecting many different files, as the developer who
has to more changeschange has to keep trackthe entropy
   The perform the are distributed the higher of all them.
Hassan proposed to use Shannon Entropy defined as
                           Shannon Entropy
                                 n
                                 X
                 Hn (P ) = −           pk ∗ log2 pk               (1)
                                 k=1

                  H=1                          H > 1?
  where pk is the probability that the file k changes during
   File A
 he considered time interval. Figure 4 shows an example
with three files and three time intervals.
   File B

   File C
 File A
                                                           Time
 File B     t1 (2 weeks)       t2 (2 weeks)      t3 (2 weeks)

 File C
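A minimal sketch of Equation (1), assuming the probabilities p_k are estimated as each file's share of the changes in the interval. The function name `change_entropy` and the assignment of the two changes to file A are illustrative assumptions, not part of the slides.

```python
import math
from collections import Counter


def change_entropy(changed_files):
    """Shannon entropy (base 2) of the changes in one time interval.

    changed_files: one entry per change, naming the file it touched.
    """
    counts = Counter(changed_files)
    total = sum(counts.values())
    # p_k is estimated as (changes to file k) / (all changes in the interval).
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical interval with 4 changes: 2 to file A, 1 to B, 1 to C.
print(change_entropy(["A", "A", "B", "C"]))  # 1.5
# All changes hit the same file: nothing is scattered, so the entropy is 0.
print(change_entropy(["A", "A", "A"]))  # 0.0
```

With n files the value is at most log2(n), reached when the changes are spread evenly over all files.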
History of Complexity Metric (HCM)

To use the entropy of code changes as a bug predictor, Hassan defined the History of Complexity Metric (HCM) of a file j as

    HCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} HCPF_i(j)        (3)

where {a,..,b} is a set of evolution periods and HCPF is defined as

    HCPF_i(j) = \begin{cases} c_{ij} * H_i, & j \in F_i \\ 0, & \text{otherwise} \end{cases}        (4)

where i is a period with entropy H_i, F_i is the set of files modified in the period i and j is a file belonging to F_i. According to the definition of c_{ij}, there are three types of HCM, including:

  (1) HCM: c_{ij} = 1, every file modified in the considered period i gets the entropy of the system in the considered time interval. (Each file gets the entropy of the system.)
  (2) WHCM: c_{ij} = p_j, each modified file gets the entropy of the system weighted with the probability of file j being modified. (Each file is weighted with its probability.)
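Equations (3) and (4) can be sketched as follows, assuming each period is given as a list with one file name per change and p_j is estimated as the file's share of the changes in that period. The names `period_entropy`, `hcm` and the sample `periods` are hypothetical.

```python
import math
from collections import Counter


def period_entropy(counts):
    """Shannon entropy (base 2) of one period, given change counts per file."""
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def hcm(periods, weighted=False):
    """Sum HCPF_i(j) over all periods i, for every file j.

    weighted=False uses c_ij = 1 (HCM); weighted=True uses c_ij = p_j (WHCM).
    """
    scores = {}
    for changes in periods:
        counts = Counter(changes)
        if not counts:
            continue  # no file was modified, so every HCPF_i(j) is 0
        h_i = period_entropy(counts)
        total = sum(counts.values())
        for file_j, n_changes in counts.items():  # file_j ranges over F_i
            c_ij = n_changes / total if weighted else 1.0
            scores[file_j] = scores.get(file_j, 0.0) + c_ij * h_i
    return scores


# Two hypothetical periods with entropies 1.5 and 1.0.
periods = [["A", "A", "B", "C"], ["A", "B"]]
print(hcm(periods))                 # {'A': 2.5, 'B': 2.5, 'C': 1.5}
print(hcm(periods, weighted=True))  # {'A': 1.25, 'B': 0.875, 'C': 0.375}
```

Files that were not modified in a period receive nothing for that period, which is the "0, otherwise" branch of Equation (4).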
HCM with decay factors

In EDHCM (Exponentially Decayed HCM), entropies for earlier periods of time, i.e., earlier modifications, have their contribution reduced exponentially over time, modelling an exponential decay model. EDHCM was introduced by Hassan. Similarly, LDHCM (Linearly Decayed) and LGDHCM (LoGarithmically Decayed) have their contributions reduced over time in a respectively linear and logarithmic fashion. Both are novel. The definitions of the variants follow:

    EDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{e^{\phi_1 * (|\{a,..,b\}| - i)}}

    LDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_2 * (|\{a,..,b\}| + 1 - i)}

    LGDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_3 * \ln(|\{a,..,b\}| + 1.01 - i)}

where \phi_1, \phi_2 and \phi_3 are the decay factors.
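The three decayed variants differ only in the divisor applied to each period's contribution. The sketch below assumes the periods are numbered i = 1..n from oldest to newest, so that |{a,..,b}| = n. The function name `decayed_hcm` and the `kind` labels are illustrative.

```python
import math


def decayed_hcm(hcpf, phi, kind="exp"):
    """Decayed HCM of one file.

    hcpf: HCPF_i(j) values for the file, oldest period first (i = 1..n).
    phi:  decay factor (phi_1, phi_2 or phi_3 depending on kind).
    kind: "exp" (EDHCM), "lin" (LDHCM) or "log" (LGDHCM).
    """
    n = len(hcpf)
    total = 0.0
    for i, value in enumerate(hcpf, start=1):
        if kind == "exp":
            divisor = math.exp(phi * (n - i))
        elif kind == "lin":
            divisor = phi * (n + 1 - i)
        elif kind == "log":
            divisor = phi * math.log(n + 1.01 - i)
        else:
            raise ValueError(f"unknown decay kind: {kind}")
        total += value / divisor
    return total


# Hypothetical file with HCPF 1.5 in the older period and 1.0 in the newer one.
print(decayed_hcm([1.5, 1.0], phi=1.0, kind="lin"))  # 1.75
```

In all three variants the divisor grows as i moves away from the most recent period, so older periods contribute less.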
Experimental settings

    System      Start date    #Subsystems
    NetBSD      March 1993    235
    FreeBSD     June 1993     152
    OpenBSD     Oct 1995      265
    Postgres    July 1996     280
    KDE         April 1997    108
    KOffice     April 1998    158

    Entropy metrics
    Number of past modifications
    Number of past defects

    Subsystem level
Model fitting in terms of R^2

[Bar chart: R^2 (axis from 0 to 0.6) of the models based on past defects, past changes, HCM, WHCM and EDHCM, for NetBSD, FreeBSD, OpenBSD, Postgres, KDE and KOffice.]
Prediction error
Number of past changes vs Entropy

[Bar chart per system (NetBSD, FreeBSD, OpenBSD, Postgres, KDE, KOffice); axis from 0 to 37.5; series: #Changes - WHCM (%) and #Changes - EDHCM (%).]
Prediction error
Number of past defects vs Entropy

[Bar chart per system (NetBSD, FreeBSD, OpenBSD, Postgres, KDE, KOffice); axis from -20.0 to 40.0; series: #Defects - WHCM (%) and #Defects - EDHCM (%).]
Conclusion

Models based on the entropy of changes are better defect predictors than the number of past changes or defects.

A complex code change process negatively affects its product, the software system.
Epilogue
Defect prediction research has been active for several years.

A large number of scientific papers have been published.

We can predict defects, but results still have limited practical usability.

Predicting bugs is very difficult because developing code is a human activity.
A human activity influenced by too many factors

FOCUS
    How complex was the piece of code?
    How tested?
    How experienced was the developer?

No data yet
    How tired was the developer?
    How integrated was the developer in the team?
    Did he like his job?
Analysis
Detect the critical bugs
        properties of components
         number of bugs
Detect the critical components

        number of bugs

       properties of bugs
bugzero, bugzilla, census, customerfirst, defect-agent,
extraview-bug-tracker, fast-bugtrack, fogbugz, gnats,
ibm-rational-clearquest, ictracker, issue-organizer,
issuenet-intercept, issueview, jira, legendsoft-spots, mantis,
new-fire, omnitracker, pointinsight, pr-tracker, problemtracker,
quickbugs, radar, razor, rmtrack-bug-tracking
4   facts
    about bugs
Bugs are differently
     harmful

                         Blocker

                         Critical

                          Major

                         Normal

                          Minor

                          Trivial

                       Enhancement
Bugzilla is used to report bugs and change requests.
Bugs are
 graphs
Bugs evolve
An ideal bug life cycle

[State diagram with the states Unconfirmed, New, Assigned, Resolved, Verified and Closed.]
A bit less ideal

[State diagram with the states Unconfirmed, New, Assigned, Resolved, Verified and Closed, plus an additional Reopened state.]
The reality

[State diagram with the states Unconfirmed, New, Assigned, Resolved, Verified, Closed and Reopened.]
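The difference between the ideal life cycle and the reality can be checked mechanically on a bug's status history. The sketch below is only an illustration: the transitions are not recoverable from the slides, so the linear order in `IDEAL_PATH` is an assumption, and both function names are hypothetical.

```python
# Assumed order of the ideal life cycle; the slides only list the states.
IDEAL_PATH = ["Unconfirmed", "New", "Assigned", "Resolved", "Verified", "Closed"]


def follows_ideal_path(statuses):
    """True if the statuses move strictly forward along IDEAL_PATH.

    Skipping a state is tolerated; going back or visiting any other
    state (for example Reopened) is not.
    """
    if any(s not in IDEAL_PATH for s in statuses):
        return False
    positions = [IDEAL_PATH.index(s) for s in statuses]
    return all(a < b for a, b in zip(positions, positions[1:]))


def count_reopenings(statuses):
    """Number of times the bug entered the Reopened state."""
    return sum(1 for s in statuses if s == "Reopened")


print(follows_ideal_path(["New", "Assigned", "Resolved", "Closed"]))  # True
print(follows_ideal_path(["Assigned", "Resolved", "Reopened", "New"]))  # False
print(count_reopenings(["Assigned", "Resolved", "Reopened", "New"]))  # 1
```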
All bug properties can change over time

Bug
    Problem: id, description, product, component
    Criticality: severity, priority
    Involved people: assignedTo, reporter, qa
    State: status, resolution
    ...
[Diagram: two snapshots of the same bug connected by an Activity on the assignedTo field. The left snapshot is labelled steve, the right snapshot mike, and the activity box reads "AssignedTo: steve john".]

Bug history

[Diagram: a bug history drawn as a sequence of bug snapshots, each with its problem, criticality, involved people and state.]
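A bug history of this kind can be modelled as an initial snapshot plus a list of activities, each changing one property. The sketch below assumes that representation. The class and function names, the snake_case field names and all sample values are hypothetical; the fields themselves follow the bug model listed on the slide.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Bug:
    id: int
    description: str
    product: str
    component: str
    severity: str
    priority: str
    assigned_to: str
    reporter: str
    qa: str
    status: str
    resolution: str


@dataclass(frozen=True)
class Activity:
    when: date
    field: str      # name of the Bug attribute that changed
    new_value: str


def history(initial, activities):
    """Return the snapshots of a bug: the initial one plus one per activity."""
    snapshots = [initial]
    for act in sorted(activities, key=lambda a: a.when):
        # replace() copies the previous snapshot with one field changed.
        snapshots.append(replace(snapshots[-1], **{act.field: act.new_value}))
    return snapshots


bug = Bug(1, "crash on startup", "Browser", "Networking", "Major", "P2",
          "steve", "john", "qa-team", "New", "")
activities = [
    Activity(date(2003, 1, 20), "status", "Assigned"),
    Activity(date(2003, 1, 10), "assigned_to", "mike"),
]
snapshots = history(bug, activities)
print([(s.assigned_to, s.status) for s in snapshots])
# [('steve', 'New'), ('mike', 'New'), ('mike', 'Assigned')]
```

Keeping every snapshot, rather than only the latest state, is what makes questions about a bug's evolution answerable.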
Are there many activities?
How long do they live?

    Time period    Sep 1998 - Apr 2003
    #Bugs          255'302
    #Activities    2'706'201
Number of activities

[Histogram: percentage of bugs (y axis, 0% to 30%) per number of activities; bins: 0, 1-3, 4-5, 6-10, 11-20, 21-30, > 30.]
Lifetime (reported - last activity)

[Histogram: percentage of bugs (y axis, 0% to 40%) per lifetime; bins: 12 Hours, 1 Day, 1 Week, 1 Month, 6 Months, 1 Year, 2 Years, More. An annotation marks a group of bins as > 50%.]
Bugs have long and intense lives
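The two distributions above amount to binning each bug by its number of activities and by its lifetime. A minimal sketch, assuming each bin's upper bound is inclusive and approximating a month as 30 days and six months as 182 days. The bin labels come from the chart axes; the function names are hypothetical.

```python
from collections import Counter
from datetime import timedelta

# (inclusive upper bound, label) pairs taken from the chart axes.
ACTIVITY_BINS = [(0, "0"), (3, "1-3"), (5, "4-5"), (10, "6-10"),
                 (20, "11-20"), (30, "21-30")]

LIFETIME_BINS = [
    (timedelta(hours=12), "12 Hours"),
    (timedelta(days=1), "1 Day"),
    (timedelta(weeks=1), "1 Week"),
    (timedelta(days=30), "1 Month"),     # assumption: 1 month = 30 days
    (timedelta(days=182), "6 Months"),   # assumption: 6 months = 182 days
    (timedelta(days=365), "1 Year"),
    (timedelta(days=730), "2 Years"),
]


def activity_bin(n_activities):
    for upper, label in ACTIVITY_BINS:
        if n_activities <= upper:
            return label
    return "> 30"


def lifetime_bin(lifetime):
    """lifetime: time between the report and the last activity."""
    for upper, label in LIFETIME_BINS:
        if lifetime <= upper:
            return label
    return "More"


def percentages(labels):
    """Share of bugs per bin, in percent."""
    counts = Counter(labels)
    return {label: 100.0 * c / len(labels) for label, c in counts.items()}


print(percentages([activity_bin(n) for n in [0, 0, 2, 45]]))
# {'0': 50.0, '1-3': 25.0, '> 30': 25.0}
```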
4 facts about bugs

    Bugs are differently harmful
    Bugs are graphs
    Bugs evolve
    Bugs have long and intense lives
There is a need for analyzing bug repositories

Analyzing bugs as evolving entities
“A Bug’s Life”
Visualizing a Bug Database

     Marco D’Ambros
      Michele Lanza
      Martin Pinzger
System radiography view

"Where (in the system and in its history) are the open bugs located?"

Visualization principle
• System decomposition on the y axis
• Product :: Component
• (x, y) : (time, component)
• Color: # open bugs

[Figure: a grid with time intervals on the x axis and components on the y axis, grouped by product (Component 1 and Component 2 of Product A, then Product B); each cell is colored according to its number of bugs.]
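The data behind such a view is a matrix of open-bug counts indexed by component and time interval. The sketch below assumes each bug is reduced to a (product, component, opened, closed) tuple and that intervals have a fixed length in days. The function name, the tuple layout, the interval length and the sample bugs are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def radiography_matrix(bugs, start, end, interval_days=30):
    """Count open bugs per ("Product :: Component", time interval).

    bugs: iterable of (product, component, opened, closed) tuples, where
          closed is a date, or None if the bug is still open.
    Returns {row_label: [open-bug count for each interval]}.
    """
    # Number of intervals needed to cover [start, end), rounded up.
    n_intervals = ((end - start).days + interval_days - 1) // interval_days
    matrix = defaultdict(lambda: [0] * n_intervals)
    for product, component, opened, closed in bugs:
        row = matrix[f"{product} :: {component}"]
        for k in range(n_intervals):
            lo = start + timedelta(days=k * interval_days)
            hi = lo + timedelta(days=interval_days)
            # Open in the interval: reported before it ends and
            # not closed before it begins.
            if opened < hi and (closed is None or closed >= lo):
                row[k] += 1
    return dict(matrix)


bugs = [
    ("Browser", "Networking", date(2003, 1, 10), date(2003, 1, 20)),
    ("Browser", "Networking", date(2003, 2, 5), None),
    ("Mailnews", "Composition", date(2002, 12, 1), None),
]
print(radiography_matrix(bugs, date(2003, 1, 1), date(2003, 3, 2)))
# {'Browser :: Networking': [1, 1], 'Mailnews :: Composition': [1, 1]}
```

Each list is one row of the view; mapping the counts to a color scale gives the picture described in the principle.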
Mozilla example [Sep '98 - Apr '03]

[Figure: system radiography view of Mozilla, with the Browser and Mailnews products highlighted.]
The Bug Watch View

"How are bugs characterized with respect to their history?"

Visualization principle

[Figure: one bug drawn along a time axis, from Beginning: 10/19/1999 to End: 10/16/2001.]

• 3 Layers
    • Status

          Status      From        To
          Assigned    10/19/99    12/21/99
          Resolved    12/21/99    1/31/00
          Reopened    1/31/00     2/6/00
          New         2/6/00      6/5/00
          ...         ...         ...

    • Activity
    • Severity
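The status layer is driven by how long the bug stayed in each status. A minimal sketch that derives those durations from the status table above, assuming the dates are in month/day/two-digit-year form. The function name `status_durations` is hypothetical.

```python
from datetime import datetime

# Rows of the status table shown on the slide.
STATUS_ROWS = [
    ("Assigned", "10/19/99", "12/21/99"),
    ("Resolved", "12/21/99", "1/31/00"),
    ("Reopened", "1/31/00", "2/6/00"),
    ("New", "2/6/00", "6/5/00"),
]


def status_durations(rows, fmt="%m/%d/%y"):
    """Return (status, days spent in that status) for each row."""
    result = []
    for status, start, end in rows:
        delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
        result.append((status, delta.days))
    return result


print(status_durations(STATUS_ROWS))
# [('Assigned', 63), ('Resolved', 41), ('Reopened', 6), ('New', 120)]
```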
Examples from Mozilla

Browser :: Networking [Nov '02 - Apr '03]

[Figure: Bug Watch views of the bugs in Browser :: Networking, with two bugs called out.]

• Reopened 4 times; developer in charge of fixing it changed 6 times; many people added to the CC
• One status but many activities (addition of CC)
Conclusion

    Analyzing a bug database


                         Provides useful insights into
                            a software system

                         Helps in detecting the
                          most harmful bugs
Epilogue
We are just touching
    the surface




The analysis of bug
repositories is still a
  very open field

Bug Prediction and Analysis

  • 1. Bug Prediction & Analysis Marco D’Ambros
  • 4. As users, we are used to bugs...
  • 5. ... and also as developers
  • 6. But the perception in reverse engineering is different
  • 7. But the perception in reverse engineering is different There are thousands of bugs
  • 9. Focus resources on bug-prone components. Theory: prove correlations with software metrics. Practice: rank components according to their bug-proneness.
  • 10-11. Release x → Bug prediction. Classification: Class A will/won't have bugs. Ranking: Class A will have more bugs than class B. Correct?
  • 12. Release x Bug prediction
  • 13-15. Release x is checked out from the Svn / Cvs repository and fed to the bug prediction, which yields a list of classes ranked by the prediction. Bug extraction from the Bugzilla database for Release x+1 yields a list of classes ranked by the number of actual bugs. Comparing the two lists gives the prediction performance. [Figure: tables of per-class metric values with columns mccabe, fanout, fanin, wloc, nom, noa, ona, pna, noc, loc]
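  • 16. Data extraction. The system release is checked out from the Svn / Cvs repository and parsed into a FAMIX model (classes and attributes). The versioning system logs are parsed into commit comments, which link to classes / files. Bug reports are obtained by querying the Bugzilla database and parsing the result. A bug reference in the commit comment gives the inferred link between a bug and a class / file.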
  • 17-21. Evaluation. Classification: precision & recall, computed from the overlap between the buggy classes and the classes predicted as buggy (TP: in both sets; FN: buggy but not predicted; FP: predicted but not buggy). Precision tells how small FP is; recall tells how small FN is. Ranking: Spearman correlation coefficient between the predicted and the observed ranking of the classes (for example, predicted: Class D, Class A, Class E, ...; observed: Class E, Class A, Class D, ...).
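The two evaluation measures named on these slides can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names are hypothetical, and the class names and numbers in the demo are invented for the example, not results from the talk. Spearman's coefficient is computed here as the Pearson correlation of the ranks, with ties receiving their average rank.

```python
import math


def precision_recall(buggy, predicted):
    """Precision and recall of a set of classes predicted as buggy."""
    buggy, predicted = set(buggy), set(predicted)
    tp = len(buggy & predicted)   # buggy and predicted as buggy
    fp = len(predicted - buggy)   # predicted as buggy, but clean
    fn = len(buggy - predicted)   # buggy, but missed by the prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_ranks(values):
    """Rank values from 1 (smallest) upward; ties get their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ValueError if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Invented data: predicted bug-proneness and observed bug counts per class.
    predicted_score = {"A": 0.9, "B": 0.2, "C": 0.5, "D": 0.7, "E": 0.1}
    observed_bugs = {"A": 5, "B": 0, "C": 1, "D": 3, "E": 2}
    names = sorted(predicted_score)
    print(precision_recall({"A", "C", "D", "E"}, {"A", "B", "C", "D"}))
    print(spearman([predicted_score[n] for n in names],
                   [observed_bugs[n] for n in names]))
```

Classification only needs the two sets, while ranking needs a score per class on both sides, which is why the two tasks are evaluated differently.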
  • 22. Approaches are based on: History, Metrics
  • 23. Predicting Defects for Eclipse Thomas Zimmermann Rahul Premraj Andreas Zeller  Saarland University
  • 24. Experimental settings (Release: #Files, #Packages): 2.0: 6740, 376; 2.1: 7900, 433; 3.0: 6614, 429. Pre-release and post-release defects: 6 months before/after release
  • 25. Classification of classes using logistic regression models: max recall 0.38, max precision 0.68
  • 26-27. Ranking classes (bar chart, scale 0 to 1.00): McCabe complexity 0.401; Method LOC 0.405; Total LOC 0.42; Linear regression model 0.416; Pre-release defects 0.907
  • 28-29. Conclusion: Software metrics correlate with defects but are not usable in practice. Past defects are the predictor for future defects.
  • 30. Mining metrics to predict component failures Nachiappan Nagappan Thomas Ball Microsoft Research Andreas Zeller  Saarland University
  • 31-33. Experimental settings (Project: code size): Internet Explorer 6: 511 KLOC; DirectX: 306 KLOC; Process messaging component: 147 KLOC; NetMeeting: 109 KLOC; IIS Core: 37 KLOC. Granularity level: module (a binary file within Windows; a set of classes)
  • 34-35. Q1 Do complexity metrics correlate with defects? [Bar chart for projects A to E, scale 0 to 1.00: maximum correlation and percentage of correlated metrics]
  • 36. Q2 Is there a unique set of metrics that predicts defects in all projects?
  • 37-42. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics → Principal component analysis → Linear/logistic regression model. [Bar chart for projects A to E, scale 0 to 1.00: Spearman/Pearson correlation and percentage of splits which correlate; one annotation reads "Too few samples"]
  • 43. Q4 Are predictors obtained from one project applicable to other projects?
  • 44-46. Conclusion: Metrics can be used to predict defects, but they must be validated on the history
  • 47. Improving Defect Prediction Using Temporal Features and Non Linear Models Abraham Bernstein Jayalath Ekanayake Martin Pinzger University of Zurich
  • 48. Experimental settings (Plugin: #Years, #Files): updateui: 7, 757; updatecore: 7, 459; search: 6.5, 540; pdeui: 6.5, 1621; pdebuild: 6, 198; compare: 6.5, 315. Non-linear models based on 21 historical metrics + LOC
  • 49-50. Classification of files using decision tree learners. With A the set of all files and CC the set of correctly classified files, $\text{Accuracy} = \mathrm{Size}(CC) / \mathrm{Size}(A)$. Best predictor (7 metrics): accuracy 99.16%
  • 51. Ranking of files using m5 tree regression algorithm. Spearman correlation: predictor based on 7 metrics 0.966; Zimmermann’s pre-release defects 0.907
  • 52. Conclusion: Defect prediction can be improved with historical information and non-linear functions
  • 53. Predicting Faults Using the Complexity of Code Changes Ahmed E. Hassan Queen’s University
  • 54-62. Complexity = Entropy: the more changes are distributed, the higher the entropy. The intuition is that one change affecting one file only is simpler than one affecting many different files, as the developer who has to perform the change has to keep track of all of them. Hassan proposed to use Shannon Entropy, defined as
      $$H_n(P) = -\sum_{k=1}^{n} p_k * \log_2 p_k \qquad (1)$$
    where $p_k$ is the probability that the file $k$ changes during the considered time interval. Figure 4 shows an example with three files (File A, File B, File C) and three time intervals t1, t2, t3 (2 weeks each). For an interval with 4 changes, of which 2 touch one file and 1 touches each of the other two:
      $$H_n(P) = -\frac{2}{4} * \log_2 \frac{2}{4} - \frac{1}{4} * \log_2 \frac{1}{4} - \frac{1}{4} * \log_2 \frac{1}{4} = 1.5$$
    H = 1.5 for this interval; is H higher in the other intervals?
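The entropy of one time interval can be checked with a short sketch. It assumes, as in the example on the slides, that the probability of a file changing is its share of the changes in the interval; the function name `change_entropy` is hypothetical.

```python
import math


def change_entropy(change_counts):
    """Shannon entropy (base 2) of how changes are spread over files.

    change_counts holds the number of changes per file in one interval.
    Files with zero changes contribute nothing to the sum.
    """
    total = sum(change_counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in change_counts:
        if count:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    print(change_entropy([2, 1, 1]))     # 1.5: four changes spread over three files
    print(change_entropy([4, 0, 0]))     # 0.0: all changes hit a single file
    print(change_entropy([1, 1, 1, 1]))  # 2.0: evenly spread over four files
```

The entropy is zero when every change touches the same file and grows as the changes scatter, which matches the intuition that scattered changes are harder to keep track of.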
  • 63. To use the entropy of code change as a bug predictor, Hassan defined the History of Complexity Metric (HCM) of a file $j$ as
      $$HCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} HCPF_i(j) \qquad (3)$$
    where $\{a,..,b\}$ is a set of evolution periods and HCPF is
      $$HCPF_i(j) = \begin{cases} c_{ij} * H_i, & j \in F_i \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$
    where $i$ is a period with entropy $H_i$, $F_i$ is the set of files modified in the period $i$, and $j$ is a file belonging to $F_i$. According to the definition of $c_{ij}$, there are three types of HCM; the slide shows two:
    (1) $c_{ij} = 1$: every file modified in the considered period $i$ gets the entropy of the system in the considered time interval. This approach defines HCM.
    (2) $c_{ij} = p_j$: each modified file gets the entropy of the system weighted with the probability of the file being modified (WHCM).
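The HCM definition can be sketched as follows. This is an illustration under stated assumptions rather than Hassan's implementation: each period is represented as a dictionary from file name to number of changes, the probability of a file being modified is taken to be its share of the changes in that period, and all function names and the demo data are invented.

```python
import math


def period_entropy(changes):
    """Shannon entropy (base 2) of the change distribution in one period."""
    total = sum(changes.values())
    entropy = 0.0
    for count in changes.values():
        if count:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def hcpf(changes, file_name, weighted=False):
    """HCPF_i(j): contribution of one period i to file j.

    weighted=False uses c_ij = 1 (HCM); weighted=True uses c_ij = p_j,
    here assumed to be the file's share of the changes in the period.
    """
    modified = {name: n for name, n in changes.items() if n > 0}
    if file_name not in modified:
        return 0.0  # j is not in F_i
    c_ij = modified[file_name] / sum(modified.values()) if weighted else 1.0
    return c_ij * period_entropy(modified)


def hcm(periods, file_name, weighted=False):
    """Sum of HCPF_i(j) over all evolution periods."""
    return sum(hcpf(changes, file_name, weighted) for changes in periods)


if __name__ == "__main__":
    # Invented change counts for two periods.
    periods = [{"A": 2, "B": 1, "C": 1}, {"A": 1, "B": 1}]
    for name in ("A", "B", "C"):
        print(name, hcm(periods, name), hcm(periods, name, weighted=True))
```

In this sketch a file that is rarely touched, such as C, still inherits the full system entropy under HCM, while the weighted variant scales it down by how much of the period's activity the file accounts for.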
  • 64-66. HCM with decay factors. In EDHCM (Exponentially Decayed HCM), entropies for earlier periods of time, i.e., earlier modifications, have their contribution reduced exponentially over time, modelling an exponential decay model. EDHCM was introduced by Hassan. Similarly, LDHCM (Linearly Decayed) and LGDHCM (LoGarithmically Decayed) have their contributions reduced over time in a respectively linear and logarithmic fashion. Both are novel. The definitions of the variants follow:
      $$EDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{e^{\phi_1 * (|\{a,..,b\}| - i)}}$$
      $$LDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_2 * (|\{a,..,b\}| + 1 - i)}$$
      $$LGDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_3 * \ln(|\{a,..,b\}| + 1.01 - i)}$$
    where $\phi_1$, $\phi_2$ and $\phi_3$ are the decay factors.
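The three decayed variants differ only in the denominator applied to each period's contribution. The sketch below assumes the periods are numbered 1 to n from oldest to newest and that the per-period HCPF values of one file are already available as a list; the function names and the demo values are invented for illustration.

```python
import math


def edhcm(hcpf_values, phi1):
    """Exponentially decayed HCM: divide by e^(phi1 * (n - i))."""
    n = len(hcpf_values)
    return sum(v / math.exp(phi1 * (n - i))
               for i, v in enumerate(hcpf_values, start=1))


def ldhcm(hcpf_values, phi2):
    """Linearly decayed HCM: divide by phi2 * (n + 1 - i)."""
    n = len(hcpf_values)
    return sum(v / (phi2 * (n + 1 - i))
               for i, v in enumerate(hcpf_values, start=1))


def lgdhcm(hcpf_values, phi3):
    """Logarithmically decayed HCM: divide by phi3 * ln(n + 1.01 - i)."""
    n = len(hcpf_values)
    return sum(v / (phi3 * math.log(n + 1.01 - i))
               for i, v in enumerate(hcpf_values, start=1))


if __name__ == "__main__":
    # Invented HCPF values of one file, oldest period first.
    values = [1.5, 1.0]
    print(edhcm(values, 1.0))
    print(ldhcm(values, 1.0))
    print(lgdhcm(values, 1.0))
```

In all three variants the most recent period has the smallest denominator, so the same amount of entropy counts for more when it is recent than when it is old.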
  • 67. Experimental settings (System: start date, #Subsystems): NetBSD: March 1993, 235; FreeBSD: June 1993, 152; OpenBSD: Oct 1995, 265; Postgres: July 1996, 280; KDE: April 1997, 108; KOffice: April 1998, 158. Predictors compared at subsystem level: entropy metrics, number of past modifications, number of past defects
  • 68. Models fitting in terms of $R^2$. [Bar chart, scale 0 to 0.6, for NetBSD, FreeBSD, OpenBSD, Postgres, KDE, KOffice: past defects, past changes, HCM, WHCM, EDHCM]
  • 69. Prediction error: number of past changes vs entropy. [Bar chart, scale 0 to 37.5, for NetBSD, FreeBSD, OpenBSD, Postgres, KDE, KOffice: #Changes - WHCM (%) and #Changes - EDHCM (%)]
  • 70. Prediction error: number of past defects vs entropy. [Bar chart, scale -20.0 to 40.0, for NetBSD, FreeBSD, OpenBSD, Postgres, KDE, KOffice: #Defects - WHCM (%) and #Defects - EDHCM (%)]
  • 71-72. Conclusion: Models based on entropy of changes are better defect predictors than the number of past changes or defects. A complex code change process negatively affects its product, the software system
  • 74. Epilogue: Defect prediction research has been active for several years. A large number of scientific papers have been published
  • 75. Epilogue: We can predict defects, but the results still have limited practical usability
  • 76. Epilogue Predicting bugs is very difficult because developing code is a human activity
  • 77-80. Epilogue: A human activity influenced by too many factors. FOCUS: How complex was the piece of code? How tested? How experienced was the developer? No data yet: How tired was the developer? How integrated was the developer in the team? Did he like his job?
  • 82-84. Detect the critical components (number of bugs) and the critical bugs (properties of bugs)
  • 85. bugzero bugzilla census customerfirst defect-agent extraview-bug-tracker fast-bugtrack fogbugz gnats ibm-rational-clearquest ictracker issue-organizer issuenet-intercept issueview jira legendsoft-spots mantis new-fire omnitracker pointinsight pr-tracker problemtracker quickbugs radar razor rmtrack-bug-tracking
  • 86. 4 facts about bugs
  • 87. Bugs are differently harmful: Blocker, Critical, Major, Normal, Minor, Trivial, Enhancement
  • 88. Bugs are differently harmful: Blocker, Critical, Major, Normal, Minor, Trivial, Enhancement. Bugzilla is used to report bugs and change requests
  • 92. An ideal bug life cycle: Unconfirmed
  • 93. An ideal bug life cycle: Unconfirmed, New, Assigned, Resolved, Verified, Closed
  • 94. A bit less ideal: Unconfirmed, New, Assigned, Resolved, Verified, Closed
  • 95. A bit less ideal: Unconfirmed, New, Assigned, Resolved, Verified, Closed, Reopened
  • 96. The reality: Unconfirmed, New, Assigned, Resolved, Verified, Closed, Reopened
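The life-cycle slides can be read as a state machine over the bug statuses. The following is a minimal sketch under stated assumptions: the slides only name the states, so the transition table `ALLOWED` is an illustrative guess modelled on the classic Bugzilla workflow, and the helper names are hypothetical.

```python
# Hypothetical sketch: a bug life cycle as a small state machine.
# ASSUMPTION: the transition table below is illustrative; the slides list
# the states but the exact arrows are only visible in the figures.
ALLOWED = {
    "Unconfirmed": {"New"},
    "New": {"Assigned", "Resolved"},
    "Assigned": {"New", "Resolved"},
    "Resolved": {"Verified", "Reopened"},
    "Verified": {"Closed", "Reopened"},
    "Closed": {"Reopened"},
    "Reopened": {"Assigned", "Resolved"},
}

# The "ideal" path: every state visited exactly once, never reopened.
IDEAL = ["Unconfirmed", "New", "Assigned", "Resolved", "Verified", "Closed"]


def is_valid_history(states):
    """Return True if every consecutive pair of states is an allowed transition."""
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(states, states[1:]))


def count_reopenings(states):
    """Count how many times the bug entered the Reopened state."""
    return states.count("Reopened")
```

A status sequence that passes through `Reopened` one or more times corresponds to the "less ideal" cases; `count_reopenings` gives a simple measure of how far a bug strays from the ideal path.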
  • 98. All bug properties can change over time. Bug: Problem (id, description, product, component); Criticality (severity, priority); Involved people (assignedTo, reporter, qa); State (status, resolution); ...
  • 99. All bug properties can change over time. An Activity (here on AssignedTo; names shown: steve, mike, john) links two snapshots of the same Bug, each with Problem (id, description, product, component); Criticality (severity, priority); Involved people (assignedTo, reporter, qa); State (status, resolution); ...
  • 100. All bug properties can change over time. Bug history: a sequence of Bug snapshots (Problem, Criticality, Involved people, State), each pair linked by an Activity
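One way to picture a bug as an evolving entity is to store an initial snapshot plus a list of activities and replay them. This is a sketch only: the class names `Bug` and `Activity`, the reduced field set, and the function `history` are assumptions made for illustration, not the data model used by the authors.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Bug:
    """One snapshot of a bug (a reduced, illustrative subset of its properties)."""
    id: int
    severity: str
    priority: str
    assigned_to: str
    status: str


@dataclass(frozen=True)
class Activity:
    """A single change to one property of a bug."""
    when: date
    field: str  # name of the Bug attribute that changed
    old: str
    new: str


def history(initial, activities):
    """Replay activities in chronological order and return every snapshot."""
    snapshots = [initial]
    for act in sorted(activities, key=lambda a: a.when):
        # dataclasses.replace builds a new snapshot with one field changed.
        snapshots.append(replace(snapshots[-1], **{act.field: act.new}))
    return snapshots
```

For example, a bug first assigned to steve and later reassigned to mike yields two snapshots that differ only in `assigned_to`.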
  • 101. Are there many activities? How long do bugs live?
  • 102. Are there many activities? How long do bugs live? Time period: Sep 1998 - Apr 2003; #Bugs: 255’302; #Activities: 2’706’201
  • 103. Number of activities [Histogram: percentage (0% to 30%) by number of activities: 0, 1-3, 4-5, 6-10, 11-20, 21-30, > 30]
  • 104. Lifetime (reported - last activity) [Histogram: percentage (0% to 40%) by lifetime: 12 Hours, 1 Day, 1 Week, 1 Month, 6 Months, 1 Year, 2 Years, More]
  • 105. Lifetime (reported - last activity) [Histogram: percentage (0% to 40%) by lifetime: 12 Hours, 1 Day, 1 Week, 1 Month, 6 Months, 1 Year, 2 Years, More; annotation: > 50%]
  • 106. Bugs have long and intense lives
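The two measurements behind these histograms can be sketched as follows. The bin edges follow the labels on the "Number of activities" slide; treating a bug with no activity as having zero lifetime is an assumption, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Upper bound (inclusive) and label of each bin, as labelled on the slide.
ACTIVITY_BINS = [(0, "0"), (3, "1-3"), (5, "4-5"), (10, "6-10"),
                 (20, "11-20"), (30, "21-30")]


def activity_bin(n):
    """Map a number of activities to its histogram bin label."""
    for upper, label in ACTIVITY_BINS:
        if n <= upper:
            return label
    return "> 30"


def lifetime(reported, activity_times):
    """Lifetime = time of last activity - report time.

    ASSUMPTION: a bug without any activity has a lifetime of zero.
    """
    return max(activity_times, default=reported) - reported


def share_per_bin(activity_counts):
    """Fraction of bugs falling into each activity bin."""
    counts = Counter(activity_bin(n) for n in activity_counts)
    total = len(activity_counts)
    return {label: counts[label] / total for label in counts}
```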
  • 107. 4 facts about bugs: are differently harmful; are graphs; evolve; have long and intense lives
  • 108. There is a need for analyzing bug repositories: analyzing bugs as evolving entities
  • 109. “A Bug’s Life”: Visualizing a Bug Database. Marco D’Ambros, Michele Lanza, Martin Pinzger
  • 110. System radiography view “Where (in the system and in its history) are the open bugs located?”
  • 111. System radiography view “Where (in the system and in its history) are the open bugs located?” Visualization principle: • System decomposition on the y axis (Product :: Component, e.g. Product A with Component 1 and Component 2, Product B) • Time on the x axis
  • 112. System radiography view “Where (in the system and in its history) are the open bugs located?” Visualization principle: • System decomposition on the y axis (Product :: Component) • (x,y) : (time, component), where x position is the time interval and y position is the component • Color: # open bugs
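The data behind such a view is a grid with one row per Product :: Component and one column per time interval, where each cell holds the number of open bugs. A minimal sketch, assuming a bug counts as open in an interval when its open period overlaps that interval (the slides do not define this); the tuple layout and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def intervals(start, end, step):
    """Split [start, end) into consecutive half-open intervals of length step."""
    out = []
    t = start
    while t < end:
        out.append((t, min(t + step, end)))
        t += step
    return out


def radiography(bugs, start, end, step):
    """Count open bugs per (product, component) row and time-interval column.

    bugs: iterable of (product, component, opened, closed), where closed is
    None for a bug that is still open.
    """
    cols = intervals(start, end, step)
    grid = defaultdict(lambda: [0] * len(cols))
    for product, component, opened, closed in bugs:
        row = grid[(product, component)]
        for i, (lo, hi) in enumerate(cols):
            # ASSUMPTION: open in [lo, hi) if reported before hi
            # and not closed at or before lo.
            if opened < hi and (closed is None or closed > lo):
                row[i] += 1
    return dict(grid)
```

Mapping each cell value to a color then gives the picture described above: the darker the cell, the more open bugs that component had in that interval.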
  • 114. Mozilla example [Sep ’98 - Apr ’03]
  • 115. Mozilla example [Sep ’98 - Apr ’03] Browser
  • 116. Mozilla example [Sep ’98 - Apr ’03] Browser, Mailnews
  • 118. The Bug Watch View “How are bugs characterized with respect to their history?”
  • 119. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: time axis from Beginning: 10/19/1999 to End: 10/16/2001
  • 120. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: time axis from Beginning: 10/19/1999 to End: 10/16/2001 • 3 layers
  • 121. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: time axis from Beginning: 10/19/1999 to End: 10/16/2001 • 3 layers: Status
  • 122. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: time axis from Beginning: 10/19/1999 to End: 10/16/2001 • 3 layers: Status. Status (From, To): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
  • 127. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: time axis from Beginning: 10/19/1999 to End: 10/16/2001 • 3 layers: Status, Activity. Status (From, To): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
  • 128. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: time axis from Beginning: 10/19/1999 to End: 10/16/2001 • 3 layers: Status, Activity, Severity. Status (From, To): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
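The Status layer can be derived from the list of status changes of a bug: each status lasts from its own change date until the next one. A small sketch using the four rows shown on the slide; the function name `status_intervals` and the `end` parameter are assumptions, and the slide's table continues beyond these rows.

```python
from datetime import date


def status_intervals(changes, end):
    """Turn status changes into (status, from, to) rows.

    changes: list of (date, new_status) pairs, in any order.
    end: date at which the last listed status stops being observed.
    """
    ordered = sorted(changes)
    # Pair each change with the date of the following one (or with end).
    following = [when for when, _ in ordered[1:]] + [end]
    return [(status, when, until)
            for (when, status), until in zip(ordered, following)]


# The four rows visible on the slide, used here as illustrative input.
changes = [
    (date(1999, 10, 19), "Assigned"),
    (date(1999, 12, 21), "Resolved"),
    (date(2000, 1, 31), "Reopened"),
    (date(2000, 2, 6), "New"),
]
rows = status_intervals(changes, end=date(2000, 6, 5))
```

The Activity and Severity layers could be built the same way from the corresponding change events.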
  • 129. Examples from Mozilla: Browser :: Networking [Nov ’02 - Apr ’03]
  • 130. Examples from Mozilla: Browser :: Networking [Nov ’02 - Apr ’03] Reopened 4 times. The developer in charge of fixing it changed 6 times. Many people were added to the CC list
  • 132. Examples from Mozilla: Browser :: Networking [Nov ’02 - Apr ’03] One status but many activities (additions to the CC list)
  • 133. Conclusion Analyzing a bug database: provides useful insights into a software system; helps in detecting the most harmful bugs
  • 135. Epilogue We are just scratching the surface. The analysis of bug repositories is still a very open field