Studying the Fix-Time for Bugs in Large Open Source Projects
Lionel Marks, Ying Zou, Ahmed E. Hassan, Thanh Nguyen
Queen’s University, Kingston, Ontario, Canada
Wednesday, 21 September, 11

If life were like that, we wouldn’t need software prediction.

Reality

Many simple feature requests or defect reports do NOT get fixed for years.

Feature request / defect report filed -> Triage -> Implementation plan / cause of defect determined -> Implement -> Verify -> Close

Work item fix-time
  • When will it be fixed?
  • Which one should we fix this iteration?
  • Can we predict the work item fix-time?

Location properties (7)
  • Product / Version / Component
  • Number of completed WI*
  • Average fix time*

Reporter properties (4)
  • Industry / local / public
  • Popularity*
  • Number of past requests*
  • Average fix time*

Work item properties (12)
  • Severity / Priority
  • Number of interested parties*
  • Morning / Day / night
  • Description length*
  • Code attachment

Location properties (7), Reporter properties (4), Work item properties (12) -> Predictor

When will it be fixed?

Location properties (7), Reporter properties (4), Work item properties (12) -> Predictor -> Short / Normal / Long

Which one should we fix this iteration?

Location properties (7), Reporter properties (4), Work item properties (12) -> Predictor -> Next minor revision / Next major revision / Next version

Case study

Project    Number of work items    <3 months    <1 year    <3 years
Mozilla    85,616                  46%          27%        27%
Eclipse    63,402                  76%          18%        6%

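The case study table groups work items by fix-time into three bins, and each row's percentages sum to 100%, which suggests the bins are disjoint. A minimal sketch of such a binning follows. The day thresholds (90, 365, and 3 * 365) and the name `fix_time_class` are assumptions made for illustration; the slides give only the bin labels.

```python
from typing import Optional

# Assumed upper bounds in days; the slides name the bins only as
# "<3 months", "<1 year", and "<3 years".
BINS = [(90, "<3 months"), (365, "<1 year"), (3 * 365, "<3 years")]


def fix_time_class(days: float) -> Optional[str]:
    """Return the first bin whose upper bound exceeds `days`.

    Returns None when the fix-time falls outside every bin.
    """
    if days < 0:
        raise ValueError("fix-time cannot be negative")
    for upper, label in BINS:
        if days < upper:
            return label
    return None


if __name__ == "__main__":
    for days in (10, 200, 800):
        print(days, fix_time_class(days))
```
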
Random Forest

We use Random Forest because:
  • Decision-tree-based models are more explainable than SVMs or neural networks.
  • Random Forest outperforms C4.5 because it is more robust to data with highly correlated attributes.
  • It is easy to analyze the sensitivity of each property.

Data       J48             Random Forest    M5P
Linux*     11.09 (3.09)    18.42 (2.53)     16.22 (2.19)
Apache*    38.64 (1.35)    34.45 (1.68)     25.39 (1.64)
Jazz       17.75 (2.76)    18.43 (2.74)     11.67 (2.27)

*Akinori Ihara (Kinki University, Japan) and Yasutaka Kamei (Kyushu University, Japan)

Goals of our case study

  • G1: What is the accuracy of the fix-time prediction model?
  • G2: Which properties are the most important predictors of fix-time?
  • G3: How applicable are the models in practice?

G1: Accuracy of the model

We build 10 random forests for each project:
  • Each random forest is trained on a randomly selected 2/3 of the data.
  • We evaluate its predictions on the remaining 1/3 of the data.

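The evaluation protocol for G1 (ten runs, each with a random 2/3 training split and a 1/3 test split) can be sketched as below. This is only an illustration of the splitting scheme: `fit_majority` is a hypothetical stand-in learner that always predicts the most common training label, not a random forest, and the toy data is invented for the demo.

```python
import random
from collections import Counter


def fit_majority(train_x, train_y):
    """Stand-in learner (NOT a random forest): predict the most common label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common


def repeated_holdout(features, labels, fit, runs=10, train_fraction=2 / 3, seed=0):
    """Return the misclassification rate of each run on its held-out part."""
    rng = random.Random(seed)
    n = len(labels)
    n_train = round(n * train_fraction)
    rates = []
    for _ in range(runs):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random split for every run
        train, test = order[:n_train], order[n_train:]
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        errors = sum(predict(features[i]) != labels[i] for i in test)
        rates.append(errors / len(test))
    return rates


if __name__ == "__main__":
    # Toy data, invented for the demo: 30 work items with two classes.
    toy_x = [[i] for i in range(30)]
    toy_y = ["short"] * 20 + ["long"] * 10
    print(repeated_holdout(toy_x, toy_y, fit_majority))
```

Any learner with the same `fit(train_x, train_y)` shape, including a real random forest implementation, could be passed in place of `fit_majority`.
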
Accuracy of the model: Overall misclassification

Overall misclassification (%) by the properties used to train the model:

Properties     Mozilla    Eclipse
Reporter       51.9       35.8
Location       39.6       34.7
Description    50.8       43
All            36.2       32.8

G1: We can correctly classify the fix-time of work items in Eclipse and Mozilla ∼65% of the time, twice as good as random.

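The G1 summary can be checked against the "All" bars of the chart with simple arithmetic. The sketch below assumes three fix-time classes and a random baseline that picks each class with equal probability (1/3); that baseline is an assumption made here, since the slides do not define "random". The name `accuracy_vs_random` is hypothetical.

```python
# Misclassification rates (%) of the models built on all properties,
# as read from the chart.
MISCLASSIFICATION = {"Mozilla": 36.2, "Eclipse": 32.8}

NUM_CLASSES = 3  # assumption: three fix-time classes, guessed uniformly


def accuracy_vs_random(misclassification, num_classes=NUM_CLASSES):
    """Map each project to (accuracy in %, accuracy / random-guess accuracy)."""
    random_accuracy = 100 / num_classes
    result = {}
    for project, error in misclassification.items():
        accuracy = 100 - error
        result[project] = (round(accuracy, 1), round(accuracy / random_accuracy, 2))
    return result


if __name__ == "__main__":
    for project, (accuracy, ratio) in accuracy_vs_random(MISCLASSIFICATION).items():
        print(f"{project}: {accuracy}% correct, {ratio}x the random baseline")
```

Under these assumptions both projects land near 65% accuracy and about twice the random baseline, which matches the stated conclusion.
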
G2: Model sensitivity - Importance of each property

We use a technique called the permutation accuracy importance measure, as follows:
  • For each property, we randomly alter its values and rerun the classification.
  • We give the property an importance score (1 to 10) depending on the change in the classification result.
  • We sum each property's score across all ten forests.

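The first step of this measure can be sketched as follows: shuffle one property's values across the work items, rerun the classifier, and record how much accuracy is lost. Shuffling is one common way to "randomly alter" values and is an assumption here. The mapping from accuracy change to a 1 to 10 score is not specified in the slides, so it is left out. The toy classifier and data are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(row) == y for row, y in zip(rows, labels)) / len(labels)


def permutation_accuracy_drop(predict, rows, labels, column, seed=0):
    """Accuracy lost when the values of one property are shuffled across rows."""
    rng = random.Random(seed)
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)
    permuted = [
        row[:column] + [value] + row[column + 1:]
        for row, value in zip(rows, shuffled)
    ]
    return accuracy(predict, rows, labels) - accuracy(predict, permuted, labels)


if __name__ == "__main__":
    # Hypothetical classifier that looks only at column 0 (severity).
    predict = lambda row: "long" if row[0] == "high" else "short"
    rows = [["high", "am"], ["low", "pm"], ["high", "pm"], ["low", "am"]] * 5
    labels = [predict(row) for row in rows]
    print("severity drop:", permutation_accuracy_drop(predict, rows, labels, 0))
    print("time-of-day drop:", permutation_accuracy_drop(predict, rows, labels, 1))
```

A property the classifier ignores yields a drop of zero, while a property it depends on yields a drop that grows with its importance.
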
Properties     Mozilla                       Eclipse
Location       Product; Component            Project fix-time; Product opened work items
Reporter       Fix-time; Requests            Fix-time; Overall popularity
Description    Year; Has target milestone    Year; Severity
All            Year; Product                 Severity; Number of CCed

G2: The time of bug filing and its location are the most important properties in the Mozilla project. In the Eclipse project, bug severity is the most important property.

Why is time the most important factor in Mozilla?

[Table 5: Statistics for the Resolution Type of Bugs for the Mozilla and Eclipse Projects. (a) Mozilla, (b) Eclipse]

G3: How applicable are models in practice?

If a prediction model is applicable in practice, it should:
  • Use only available properties
  • Be stable

Feature request / defect report filed -> Triage -> Implementation plan / cause of defect determined -> Implement -> Verify -> Close

Work item fix-time

Feature request / defect report filed -> Triage
  • Number of CCed?
  • Severity change!
  • Assigned to someone else

Accuracy of the predictor models using only available properties

Data        Data size    Misclassification rate
Eclipse*    86490        0.51
Linux*      2024         0.55
Jazz        16672        0.57
Apache*     1466         0.37

*Akinori Ihara (Kinki University, Japan) and Yasutaka Kamei (Kyushu University, Japan)

Stability of the model

  • Training size stability: As more data is added to the training set, the accuracy should improve.
  • Time stability: The accuracy should be stable over time.

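Training size stability can be probed by training on growing slices of the training set and scoring each model on the same held-out data. The sketch below covers only that first criterion. It assumes steps of 10% of the training data, and `fit_majority` is a hypothetical stand-in learner rather than a random forest; the toy data is invented for the demo.

```python
from collections import Counter


def fit_majority(train_x, train_y):
    """Stand-in learner (NOT a random forest): predict the most common label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common


def training_size_curve(train_x, train_y, test_x, test_y, fit, steps=9):
    """Accuracy on a fixed test set after training on 10%, 20%, ... of the data."""
    curve = []
    for k in range(1, steps + 1):
        n = max(1, len(train_y) * k // 10)  # first k * 10% of the training set
        predict = fit(train_x[:n], train_y[:n])
        correct = sum(predict(x) == y for x, y in zip(test_x, test_y))
        curve.append(correct / len(test_y))
    return curve


if __name__ == "__main__":
    # Toy data, invented for the demo.
    train_x = [[i] for i in range(20)]
    train_y = ["long"] * 2 + ["short"] * 18
    test_x = [[i] for i in range(10)]
    test_y = ["short"] * 8 + ["long"] * 2
    print(training_size_curve(train_x, train_y, test_x, test_y, fit_majority))
```

A model with training size stability would show a curve that rises or levels off as the slice grows, rather than one that fluctuates.
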
Apache - Training size
                          0.70
                          0.65
                          0.60
                                       stability
             Precisions

                          0.55
                          0.50
                          0.45




                                   1   2   3     4         5     6      7   8   9

                                               x 10% of training data


                                                      22
Wednesday, 21 September, 11
Apache - Time stability
               Precisions

                            0.42
                            0.40
                            0.38
                            0.36
                            0.34




                                   1   2   3        4        5          6   7   8

                                               x 10% of training data

                                                        23
Wednesday, 21 September, 11
G3: Fix-time prediction models may work in practice on projects such as Apache. The Apache prediction model has data stability and time stability.


Promise 2011: "Studying the Fix-Time for Bugs in Large Open Source Projects"

  • 1. Studying the Fix-Time for Bugs in Large Open Source Projects. Lionel Marks, Ying Zou, Ahmed E. Hassan, Thanh Nguyen. Queen’s University, Kingston, Ontario, Canada.
  • 2. If life were like that, we would not need software prediction.
  • 3. Reality: Many simple feature requests or defect reports do NOT get fixed for years.
  • 5-14. Work item life cycle: Feature request / Defect report filed, Triage, Implementation plan / cause of defect determined, Implement, Verify, Close. Work item fix-time: When will it be fixed? Which one should we fix this iteration? Can we predict the work item fix-time?
  • 15-30. Predictor inputs. Location properties (7): Product / Version / Component; Number of completed WI*; Average fix time*. Reporter properties (4): Industry / local / public; Popularity*; Number of past requests*; Average fix time*. Work item properties (12): Severity / Priority; Number of interested parties*; Morning / Day / night; Description length*; Code attachment.
  • 31-34. When will it be fixed? From the location, reporter, and work item properties, the predictor outputs Short, Normal, or Long.
  • 35-38. Which one should we fix this iteration? From the same properties, the predictor outputs Next minor revision, Next major revision, or Next version.
  • 39. Case study

        Project    Number of work items    <3 months    <1 year    <3 years
        Mozilla    85,616                  46%          27%        27%
        Eclipse    63,402                  76%          18%        6%

  • 40. Random Forest. We use Random Forest because:
        • Decision-tree-based models are more explainable than SVMs or neural networks.
        • Random Forest outperforms C4.5 because it is more resistant to data with highly correlated attributes.
        • It is easy to analyze the sensitivity of each property.
  • 41.
        Data       J48             M5P             Random Forest
        Linux*     11.09 (3.09)    18.42 (2.53)    16.22 (2.19)
        Apache*    38.64 (1.35)    34.45 (1.68)    25.39 (1.64)
        Jazz       17.75 (2.76)    18.43 (2.74)    11.67 (2.27)
        *Akinori Ihara (Kinki University, Japan) and Yasutaka Kamei (Kyushu University, Japan)
  • 42. Goals of our case study. G1: What is the accuracy of the fix-time prediction model? G2: Which properties are the most important predictors of fix-time? G3: How applicable are the models in practice?
  • 43. G1: Accuracy of the model. We build 10 random forests for each project. Each random forest uses a random 2/3 of the data for training. We evaluate the predictions on the remaining 1/3 of the data.
  • 44-47. Accuracy of the model: overall misclassification. [Figure: bar charts of overall misclassification for Mozilla and Eclipse, with bars for the Reporter, Location, Description, and All property sets. Bar values shown, Mozilla: 51.9, 50.8, 39.6, 36.2; Eclipse: 43, 35.8, 34.7, 32.8.] G1: We can correctly classify the fix-time for work items in Eclipse and Mozilla ∼65% of the time, twice as good as random.
  • 48. G2: Model sensitivity - importance of each property. We use a technique called the permutation accuracy importance measure, as follows. For each property, we randomly alter its values and rerun the classification. We give an importance score (1 to 10) depending on the change in the classification result. We sum the scores across all ten forests.
  • 49-53. Most important properties in each property set:

        Property set    Mozilla                                        Eclipse
        Location        Project fix-time; Product opened work items    Product; Component
        Reporter        Fix-time; Requests                             Fix-time; Overall popularity
        Description     Year; Has target milestone                     Year; Severity
        All             Year; Product                                  Severity; Number of CCed

        G2: The time of bug filing and its location are the most important properties in the Mozilla project. In the Eclipse project, bug severity is the most important property.
  • 54-55. Why is time the most important factor in Mozilla? [Table 5: Statistics for the Resolution Type of Bugs for the Mozilla and Eclipse Projects; (a) Mozilla, (b) Eclipse]
  • 56. G3: How applicable are models in practice? To be applicable in practice, a prediction model should: use only available properties; be stable.
  • 57-61. Work item life cycle, narrowed from the full cycle to Feature request / Defect report filed and Triage. Callouts: Number of CCed? Severity change! Assigned to someone else.
  • 62. Accuracy of the predictor models using only available properties

                   Data size    Misclassification rate
        Eclipse*   86490        0.51
        Linux*     2024         0.55
        Jazz       16672        0.57
        Apache*    1466         0.37
        *Akinori Ihara (Kinki University, Japan) and Yasutaka Kamei (Kyushu University, Japan)

  • 63. Stability of the model. Training size stability: As more data is added to the training set, the accuracy should improve. Time stability: The accuracy should be stable over time.
  • 64. Apache - Training size stability. [Figure: precision (0.45 to 0.70) against x 10% of training data (1 to 9)]
  • 65-66. Apache - Time stability. [Figure: precision (0.34 to 0.42) against x 10% of training data (1 to 8)] G3: Fix-time prediction models may work in practice on projects such as Apache. The Apache prediction model has data stability and time stability.
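
The evaluation described for G1 (ten models, each trained on a random 2/3 of the data and tested on the remaining 1/3) and the permutation importance described for G2 can be sketched together as below. This is a rough illustration under stated assumptions: `fit_votes` is a hypothetical stand-in learner rather than a random forest, "randomly alter values" is read as shuffling one property's column in the held-out third, the raw increase in misclassification is summed across runs in place of the 1-to-10 score (whose mapping the slides do not give), and the data and property names are synthetic.

```python
import random
from collections import Counter, defaultdict

def fit_votes(train):
    """Stand-in learner (assumption): every property value votes for each
    label in proportion to how often it co-occurred with it in training."""
    tables = [defaultdict(Counter) for _ in train[0][0]]
    for x, y in train:
        for j, value in enumerate(x):
            tables[j][value][y] += 1
    labels = sorted({y for _, y in train})

    def predict(x):
        score = Counter()
        for j, value in enumerate(x):
            seen = tables[j].get(value)
            if seen:
                total = sum(seen.values())
                for y, count in seen.items():
                    score[y] += count / total
        return max(labels, key=lambda y: score[y])

    return predict

def misclassification(model, rows):
    """Fraction of rows whose label the model gets wrong."""
    return sum(model(x) != y for x, y in rows) / len(rows)

def evaluate(rows, names, fit, runs=10, seed=0):
    """Repeat a random 2/3 - 1/3 split; per run, record the misclassification
    rate and how much it rises when each property is shuffled in the test set."""
    rng = random.Random(seed)
    rates, importance = [], dict.fromkeys(names, 0.0)
    for _ in range(runs):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = 2 * len(shuffled) // 3
        train, test = shuffled[:cut], shuffled[cut:]
        model = fit(train)
        base = misclassification(model, test)
        rates.append(base)
        for j, name in enumerate(names):
            column = [x[j] for x, _ in test]
            rng.shuffle(column)  # break the link between property j and the label
            permuted = [(x[:j] + (v,) + x[j + 1:], y)
                        for (x, y), v in zip(test, column)]
            importance[name] += misclassification(model, permuted) - base
    return rates, importance

# Hypothetical work items: (severity, component) -> fix-time class.
data_rng = random.Random(1)
CLASSES = ["short", "normal", "long"]
rows = []
for _ in range(900):
    severity, component = data_rng.randrange(3), data_rng.randrange(20)
    label = CLASSES[severity] if data_rng.random() < 0.8 else data_rng.choice(CLASSES)
    rows.append(((severity, component), label))

rates, importance = evaluate(rows, ["severity", "component"], fit_votes)
print("mean misclassification:", round(sum(rates) / len(rates), 3))
print("summed importance:", {k: round(v, 3) for k, v in importance.items()})
```

In this synthetic data only severity carries signal, so shuffling it should raise the misclassification rate far more than shuffling component does.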