A Bayesian Approach for
   the Detection of Code
     and Design Smells

    Foutse Khomh, Stéphane Vaucher,
Yann-Gaël Guéhéneuc, and Houari Sahraoui
                                 QSIC’09
                                  Jeju-do
                                2009/08/25

   Ptidej Team – Pattern-based Quality Evaluation and Enhancement
   Department of Computer Engineering and Software Engineering
   École Polytechnique de Montréal, Québec, Canada                  © Khomh, 2009
Outline

        Motivation
        Context
        Problems and Objectives
        Previous Work
        Our Solution
        Bayesian Belief Networks
        Experiments
        Discussions
        Conclusion
Motivation

        Independent IV&V team reviews of
        object-oriented software programs
        – Assess the quality of programs
        – Identify potential problems
           • Defects
           • Anomalies
           • “Bad Smells”



Context

        Code smells and antipatterns

        Blob: also called God class [23], a
        class that centralises functionality and
        has too many responsibilities. Brown et
        al. [4] characterise its structure as a large
        controller class that depends on data
        stored in several surrounding data
        classes
Problems and Objectives

        Problem 1: Sizes of programs can be large
        Automate the evaluation process

        Problem 2: Resources are limited
        Prioritise manual reviews

        Problem 3: Quality assessment is subjective
        Consider uncertainty in the process
Previous Work                     (1/2)

        Webster’s book on “antipatterns” [31]
        Riel’s 61 heuristics [23]
        Beck’s 22 code smells [11]
        Brown et al.’s 40 antipatterns [4]
        Travassos et al.’s manual approach [28]
        Marinescu’s detection strategies [17]
        (also Munro [19])
        Moha et al.’s rule cards in DECOR [18]
Previous Work                        (2/2)

        Either no full automation
        – Problem 1 is not solved


        Or strict rules: a class either participates
        in an antipattern or does not (binary value)
        – Problems 2 and 3 are not solved



Our Solution                     (1/2)

        Assign a probability that a class is
        considered an antipattern by a
        stakeholder
        Guide a manual inspection according to
        this probability
        Use a Bayesian belief network to build a
        model of an antipattern and assign a
        probability to all classes
Our Solution                            (2/2)

        Problem 1: Sizes of programs can be large
        Assign a probability to all classes of interest
        automatically

        Problem 2: Resources are limited
        Prioritise manual reviews using probabilities

        Problem 3: Quality assessment is subjective
        Uncertainty is naturally considered using
        Bayesian belief networks
BBN                                     (1/3)

        A Bayesian Belief Network is a directed
        acyclic graph with an associated probability distribution
        Graph structure
         – Nodes = random variables
         – Edges = probabilistic dependencies
        Each node depends only on its parents

        Embodies the experts’ knowledge
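
Stated formally, "each node depends only on its parents" corresponds to the standard factorisation of the joint distribution of a Bayesian network. The notation below (X_i for the random variables, pa(X_i) for the parents of X_i in the graph) is ours and does not appear on the slides:

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)
```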
BBN                               (2/3)

        Classifier
         – C = {smell, not smell}
        Input vector describing a class
         – <a1, …, an>
         – P(A|B) = P(B|A) P(A) / P(B)
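
To make the use of Bayes' theorem concrete, here is a minimal Python sketch of a two-class posterior computation. The function name and all numbers are hypothetical and chosen for illustration only; the slides do not give the actual probability tables.

```python
def posterior_smell(prior, likelihood_given_smell, likelihood_given_not_smell):
    """Return P(smell | evidence) using Bayes' theorem.

    prior                      -- P(smell)
    likelihood_given_smell     -- P(evidence | smell)
    likelihood_given_not_smell -- P(evidence | not smell)
    """
    # P(evidence), by the law of total probability over C = {smell, not smell}
    evidence = (likelihood_given_smell * prior
                + likelihood_given_not_smell * (1.0 - prior))
    if evidence == 0.0:
        raise ValueError("evidence has zero probability under both classes")
    return likelihood_given_smell * prior / evidence


# Hypothetical values: 10% of classes are smells, and the observed
# attribute values are four times more likely for a smell than for a non-smell.
p = posterior_smell(prior=0.1,
                    likelihood_given_smell=0.8,
                    likelihood_given_not_smell=0.2)
print(f"P(smell | evidence) = {p:.3f}")
```

With these made-up inputs the posterior is 0.08 / 0.26, so the evidence raises the belief from the 10% prior without making the class a certain smell.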




BBN                                       (3/3)

        Building a BBN
         1. Define its structure
         2. Assign/learn its probability tables




1. Defining the BBN Structure

         Based on Moha et al.’s rule cards
         – Detection relies on metrics, binary class
           relations, and linguistic analysis
         – Values are combined using set operators
         – Thresholds are statistically defined
           (using a box plot)
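
The slide only says that thresholds come from a box plot. As a sketch, the following Python code assumes the conventional Tukey rule (values above Q3 + 1.5 × IQR are outliers) to decide which metric values count as "very high". The rule, the function names, and the sample values are assumptions for illustration, not taken from DECOR.

```python
import statistics


def upper_fence(values):
    """Upper box-plot fence: Q3 + 1.5 * IQR (Tukey's convention, assumed here)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def very_high(values):
    """Return the values lying above the upper fence."""
    fence = upper_fence(values)
    return [v for v in values if v > fence]


# Hypothetical NMD + NAD values for ten classes of a program
sizes = [5, 7, 8, 10, 12, 13, 15, 18, 20, 120]
print(upper_fence(sizes))
print(very_high(sizes))
```

Because the fence is derived from the distribution of the metric in the analysed program, the same rule card adapts to programs with different typical class sizes.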



Rule Card of Blob
RULE_CARD : Blob {
   RULE : Blob { ASSOC : associated FROM : mainClass ONE TO : DataClass MANY } ;
   RULE : MainClass { UNION LargeClassLowCohesion ControllerClass } ;
   RULE : LargeClassLowCohesion { UNION LargeClass LowCohesion } ;
   RULE : LargeClass { ( METRIC : NMD + NAD, VERY_HIGH, 0 ) } ;
   RULE : LowCohesion { ( METRIC : LCOM5, VERY_HIGH, 20 ) } ;
   RULE : ControllerClass { UNION
       ( SEMANTIC : METHODNAME, { Process, Control, Ctrl, Command, Cmd,
                                  Proc, UI, Manage, Drive } )
       ( SEMANTIC : CLASSNAME, { Process, Control, Ctrl, Command, Cmd,
                                 Proc, UI, Manage, Drive, System, Subsystem } ) } ;
   RULE : DataClass { ( STRUCT : METHOD_ACCESSOR, 90 ) } ;
};

Conversion into BBN

         Set operators (OR, AND) are learned
         using conditional probabilities
         Hard thresholds and rules are smoothed
         by interpolating values




         Binary class relations are counted and
         treated as metrics
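
One simple way to smooth a hard threshold by interpolation is sketched below in Python. It assumes a linear ramp between two bounds; the slides do not specify the interpolation actually used, and the name `prob_high` and the bounds are hypothetical.

```python
def prob_high(value, low, high):
    """Map a metric value to a probability of being 'high'.

    Below `low` the probability is 0, above `high` it is 1, and in
    between it grows linearly. A hard threshold would instead jump
    from 0 to 1 at a single cut-off value.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


# Hypothetical bounds for a size metric
for size in (5, 15, 25):
    print(size, prob_high(size, low=10, high=20))
```

With such a ramp, a class just below the cut-off still receives a non-zero probability instead of being excluded outright, which is the point of replacing strict rules with probabilities.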
BBN Structure for the Blob




2. Resulting BBN for the Blob




Experiments

         RQ1: To what extent is a model built with
         a BBN, based on an existing rule-based
         model, able to detect smells in a
         program?

         RQ2: Is a model built with a BBN better
         than a state-of-the-art approach,
         DECOR?
Settings

         Programs
          – Xerces v2.7.0
          – Gantt Project v1.10.2


         Oracle: Vote of three groups
         (of students) on both programs


Settings

         Calibration (probability tables)
          – Local (same-system calibration)
          – External (calibrated on one system,
            applied to the other)




Local calibration

           On Xerces only
           Three-fold cross validation
            – 3 groups with 5 Blobs in each
            – Combination of 2 groups for calibration,
              tested on the other
                         Inspected Classes
        Group1           6
        Group2           7
        Group3           9

        Waste of 8% to 44% of effort (average 32%)
External calibration

         Probabilities learned on one program, applied to the other
         Precision vs. Recall (per investigation size)
         Xerces (left), Gantt (right)
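
Precision and recall per investigation size can be computed by ranking classes by their probability and inspecting only the top k. The Python sketch below uses the standard definitions; the class names, probabilities, and oracle are invented for illustration and are not the experimental data.

```python
def precision_recall_at(ranked, true_smells, k):
    """Precision and recall when only the k top-ranked classes are inspected."""
    if k <= 0:
        raise ValueError("k must be positive")
    inspected = ranked[:k]
    hits = sum(1 for cls in inspected if cls in true_smells)
    precision = hits / len(inspected) if inspected else 0.0
    recall = hits / len(true_smells) if true_smells else 0.0
    return precision, recall


# Hypothetical BBN output: probability of being a Blob, per class
probabilities = {"A": 0.9, "B": 0.7, "C": 0.4, "D": 0.2}
# Classes sorted from most to least probable
ranked = sorted(probabilities, key=probabilities.get, reverse=True)
# Hypothetical oracle: classes voted as Blobs
oracle = {"A", "C"}

for k in range(1, len(ranked) + 1):
    print(k, precision_recall_at(ranked, oracle, k))
```

Plotting these pairs for increasing k gives a precision vs. recall curve of the kind the slide refers to.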




Experiments

         RQ1: To what extent is a model built with
         a BBN, based on an existing rule-based
         model, able to detect smells in a
         program?
         RQ2: Is a model built with a BBN better
         than a state-of-the-art approach,
         DECOR?
         Better than DECOR
Discussions                        (1/2)

         Results of BBN are superior to those of
         DECOR
         External calibration yields good results
         even though tweaking is recommended
         A well calibrated model should be able
         to estimate the number of smells in a
         program, but our models overestimate
         by a factor of 2
Discussions                        (2/2)

         Used simple techniques to convert rules
         into probability distributions
         Worked with discrete probabilities, might
         need to be adapted to continuous cases




Conclusion

         Bayesian models support uncertainty in
         the detection process
         Their performance is better than
         state-of-the-art rules
         External calibration seems to indicate
         that this could be used in an industrial
         context with minimal costs
         Support for automated conversion of
         rules to support more design smells
Acknowledgements

         Naouel Moha for providing her data and
         code to compare with DECOR

         Natural Sciences and Engineering
         Research Council of Canada
         Canada Research Chair in Software
         Patterns and Patterns of Software

Advertisement

         IEEE Software
         Special issue on Software Evolution

         Deadline: 1st of November, 2009
         Publication: July/August 2010

         Please do not hesitate to contact me! ☺
