SlideShare une entreprise Scribd logo
1  sur  7
Télécharger pour lire hors ligne
Mining Java Class Naming Conventions

           Simon Butler, Michel Wermelinger, Yijun Yu & Helen Sharp

                                      Centre for Research in Computing
                                            The Open University


                                          27 September 2011




           Centre for
           Research in Computing                                 m.a.wermelinger@open.ac.uk


Butler et al. (The Open University)     Mining Java Class Naming Conventions   27 September 2011   1/7
Class Identifier Names

         Despite the importance of
         class identifier names                                AbstractCollection           Set
         knowledge of their structure
         is limited
          adjective ∗ noun +
         approximation found to be                                           AbstractSet
         useful, but not universal
         What other part-of-speech
         patterns are commonly used?
         How are component words
                                                             EnumSet          HashSet         TreeSet
         repeated? How often?
         Are there project-specific
         naming conventions?


Butler et al. (The Open University)   Mining Java Class Naming Conventions          27 September 2011   2/7
Distribution of Java Classes in Inheritance Categories



                                                                     0.7
                                                                     0.6
                  Proportion of inheritance categories per project

                                                                     0.5
                                                                     0.4
                                                                     0.3
                                                                     0.2
                                                                     0.1
                                                                     0.0




                                                                           E0I0       E0I1       E0In       E1I0         E1I1   E1In



Butler et al. (The Open University)                                               Mining Java Class Naming Conventions             27 September 2011   3/7
Part-of-Speech Patterns
                Relative frequency of most common PoS patterns
                                             noun +
                                adjective +              verb +
                     noun +           +      adjective +
                                noun                     noun +
                                             noun +
             E0 I 0             0.85                 0.08                     0.01        0.01
             E0 I 1             0.73                 0.15                     0.02        0.02
             E0 I n             0.75                 0.15                     0.03        0.01
             E1 I 0             0.68                 0.12                     0.04        0.03
             E1 I 1             0.70                 0.15                     0.04        0.02
             E1 I n             0.75                 0.14                     0.04        0.02

       4 basic patterns account for 90% of class identifier names
       85% of E0 I0 class identifier names are composed of nouns
       The adjective ∗ noun + approximation includes 85% of class
       identifier names
Butler et al. (The Open University)    Mining Java Class Naming Conventions          27 September 2011   4/7
Component Word Inheritance
              Relative frequency distribution of name inheritance
                               Super Class Name                Interface Name
           Category             All    Fragment                 All Fragment        Both
           E0 I1                  -                    -       0.39          0.37        -
           E0 In                  -                    -       0.38          0.40        -
           E1 I0               0.23                 0.58          -             -        -
           E1 I1               0.14                 0.53       0.24          0.21     0.27
           E1 In               0.11                 0.50       0.15          0.25     0.18


       Fragments of super class name most commonly repeated
       Most common patterns:
               E0 I1 & E0 I1 : noun + interface name , noun + interface fragment
               E1 I0 : noun + super class fragment , noun + super class name
               E1 I1 & E1 In : noun + super class fragment ,
                interface name super class fragment , noun + super class name
Butler et al. (The Open University)   Mining Java Class Naming Conventions     27 September 2011   5/7
Case Study - Freemind

       652 class identifier names
       53 (8%) with uncommon PoS patterns
       Each class inspected with questions:
          1. Is the class identifier name a clear description of the class?
          2. Can the class identifier name be refactored to a more common PoS
             pattern?
          3. Can the class be refactored into classes that could be more
             conventionally named?
       We found:
               Class identifier names describing GUI actions initiated by the user, e.g.
               SelectAllAction ( verb determiner noun )
               Class identifier names that conform to local naming conventions
               7 class identifier names were candidates for name refactoring
               1 class was a candidate for refactoring



Butler et al. (The Open University)   Mining Java Class Naming Conventions   27 September 2011   6/7
Conclusions


       Contributions
               Identification of common PoS structures found in praxis
               Identification of common patterns of component word repetition
               Unconventional class names:
                       may conform to local naming conventions
                       may be candidates for refactoring
                       may indicate smells

       Practical Applications
               Recovery of class naming conventions
               Identification of unconventionally named classes
               Class identifier name recommendation systems




Butler et al. (The Open University)   Mining Java Class Naming Conventions   27 September 2011   7/7

Contenu connexe

En vedette

ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...ICSM 2011
 
Components - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API LimitationsComponents - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API LimitationsICSM 2011
 
ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild ICSM 2011
 
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...ICSM 2011
 
Faults and Regression Testing - Fault interaction and its repercussions
Faults and Regression Testing - Fault interaction and its repercussionsFaults and Regression Testing - Fault interaction and its repercussions
Faults and Regression Testing - Fault interaction and its repercussionsICSM 2011
 
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...ICSM 2011
 
Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...ICSM 2011
 
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...ICSM 2011
 
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...ICSM 2011
 
ERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to TaskERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to TaskICSM 2011
 
Impact analysis - A Seismology-inspired Approach to Study Change Propagation
Impact analysis - A Seismology-inspired Approach to Study Change PropagationImpact analysis - A Seismology-inspired Approach to Study Change Propagation
Impact analysis - A Seismology-inspired Approach to Study Change PropagationICSM 2011
 
Postdoc Symposium - Abram Hindle
Postdoc Symposium - Abram HindlePostdoc Symposium - Abram Hindle
Postdoc Symposium - Abram HindleICSM 2011
 
Industry - Estimating software maintenance effort from use cases an indu...
Industry - Estimating software maintenance effort from use cases an      indu...Industry - Estimating software maintenance effort from use cases an      indu...
Industry - Estimating software maintenance effort from use cases an indu...ICSM 2011
 
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...ICSM 2011
 
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...ICSM 2011
 
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...ICSM 2011
 
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...ICSM 2011
 
Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...ICSM 2011
 
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...ICSM 2011
 
ERA - Tracking Technical Debt
ERA - Tracking Technical DebtERA - Tracking Technical Debt
ERA - Tracking Technical DebtICSM 2011
 

En vedette (20)

ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
ERA Poster - Measuring Disruption from Software Evolution Activities Using Gr...
 
Components - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API LimitationsComponents - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API Limitations
 
ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild
 
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
ERA - Measuring Disruption from Software Evolution Activities Using Graph-Bas...
 
Faults and Regression Testing - Fault interaction and its repercussions
Faults and Regression Testing - Fault interaction and its repercussionsFaults and Regression Testing - Fault interaction and its repercussions
Faults and Regression Testing - Fault interaction and its repercussions
 
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
 
Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...Industry - The Evolution of Information Systems. A Case Study on Document Man...
Industry - The Evolution of Information Systems. A Case Study on Document Man...
 
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
Impact Analysis - ImpactScale: Quantifying Change Impact to Predict Faults in...
 
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
 
ERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to TaskERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to Task
 
Impact analysis - A Seismology-inspired Approach to Study Change Propagation
Impact analysis - A Seismology-inspired Approach to Study Change PropagationImpact analysis - A Seismology-inspired Approach to Study Change Propagation
Impact analysis - A Seismology-inspired Approach to Study Change Propagation
 
Postdoc Symposium - Abram Hindle
Postdoc Symposium - Abram HindlePostdoc Symposium - Abram Hindle
Postdoc Symposium - Abram Hindle
 
Industry - Estimating software maintenance effort from use cases an indu...
Industry - Estimating software maintenance effort from use cases an      indu...Industry - Estimating software maintenance effort from use cases an      indu...
Industry - Estimating software maintenance effort from use cases an indu...
 
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
Industry - Precise Detection of Un-Initialized Variables in Large, Real-life ...
 
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
 
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
 
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
 
Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...
 
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
 
ERA - Tracking Technical Debt
ERA - Tracking Technical DebtERA - Tracking Technical Debt
ERA - Tracking Technical Debt
 

Dernier

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 

Dernier (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

Natural Language Analysis - Mining Java Class Naming Conventions

  • 1. Mining Java Class Naming Conventions Simon Butler, Michel Wermelinger, Yijun Yu & Helen Sharp Centre for Research in Computing The Open University 27 September 2011 Centre for Research in Computing m.a.wermelinger@open.ac.uk Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 1/7
  • 2. Class Identifier Names Despite the importance of class identifier names AbstractCollection Set knowledge of their structure is limited adjective ∗ noun + approximation found to be AbstractSet useful, but not universal What other part-of-speech patterns are commonly used? How are component words EnumSet HashSet TreeSet repeated? How often? Are there project-specific naming conventions? Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 2/7
  • 3. Distribution of Java Classes in Inheritance Categories 0.7 0.6 Proportion of inheritance categories per project 0.5 0.4 0.3 0.2 0.1 0.0 E0I0 E0I1 E0In E1I0 E1I1 E1In Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 3/7
  • 4. Part-of-Speech Patterns Relative frequency of most common PoS patterns noun + adjective + verb + noun + + adjective + noun noun + noun + E0 I 0 0.85 0.08 0.01 0.01 E0 I 1 0.73 0.15 0.02 0.02 E0 I n 0.75 0.15 0.03 0.01 E1 I 0 0.68 0.12 0.04 0.03 E1 I 1 0.70 0.15 0.04 0.02 E1 I n 0.75 0.14 0.04 0.02 4 basic patterns account for 90% of class identifier names 85% of E0 I0 class identifier names are composed of nouns The adjective ∗ noun + approximation includes 85% of class identifier names Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 4/7
  • 5. Component Word Inheritance Relative frequency distribution of name inheritance Super Class Name Interface Name Category All Fragment All Fragment Both E0 I1 - - 0.39 0.37 - E0 In - - 0.38 0.40 - E1 I0 0.23 0.58 - - - E1 I1 0.14 0.53 0.24 0.21 0.27 E1 In 0.11 0.50 0.15 0.25 0.18 Fragments of super class name most commonly repeated Most common patterns: E0 I1 & E0 I1 : noun + interface name , noun + interface fragment E1 I0 : noun + super class fragment , noun + super class name E1 I1 & E1 In : noun + super class fragment , interface name super class fragment , noun + super class name Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 5/7
  • 6. Case Study - Freemind 652 class identifier names 53 (8%) with uncommon PoS patterns Each class inspected with questions: 1. Is the class identifier name a clear description of the class? 2. Can the class identifier name be refactored to a more common PoS pattern? 3. Can the class be refactored into classes that could be more conventionally named? We found: Class identifier names describing GUI actions initiated by the user, e.g. SelectAllAction ( verb determiner noun ) Class identifier names that conform to local naming conventions 7 class identifier names were candidates for name refactoring 1 class was a candidate for refactoring Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 6/7
  • 7. Conclusions Contributions Identification of common PoS structures found in praxis Identification of common patterns of component word repetition Unconventional class names: may conform to local naming conventions may be candidates for refactoring may indicate smells Practical Applications Recovery of class naming conventions Identification of unconventionally named classes Class identifier name recommendation systems Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 7/7