SlideShare une entreprise Scribd logo
1  sur  11
Télécharger pour lire hors ligne
Do Comments Explain Code Adequately?
    -Investigation by Text Filtering-


    Yukinao Hirata, Osamu Mizuno
     Kyoto Institute of Technology	



                                       1
Brief Summary	

    We investigated in which projects (Eclipse
     and Netbeans) comment explains code
     adequately.
         By probability of fault-proneness using text-
          filtering method for comment and code.
    We concluded that Eclipse has more
     adequate comment.


                                                          2
Introduction

    The comment in the program is used for
     coding and maintenance.
    The wrong description in comments may
     cause further bugs and confusion in
     maintenance.
    A method to investigate whether comments
     are adequate is necessary.


                                                3
Fault-prone Filtering

    Fault-prone filtering [1] is a fault-prone
     module detection method inspired by spam
     mail filtering.	
               module M
                      	
                                        spam filter
                                                  	
                                                           Probability to faulty	




[1] O. Mizuno and T. Kikuno. Training on errors experiment to detect fault-prone software
modules by spam filter. In Proc. of 6th joint meeting of the European software engineering
conference and the ACM SIGSOFT symposium on the foundations of software engineering, 4
pages 405–414, 2007.
Definition of Comment lines	

    There are three types of comments in Java.
         /* comment */
         /** comment */
         // comment


 In this research …
 Comment line includes all types of comments
Experiments

1.      Extract comment and code from module
2.      Apply comment and code through spam filter
3.      Calculate distance between code and comment
                       comment
                             	
   spam filter	


      module 	
                       studied by
                                      older modules	
                       code
                                  spam filter	


                  1.                    2.              3.
                                                             6
Distance between Code and Comment

    A distance between code and comment is defined
     as follow:




    We assume that D(M) shows the degree of
     adequateness of the comment for code.            7
Target Projects
          Table 1: The group 1 of mining challenge
Project                    Eclipse                 Netbeans
Number of
                         308,966                  393,531
modules
Number of
                         159,818     (51.7%)      165,560    (42.1%)
faulty modules
Number of non-
                         149,148     (48.3%)       227,971   (57.9%)
faulty modules
Total lines         159,175,979                130,619,502

Comment lines         52,417,180     (32.9%)    36,272,795   (27.8%)

Code lines          106,758,799      (67.1%)    94,346,707   (72.2%) 8
Results
       Table 2: Prediction Results in Eclipse and Netbeans
                        Eclipse                Netbeans
               Accuracy Precision Recall Accuracy Precision Recall
Comments         0.651        0.947 0.345     0.658     0.944 0.199

Code             0.838        0.927 0.746     0.857     0.857 0.857

       Table 3: Average Distance in Eclipse and Netbeans
                      Eclipse                  Netbeans
             Correspondence    Average      Correspondence Average
Comments
  vs Code             0.738         0.256             0.707      0.290
                                                                      9
Conclusions

    Comments in Eclipse is more adequately
     described than in Netbeans




                                              10
The end	




            11

Contenu connexe

Similaire à Mining Challenge 2011 by Yukinao Hirata

Software architacture recovery
Software architacture recoverySoftware architacture recovery
Software architacture recoveryImdad Ul Haq
 
The Last Line Effect
The Last Line EffectThe Last Line Effect
The Last Line EffectAndrey Karpov
 
Robust Malware Detection using Residual Attention Network
Robust Malware Detection using Residual Attention NetworkRobust Malware Detection using Residual Attention Network
Robust Malware Detection using Residual Attention NetworkShamika Ganesan
 
THE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELS
THE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELSTHE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELS
THE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELSIJSEA
 
A fast static analysis approach to detect exploit code inside network flows
A fast static analysis approach to detect exploit code inside network flowsA fast static analysis approach to detect exploit code inside network flows
A fast static analysis approach to detect exploit code inside network flowsUltraUploader
 
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...IRJET Journal
 
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...IRJET Journal
 
Cyclomatic complexity
Cyclomatic complexityCyclomatic complexity
Cyclomatic complexityAngelo Trozzo
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecturevamsi krishna
 
An Efficient Framework for Detection & Classification of IoT BotNet.pptx
An Efficient Framework for Detection & Classification of IoT BotNet.pptxAn Efficient Framework for Detection & Classification of IoT BotNet.pptx
An Efficient Framework for Detection & Classification of IoT BotNet.pptxSandeep Maurya
 
A method for detecting obfuscated calls in malicious binaries
A method for detecting obfuscated calls in malicious binariesA method for detecting obfuscated calls in malicious binaries
A method for detecting obfuscated calls in malicious binariesUltraUploader
 
Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models IJECEIAES
 
Crosscutting Specification Interference Detection at Aspect Oriented UML-Base...
Crosscutting Specification Interference Detection at Aspect Oriented UML-Base...Crosscutting Specification Interference Detection at Aspect Oriented UML-Base...
Crosscutting Specification Interference Detection at Aspect Oriented UML-Base...IJERA Editor
 
IRJET- Code Cloning using Abstract Syntax Tree
IRJET- Code Cloning using Abstract Syntax TreeIRJET- Code Cloning using Abstract Syntax Tree
IRJET- Code Cloning using Abstract Syntax TreeIRJET Journal
 
Multi step automated refactoring for code smell
Multi step automated refactoring for code smellMulti step automated refactoring for code smell
Multi step automated refactoring for code smelleSAT Publishing House
 
Multi step automated refactoring for code smell
Multi step automated refactoring for code smellMulti step automated refactoring for code smell
Multi step automated refactoring for code smelleSAT Journals
 
Automatic binary deobfuscation
Automatic binary deobfuscationAutomatic binary deobfuscation
Automatic binary deobfuscationUltraUploader
 

Similaire à Mining Challenge 2011 by Yukinao Hirata (20)

Software architacture recovery
Software architacture recoverySoftware architacture recovery
Software architacture recovery
 
The Last Line Effect
The Last Line EffectThe Last Line Effect
The Last Line Effect
 
Robust Malware Detection using Residual Attention Network
Robust Malware Detection using Residual Attention NetworkRobust Malware Detection using Residual Attention Network
Robust Malware Detection using Residual Attention Network
 
4213ijsea04
4213ijsea044213ijsea04
4213ijsea04
 
THE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELS
THE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELSTHE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELS
THE REMOVAL OF NUMERICAL DRIFT FROM SCIENTIFIC MODELS
 
A fast static analysis approach to detect exploit code inside network flows
A fast static analysis approach to detect exploit code inside network flowsA fast static analysis approach to detect exploit code inside network flows
A fast static analysis approach to detect exploit code inside network flows
 
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
 
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
 
Cyclomatic complexity
Cyclomatic complexityCyclomatic complexity
Cyclomatic complexity
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
 
An Efficient Framework for Detection & Classification of IoT BotNet.pptx
An Efficient Framework for Detection & Classification of IoT BotNet.pptxAn Efficient Framework for Detection & Classification of IoT BotNet.pptx
An Efficient Framework for Detection & Classification of IoT BotNet.pptx
 
A method for detecting obfuscated calls in malicious binaries
A method for detecting obfuscated calls in malicious binariesA method for detecting obfuscated calls in malicious binaries
A method for detecting obfuscated calls in malicious binaries
 
C04701019027
C04701019027C04701019027
C04701019027
 
Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models Finding Bad Code Smells with Neural Network Models
Finding Bad Code Smells with Neural Network Models
 
Crosscutting Specification Interference Detection at Aspect Oriented UML-Base...
Crosscutting Specification Interference Detection at Aspect Oriented UML-Base...Crosscutting Specification Interference Detection at Aspect Oriented UML-Base...
Crosscutting Specification Interference Detection at Aspect Oriented UML-Base...
 
IRJET- Code Cloning using Abstract Syntax Tree
IRJET- Code Cloning using Abstract Syntax TreeIRJET- Code Cloning using Abstract Syntax Tree
IRJET- Code Cloning using Abstract Syntax Tree
 
Ijetr012045
Ijetr012045Ijetr012045
Ijetr012045
 
Multi step automated refactoring for code smell
Multi step automated refactoring for code smellMulti step automated refactoring for code smell
Multi step automated refactoring for code smell
 
Multi step automated refactoring for code smell
Multi step automated refactoring for code smellMulti step automated refactoring for code smell
Multi step automated refactoring for code smell
 
Automatic binary deobfuscation
Automatic binary deobfuscationAutomatic binary deobfuscation
Automatic binary deobfuscation
 

Mining Challenge 2011 by Yukinao Hirata

  • 1. Do Comments Explain Code Adequately? -Investigation by Text Filtering- Yukinao Hirata, Osamu Mizuno Kyoto Institute of Technology 1
  • 2. Brief Summary   We investigated in which projects (Eclipse and Netbeans) comment explains code adequately.   By probability of fault-proneness using text- filtering method for comment and code.   We concluded that Eclipse has more adequate comment. 2
  • 3. Introduction   The comment in the program is used for coding and maintenance.   The wrong description in comments may cause further bugs and confusion in maintenance.   A method to investigate whether comments are adequate is necessary. 3
  • 4. Fault-prone Filtering   Fault-prone filtering [1] is a fault-prone module detection method inspired by spam mail filtering. module M spam filter Probability to faulty [1] O. Mizuno and T. Kikuno. Training on errors experiment to detect fault-prone software modules by spam filter. In Proc. of 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering, 4 pages 405–414, 2007.
  • 5. Definition of Comment lines   There are three types of comments in Java.   /* comment */   /** comment */   // comment In this research … Comment line includes all types of comments
  • 6. Experiments 1.  Extract comment and code from module 2.  Apply comment and code through spam filter 3.  Calculate distance between code and comment comment spam filter module studied by older modules code spam filter 1. 2. 3. 6
  • 7. Distance between Code and Comment   A distance between code and comment is defined as follow:   We assume that D(M) shows the degree of adequateness of the comment for code. 7
  • 8. Target Projects Table 1: The group 1 of mining challenge Project Eclipse Netbeans Number of 308,966 393,531 modules Number of 159,818 (51.7%) 165,560 (42.1%) faulty modules Number of non- 149,148 (48.3%) 227,971 (57.9%) faulty modules Total lines 159,175,979 130,619,502 Comment lines 52,417,180 (32.9%) 36,272,795 (27.8%) Code lines 106,758,799 (67.1%) 94,346,707 (72.2%) 8
  • 9. Results Table 2: Prediction Results in Eclipse and Netbeans Eclipse Netbeans Accuracy Precision Recall Accuracy Precision Recall Comments 0.651 0.947 0.345 0.658 0.944 0.199 Code 0.838 0.927 0.746 0.857 0.857 0.857 Table 3: Average Distance in Eclipse and Netbeans Eclipse Netbeans Correspondence Average Correspondence Average Comments vs Code 0.738 0.256 0.707 0.290 9
  • 10. Conclusions   Comments in Eclipse is more adequately described than in Netbeans 10
  • 11. The end 11