SlideShare une entreprise Scribd logo
1  sur  11
Télécharger pour lire hors ligne
International Journal of Computer and Technology (IJCET), ISSN 0976 – 6367(Print),
International Journal of Computer Engineering Engineering
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME
and Technology (IJCET), ISSN 0976 – 6367(Print)
ISSN 0976 – 6375(Online) Volume 1                                       IJCET
Number 1, May - June (2010), pp. 261-271                            ©IAEME
© IAEME, http://www.iaeme.com/ijcet.html


       ART OF SOFTWARE DEFECT ASSOCIATION &
    CORRECTION USING ASSOCIATION RULE MINING
               Nilamadhab Mishra, MCA, M.PHIL (C S), M.Tech (CSE)
                    Asst.prof, Krupajal Group of Institution, Orissa
                     E-mail : nilamadhab.mishra@rediffmail.com

ABSTRACT:
       Much current software defect prediction work focuses on the number of defects
remaining in a software system. In this paper, I focus about association rule mining based
methods to predict defect associations and defect correction effort. This is to help
developers detect software defects and assist project managers in allocating testing
resources more effectively. Software process improvement method is based on the
analysis of defect data. I applied the proposed methods to the SEL defect data consisting
of more than 200 projects over more than 15 years. The method is classification of
software defects using attributes which relate defects to specific process activities. The
results show that, for defect association prediction, the accuracy is very high and the
false-negative rate is very low. The comparison of defect correction effort prediction
method with other types of methods—PART, C4.5, and Naı¨ve Bays show that accuracy
can be improved. I also evaluate the impact of support and confidence levels on
prediction accuracy, false negative rate, false-positive rate, and the number of rules. Thus
higher support and confidence levels may not result in higher prediction accuracy and a
sufficient number of rules must be a precondition for high prediction accuracy.
Key words: Defect association, defect correction effort, defect isolation effort, defect
association prediction, Software defect prediction
1. INTRODUCTION
       The success of a software system depends not only on cost and schedule, but also
on quality. Among many software quality characteristics, residual defects have become
the de facto industry standard [1, 2]. Therefore, the prediction of software defects, i.e.,



                                            261
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME


deviations from specifications or expectations which might lead to failures in operation,
has been an important research topic in the field of software engineering for more than 30
years. Clearly, they are a proxy for reliability, but, unfortunately, reliability is extremely
difficult to assess prior to full deployment. Current defect prediction work focuses on
estimating the number of defects remaining in software systems with code metrics,
inspection data, and process-quality data by statistical approaches, capture-recapture
(CR) models, and detection profile methods (DPM)[4 ].
        The prediction result, which is the number of defects remaining in a software
system, can be used as an important measure for the software developer, and can be used
to control the software process         and gauge the likely delivered quality of a software
system. In contrast, propose that the defects found during production are a manifestation
of process deficiencies, so they present a case study of the use of a defect based method
for software in-process improvement. In particular, they use an attribute-focusing (AF)
method to discover associations among defect attributes such as defect type, source, and
phase introduced, phase found, component, impact, etc [9]. By finding out the event that
could have led to the associations, they identify a process problem and implement a
corrective action. This can lead a project team to improve its process during
development. I restrict my work to the predictions of defect (type) associations and
corresponding defect correction effort.
1. For the given defect(s), what other defect(s) may co-occur?
2. In order to correct the defect(s), how much effort will be consumed?
        I use defect type data to predict software defect associations that are the relations
among different defect types such as: If defects a and b occur, then defect c also will
occur. This is formally written as a ^ b => c. The defect Associations can be used for
three purposes [3, 4].
        First, find as many related defects as possible to the detected defect(s) and,
consequently, make more-effective corrections to the software. For example, consider the
situation where we have classes of defect a, b, and c and suppose the rule a ^ b => c has
been obtained from a historical data set, and the defects of class a and b have been
detected occurring together, but no defect of class c has yet been discovered. The rule
indicates that a defect of class c is likely to have occurred as well and indicates that we


                                                262
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME


should check the corresponding software artifact to see whether or not such a defect
really exists. If the result is positive, the search can be continued, if rule a ^ b ^ c =>d
holds as well, we can do the same thing for defect d.
        Second, help evaluate reviewers’ results during an inspection. For example, if rule
a ^ b => c holds but a reviewer has only found defects a and b, it is possible that he
missed defect c. Thus, a recommendation might be that his/her work should be reinserted
for completeness.
        Third, to assist managers in improving the software process through analysis of
the reasons some defects Frequently occur together. If the analysis leads to the
identification of a process problem, managers have to come up with a corrective action.
        At the same time, for each of the associated defects, we also predict the likely
effort required to isolate and correct it. This can be used to help project managers
improve control of project schedules. Both of our defect association prediction and defect
correction effort prediction methods are based on the association rule mining method
which was first explored by Agrawal. Association rule mining aims to discover the
patterns of co-occurrences of the attributes in a database [5, 6]. However, it must be
stressed that associations do not imply causality. An association rule is an expression A--
>C, where A (Antecedent) and C (Consequent) are sets of items. The meaning of such
rules is quite intuitive: Given a database D of transactions, where each transaction T 2D
is a set of items, A-->C expresses that whenever a transaction T contains A, then T also
contains C with a specified confidence. The rule confidence is defined as the percentage
of transactions containing C in addition to A with regard to the overall number of
transactions containing A. The idea of mining association rules originates from the
analysis of market-basket data where rules like “customers who buy products p1 and p2
will also buy product p3 with probability c percent” are extracted. Their direct
applicability to business problems together with their inherent understandability make
association rule mining a popular data mining method.




                                                263
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME



2. PAPER CONTENTS
General method:
        The objective of the study is to discover software defect associations from
historical software engineering data sets, and help determine whether or not a defect(s) is
accompanied by other defect(s). If so, we attempt to determine what these defects are and
how much effort might be expected to be used when we correct them. Finally, we aim to
help detect software defects and effectively improve software control. For this purpose,
first, we preprocessed the NASA SEL data set (see the following section for details), and
obtained three data sets: the defect data set, the defect isolation effort data set, and the
defect correction effort data set. Then, for each of these data sets, we randomly extracted
five pairs of training and test data sets as the basis of the research. After that, we used
association rule mining based methods to discover defect associations and the patterns
between a defect and the corresponding defect isolation/correction effort. Finally, we
predicted the defect(s) attached to the given defect(s) and the effort used to correct each
of these defects. We also compared the results with alternative methods where applicable.
Data Source and Data Extraction:
        The data we used is SEL Data which is a subset of the online database created by
the NASA/GSFC Software Engineering Laboratory (SEL) for storage and retrieval of
software engineering data for NASA Goddard Space Flight Center. The subset includes
defect data of more than 200 projects completed over more than 15 years. The SEL Data
is a database consisting of 15 tables that provide data on the projects’ software
characteristics (overall and at a component level), changes and errors during all phases of
development, and effort and computer resources used. In the SEL Data, the defects are
divided into six types, Table 1 contains the details. In addition, the effort used to correct
defects falls into four categories: One Hour or Less, One Hour to One Day, One Day to
Three Days, and More Than Three Days.
Analysis approach:
        I use the five-fold cross-validation method as the overall analysis approach. That
is, for each D of the defect data set, the defect isolation effort data set, and the defect
correction effort data set, the inducer is trained and tested a total of five times. I use the


                                                264
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME


association rule mining method to learn rules from the training data sets. For defect
association prediction, the rule learning is straightforward, while for defect correction
effort prediction, it is more complicated because the consequent of a rule has to be defect
correction effort [7, 8]. Considering the target of association rule mining is not
predetermined and classification rule mining has only one predetermined target, the class,
I integrate these two techniques to learn defect correction effort prediction rules by
focusing on a special subset of association rules whose consequents are restricted to the
special attribute, the effort. Once we obtain the rules, we rank them, and use them to
predict the defect associations and defect correction effort with the corresponding test
data sets. The predictions of defect associations and defect correction effort are both
based on the length-first (in terms of the number of items in a rule) strategy.
Association Rule Discovery:
         Both positive and negative association rules are to be discovered by the project.
First, the frequent item sets of potential interest and infrequent item sets of potential
interest are generated in a database. Then Extract positive rules of the form A -->B, and
negative rules of the forms A-->¬B, ¬A-->B, and ¬A             ¬B.
Rule ranking Strategy:
        Before prediction, we rank the discovered rules according to the length-first
strategy. The length-first strategy was used for two reasons. First, for the defect
association prediction, the length-first strategy enables us to find out as many defects as
possible that coincide with known defect(s), thus preventing errors due to incomplete
discovery of defect associations. Second, for the defect correction effort prediction, the
length-first strategy enables us to obtain more-accurate rules, thus improving the effort
prediction accuracy. Specifically, the length-first rule-ranking strategy is as follows:
1. Rank rules according to their length. The longer the rules, the higher the priority.
2. If two rules have the same length, rank them according to their confidence values. The
greater the confidence values, the higher the priority. The more-confident rules have
more predictive power in terms of accuracy; thus, they should have higher priority.




                                                265
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME


3. If two rules have the same confidence values, rank them according to their support
values. The higher the support values, the higher the priority. The rules with higher
support value are more statistically Significant, so they should have higher priority.
4. If two rules have the same support value, rank them in alphabetical order.
2.1 GRAPHS, TABLES, AND PHOTOGRAPHS
Defect Types:
        Here we are concentrated on some specified defects that are the most frequent
defects that may occur in the Sel data.
Defect Types:




                                                266
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME



Rule Ranking Algorithm:




Defect Isolation and Correction Effort Prediction:
The constraint-based association rules mining method is used for the purpose of defect
isolation and correction prediction [10]. The steps are
1. Compute the frequent item sets that occur together in the training data set at            least as
frequently as a predetermined min: Supp. The item sets mined must also contain the
effort labels
2. Generate association rules from the frequent item sets, where the             consequent of the
rules is the effort. In addition to the min: Supp threshold, these rules must also satisfy a
minimum confidence.


                                                267
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME




Effort Prediction Rules:




Defect association prediction procedure.
        We use the association rule mining method to learn rules from the training data
sets. For defect association prediction, the rule learning is straightforward, while for
defect correction effort prediction, it is more complicated because the consequent of a


                                                268
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME


rule has to be defect correction effort. Considering the target of association rule mining is
not predetermined and classification rule mining has only one predetermined target, the
class, we integrate these two techniques to learn defect correction effort prediction rules
by focusing on a special subset of association rules whose consequents are restricted to
the special attribute, the effort. Once we obtain the rules, we rank them, and use them to
predict the defect associations and defect correction effort with the corresponding test
data sets. The predictions of defect associations and defect correction effort are both
based on the length-first strategy.
        There is no work on software defect association prediction and there is no other
method that can be used for this purpose. Therefore, I am unable to directly compare our
defect association prediction method with other studies. As the defect correction effort is
represented in the form of categorical values and there are some attributes to characterize
it, it can be viewed as a classification problem. This allows us to compare our defect
correction prediction method with three different types of methods. These methods are
the Bayesian rule of conditional probability-based method, Naı¨ve Bayes [, the well-
known trees-based method, C4.5, and the simple and effective rules-based method,
PART.
3. CONCLUSION
        In this paper, I have given an application of association rule mining to predict
software defect associations and defect correction effort with SEL defect data. This is
important in order to help developers detect software defects and project managers
improve software control and allocate their testing resources effectively. The ideas have
been tested using the NASA SEL defect data set. From this, I extracted defect data and
the corresponding defect isolation and correction effort data.
        I have also compared the defect correction effort prediction method with three
other types of machine learning methods, namely, PART, C4.5, and Naı¨ve Bayes. The
results show that for defect isolation effort, our accuracy is higher than for the other three
methods b. Likewise, for defect correction effort prediction, the accuracy is higher than
the other three methods.




                                                269
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME


        I also have explored the impact of support and confidence levels on prediction
accuracy, false negative rate, false positive rate, and the number of rules as well. I found
that higher support and confidence levels may not result in higher prediction accuracy,
and a sufficient number of rules is a precondition for high prediction accuracy. While we
do not wish to draw strong conclusions from a single data set study, i believe that my
results suggest that association rule mining may be an attractive technique to the software
engineering community due to its relative simplicity, transparency, and seeming
effectiveness in constructing prediction systems.
REFERENCES:
[1]. Inderpal S. Bhandari, M. J. Halliday, E. Tarver, D. Brown, J. Chaar, and R.
Chillarence. A Case Study of Software Process Improvement during Development. IEEE
Trans. on Soft. Eng., 19(12), pp. 1157-1170, December 1993.
[2] Inderpal S. Bhandari, M. J. Halliday, J. Chaar, R. Chillarence, K. Jones, J. S.
Atkinson, C. Lepori-Costello, P. Y. Jasper, E. D. Tarver, C. C. Lewis, and M. Yonezawa.
In-Process Improvement through Defect Data Interpretation. IBM Syst. Journal, 33(1),
pp. 182-214, 1994.
[3] Xindong Wu, Chengqi Zhang, Shichao Zhang,” Efficient Mining of both positive and
negative association rules”, ACM Transactions on Information Systems, Vol. 22, No. 3,
July 2004, Pages 381–405.
[4] Qinbao Song, Martin Shepperd, Michelle Cartwright, Carolyn Mair, "Software Defect
Association Mining and Defect Correction Effort Prediction," IEEE Transactions on
Software Engineering, vol. 32, no. 2, pp. 69-82, Feb., 2006.
[5] I.S. Bhandari, M.J. Halliday, J. Chaar, R. Chillarenge, K. Jones, J.S. Atkinson, C.
Lepori-Costello, P.Y. Jasper, E.D. Tarver, C.C. Lewis, and M. Yonezawa, “In Process
Improvement through Defect Data Interpretation,” IBM System J., vol. 33, no. 1, pp.
182-214, 1994.
[6] L.C. Briand, K. El-Emam, B.G. Freimut, and O. Laitenberger, “A Comprehensive
Evaluation of Capture-Recapture Models forEstimating Software Defect Content,” IEEE
Trans. Software Eng., vol. 26, no. 6, pp. 518-540, June 2000.




                                                270
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME


[7] T. Compton and C. Withrow, “Prediction and Control of Ada Software Defects,” J.
Systems and Software, vol. 12, pp. 199-207, 1990.
[8] F. Brito e Abreu, Melo, W., "Evaluating the Impact of Object-Oriented Design on
Software Quality", Proceedings of Third International Software Metrics Symposium, pp.
90- 99, 1996.
[9] S. R. Chidamber and C. F. Kemerer, "A Metrics Suite for Object Oriented Design",
IEEE Transactions on Software Engineering, 20(6), pp. 476-493, 1994.
[10] D. ubrani , Murphy, G.C., "Hipikat: recommending pertinent software development
artifacts", Proceedings of International Conference on Software Engineering, pp. 408-
418, 2003.




                                                271

Contenu connexe

Tendances

Towards formulating dynamic model for predicting defects in system testing us...
Towards formulating dynamic model for predicting defects in system testing us...Towards formulating dynamic model for predicting defects in system testing us...
Towards formulating dynamic model for predicting defects in system testing us...Journal Papers
 
IRJET- Data Reduction in Bug Triage using Supervised Machine Learning
IRJET- Data Reduction in Bug Triage using Supervised Machine LearningIRJET- Data Reduction in Bug Triage using Supervised Machine Learning
IRJET- Data Reduction in Bug Triage using Supervised Machine LearningIRJET Journal
 
Application of Genetic Algorithm in Software Engineering: A Review
Application of Genetic Algorithm in Software Engineering: A ReviewApplication of Genetic Algorithm in Software Engineering: A Review
Application of Genetic Algorithm in Software Engineering: A ReviewIRJESJOURNAL
 
Parameter Estimation of GOEL-OKUMOTO Model by Comparing ACO with MLE Method
Parameter Estimation of GOEL-OKUMOTO Model by Comparing ACO with MLE MethodParameter Estimation of GOEL-OKUMOTO Model by Comparing ACO with MLE Method
Parameter Estimation of GOEL-OKUMOTO Model by Comparing ACO with MLE MethodIRJET Journal
 
Software testing effort estimation with cobb douglas function a practical app...
Software testing effort estimation with cobb douglas function a practical app...Software testing effort estimation with cobb douglas function a practical app...
Software testing effort estimation with cobb douglas function a practical app...eSAT Publishing House
 
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...Chakkrit (Kla) Tantithamthavorn
 
The End User Requirement for Project Management Software Accuracy
The End User Requirement for Project Management Software Accuracy The End User Requirement for Project Management Software Accuracy
The End User Requirement for Project Management Software Accuracy IJECEIAES
 
Determination of Optimum Parameters Affecting the Properties of O Rings
Determination of Optimum Parameters Affecting the Properties of O RingsDetermination of Optimum Parameters Affecting the Properties of O Rings
Determination of Optimum Parameters Affecting the Properties of O RingsIRJET Journal
 
BIO-INSPIRED MODELLING OF SOFTWARE VERIFICATION BY MODIFIED MORAN PROCESSES
BIO-INSPIRED MODELLING OF SOFTWARE VERIFICATION BY MODIFIED MORAN PROCESSESBIO-INSPIRED MODELLING OF SOFTWARE VERIFICATION BY MODIFIED MORAN PROCESSES
BIO-INSPIRED MODELLING OF SOFTWARE VERIFICATION BY MODIFIED MORAN PROCESSESIJCSEA Journal
 
A Survey of Software Reliability factor
A Survey of Software Reliability factorA Survey of Software Reliability factor
A Survey of Software Reliability factorIOSR Journals
 
Software Defect Prediction Using Radial Basis and Probabilistic Neural Networks
Software Defect Prediction Using Radial Basis and Probabilistic Neural NetworksSoftware Defect Prediction Using Radial Basis and Probabilistic Neural Networks
Software Defect Prediction Using Radial Basis and Probabilistic Neural NetworksEditor IJCATR
 

Tendances (15)

Towards formulating dynamic model for predicting defects in system testing us...
Towards formulating dynamic model for predicting defects in system testing us...Towards formulating dynamic model for predicting defects in system testing us...
Towards formulating dynamic model for predicting defects in system testing us...
 
IRJET- Data Reduction in Bug Triage using Supervised Machine Learning
IRJET- Data Reduction in Bug Triage using Supervised Machine LearningIRJET- Data Reduction in Bug Triage using Supervised Machine Learning
IRJET- Data Reduction in Bug Triage using Supervised Machine Learning
 
Application of Genetic Algorithm in Software Engineering: A Review
Application of Genetic Algorithm in Software Engineering: A ReviewApplication of Genetic Algorithm in Software Engineering: A Review
Application of Genetic Algorithm in Software Engineering: A Review
 
Parameter Estimation of GOEL-OKUMOTO Model by Comparing ACO with MLE Method
Parameter Estimation of GOEL-OKUMOTO Model by Comparing ACO with MLE MethodParameter Estimation of GOEL-OKUMOTO Model by Comparing ACO with MLE Method
Parameter Estimation of GOEL-OKUMOTO Model by Comparing ACO with MLE Method
 
O0181397100
O0181397100O0181397100
O0181397100
 
Software testing effort estimation with cobb douglas function a practical app...
Software testing effort estimation with cobb douglas function a practical app...Software testing effort estimation with cobb douglas function a practical app...
Software testing effort estimation with cobb douglas function a practical app...
 
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
 
Kumar2021
Kumar2021Kumar2021
Kumar2021
 
D0423022028
D0423022028D0423022028
D0423022028
 
50120140501001
5012014050100150120140501001
50120140501001
 
The End User Requirement for Project Management Software Accuracy
The End User Requirement for Project Management Software Accuracy The End User Requirement for Project Management Software Accuracy
The End User Requirement for Project Management Software Accuracy
 
Determination of Optimum Parameters Affecting the Properties of O Rings
Determination of Optimum Parameters Affecting the Properties of O RingsDetermination of Optimum Parameters Affecting the Properties of O Rings
Determination of Optimum Parameters Affecting the Properties of O Rings
 
BIO-INSPIRED MODELLING OF SOFTWARE VERIFICATION BY MODIFIED MORAN PROCESSES
BIO-INSPIRED MODELLING OF SOFTWARE VERIFICATION BY MODIFIED MORAN PROCESSESBIO-INSPIRED MODELLING OF SOFTWARE VERIFICATION BY MODIFIED MORAN PROCESSES
BIO-INSPIRED MODELLING OF SOFTWARE VERIFICATION BY MODIFIED MORAN PROCESSES
 
A Survey of Software Reliability factor
A Survey of Software Reliability factorA Survey of Software Reliability factor
A Survey of Software Reliability factor
 
Software Defect Prediction Using Radial Basis and Probabilistic Neural Networks
Software Defect Prediction Using Radial Basis and Probabilistic Neural NetworksSoftware Defect Prediction Using Radial Basis and Probabilistic Neural Networks
Software Defect Prediction Using Radial Basis and Probabilistic Neural Networks
 

En vedette

Speech processing strategies for cochlear prostheses the past, present and fu...
Speech processing strategies for cochlear prostheses the past, present and fu...Speech processing strategies for cochlear prostheses the past, present and fu...
Speech processing strategies for cochlear prostheses the past, present and fu...iaemedu
 
Extraction of qrs complexes using automated bayesian regularization neural ne...
Extraction of qrs complexes using automated bayesian regularization neural ne...Extraction of qrs complexes using automated bayesian regularization neural ne...
Extraction of qrs complexes using automated bayesian regularization neural ne...iaemedu
 
Analysis of intelligent system design by neuro adaptive control
Analysis of intelligent system design by neuro adaptive controlAnalysis of intelligent system design by neuro adaptive control
Analysis of intelligent system design by neuro adaptive controliaemedu
 
Literature review of facial modeling and animation techniques
Literature review of facial modeling and animation techniquesLiterature review of facial modeling and animation techniques
Literature review of facial modeling and animation techniquesiaemedu
 
Finite element analysis and experimental investigations
Finite element analysis and experimental investigationsFinite element analysis and experimental investigations
Finite element analysis and experimental investigationsiaemedu
 
Optimal placement of custom power devices
Optimal placement of custom power devicesOptimal placement of custom power devices
Optimal placement of custom power devicesiaemedu
 
Relevance vector machine based prediction of mrrand sr for electro chemical m...
Relevance vector machine based prediction of mrrand sr for electro chemical m...Relevance vector machine based prediction of mrrand sr for electro chemical m...
Relevance vector machine based prediction of mrrand sr for electro chemical m...iaemedu
 
Supersonic particle deposition as potential corrosion treatment method for he...
Supersonic particle deposition as potential corrosion treatment method for he...Supersonic particle deposition as potential corrosion treatment method for he...
Supersonic particle deposition as potential corrosion treatment method for he...iaemedu
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animationiaemedu
 

En vedette (9)

Speech processing strategies for cochlear prostheses the past, present and fu...
Speech processing strategies for cochlear prostheses the past, present and fu...Speech processing strategies for cochlear prostheses the past, present and fu...
Speech processing strategies for cochlear prostheses the past, present and fu...
 
Extraction of qrs complexes using automated bayesian regularization neural ne...
Extraction of qrs complexes using automated bayesian regularization neural ne...Extraction of qrs complexes using automated bayesian regularization neural ne...
Extraction of qrs complexes using automated bayesian regularization neural ne...
 
Analysis of intelligent system design by neuro adaptive control
Analysis of intelligent system design by neuro adaptive controlAnalysis of intelligent system design by neuro adaptive control
Analysis of intelligent system design by neuro adaptive control
 
Literature review of facial modeling and animation techniques
Literature review of facial modeling and animation techniquesLiterature review of facial modeling and animation techniques
Literature review of facial modeling and animation techniques
 
Finite element analysis and experimental investigations
Finite element analysis and experimental investigationsFinite element analysis and experimental investigations
Finite element analysis and experimental investigations
 
Optimal placement of custom power devices
Optimal placement of custom power devicesOptimal placement of custom power devices
Optimal placement of custom power devices
 
Relevance vector machine based prediction of mrrand sr for electro chemical m...
Relevance vector machine based prediction of mrrand sr for electro chemical m...Relevance vector machine based prediction of mrrand sr for electro chemical m...
Relevance vector machine based prediction of mrrand sr for electro chemical m...
 
Supersonic particle deposition as potential corrosion treatment method for he...
Supersonic particle deposition as potential corrosion treatment method for he...Supersonic particle deposition as potential corrosion treatment method for he...
Supersonic particle deposition as potential corrosion treatment method for he...
 
Facial expression using 3 d animation
Facial expression using 3 d animationFacial expression using 3 d animation
Facial expression using 3 d animation
 

Similaire à Art of software defect association & correction using association

Comparative performance analysis
Comparative performance analysisComparative performance analysis
Comparative performance analysiscsandit
 
Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated ProcessIRJET Journal
 
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...IOSR Journals
 
IRJET- A Novel Approach on Computation Intelligence Technique for Softwar...
IRJET-  	  A Novel Approach on Computation Intelligence Technique for Softwar...IRJET-  	  A Novel Approach on Computation Intelligence Technique for Softwar...
IRJET- A Novel Approach on Computation Intelligence Technique for Softwar...IRJET Journal
 
A simplified predictive framework for cost evaluation to fault assessment usi...
A simplified predictive framework for cost evaluation to fault assessment usi...A simplified predictive framework for cost evaluation to fault assessment usi...
A simplified predictive framework for cost evaluation to fault assessment usi...IJECEIAES
 
Abstract.doc
Abstract.docAbstract.doc
Abstract.docbutest
 
Fine–grained analysis and profiling of software bugs to facilitate waste iden...
Fine–grained analysis and profiling of software bugs to facilitate waste iden...Fine–grained analysis and profiling of software bugs to facilitate waste iden...
Fine–grained analysis and profiling of software bugs to facilitate waste iden...eSAT Publishing House
 
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageSurvey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageIRJET Journal
 
A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...
A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...
A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...IJCI JOURNAL
 
Software Defect Trend Forecasting In Open Source Projects using A Univariate ...
Software Defect Trend Forecasting In Open Source Projects using A Univariate ...Software Defect Trend Forecasting In Open Source Projects using A Univariate ...
Software Defect Trend Forecasting In Open Source Projects using A Univariate ...CSCJournals
 
survey on analysing the crash reports of software applications
survey on analysing the crash reports of software applicationssurvey on analysing the crash reports of software applications
survey on analysing the crash reports of software applicationsIRJET Journal
 
Determination of Software Release Instant of Three-Tier Client Server Softwar...
Determination of Software Release Instant of Three-Tier Client Server Softwar...Determination of Software Release Instant of Three-Tier Client Server Softwar...
Determination of Software Release Instant of Three-Tier Client Server Softwar...Waqas Tariq
 
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...Editor IJCATR
 
ENSEMBLE REGRESSION MODELS FOR SOFTWARE DEVELOPMENT EFFORT ESTIMATION: A COMP...
ENSEMBLE REGRESSION MODELS FOR SOFTWARE DEVELOPMENT EFFORT ESTIMATION: A COMP...ENSEMBLE REGRESSION MODELS FOR SOFTWARE DEVELOPMENT EFFORT ESTIMATION: A COMP...
ENSEMBLE REGRESSION MODELS FOR SOFTWARE DEVELOPMENT EFFORT ESTIMATION: A COMP...ijseajournal
 
Towards effective bug triage with software
Towards effective bug triage with softwareTowards effective bug triage with software
Towards effective bug triage with softwareNexgen Technology
 

Similaire à Art of software defect association & correction using association (20)

Comparative performance analysis
Comparative performance analysisComparative performance analysis
Comparative performance analysis
 
Bug Triage: An Automated Process
Bug Triage: An Automated ProcessBug Triage: An Automated Process
Bug Triage: An Automated Process
 
ONE HIDDEN LAYER ANFIS MODEL FOR OOS DEVELOPMENT EFFORT ESTIMATION
ONE HIDDEN LAYER ANFIS MODEL FOR OOS DEVELOPMENT EFFORT ESTIMATIONONE HIDDEN LAYER ANFIS MODEL FOR OOS DEVELOPMENT EFFORT ESTIMATION
ONE HIDDEN LAYER ANFIS MODEL FOR OOS DEVELOPMENT EFFORT ESTIMATION
 
F017652530
F017652530F017652530
F017652530
 
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
Using Fuzzy Clustering and Software Metrics to Predict Faults in large Indust...
 
IRJET- A Novel Approach on Computation Intelligence Technique for Softwar...
IRJET-  	  A Novel Approach on Computation Intelligence Technique for Softwar...IRJET-  	  A Novel Approach on Computation Intelligence Technique for Softwar...
IRJET- A Novel Approach on Computation Intelligence Technique for Softwar...
 
50320130403010
5032013040301050320130403010
50320130403010
 
A simplified predictive framework for cost evaluation to fault assessment usi...
A simplified predictive framework for cost evaluation to fault assessment usi...A simplified predictive framework for cost evaluation to fault assessment usi...
A simplified predictive framework for cost evaluation to fault assessment usi...
 
Abstract.doc
Abstract.docAbstract.doc
Abstract.doc
 
IJET-V2I6P28
IJET-V2I6P28IJET-V2I6P28
IJET-V2I6P28
 
E018132735
E018132735E018132735
E018132735
 
Fine–grained analysis and profiling of software bugs to facilitate waste iden...
Fine–grained analysis and profiling of software bugs to facilitate waste iden...Fine–grained analysis and profiling of software bugs to facilitate waste iden...
Fine–grained analysis and profiling of software bugs to facilitate waste iden...
 
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug TriageSurvey on Software Data Reduction Techniques Accomplishing Bug Triage
Survey on Software Data Reduction Techniques Accomplishing Bug Triage
 
A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...
A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...
A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...
 
Software Defect Trend Forecasting In Open Source Projects using A Univariate ...
Software Defect Trend Forecasting In Open Source Projects using A Univariate ...Software Defect Trend Forecasting In Open Source Projects using A Univariate ...
Software Defect Trend Forecasting In Open Source Projects using A Univariate ...
 
survey on analysing the crash reports of software applications
survey on analysing the crash reports of software applicationssurvey on analysing the crash reports of software applications
survey on analysing the crash reports of software applications
 
Determination of Software Release Instant of Three-Tier Client Server Softwar...
Determination of Software Release Instant of Three-Tier Client Server Softwar...Determination of Software Release Instant of Three-Tier Client Server Softwar...
Determination of Software Release Instant of Three-Tier Client Server Softwar...
 
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...
 
ENSEMBLE REGRESSION MODELS FOR SOFTWARE DEVELOPMENT EFFORT ESTIMATION: A COMP...
ENSEMBLE REGRESSION MODELS FOR SOFTWARE DEVELOPMENT EFFORT ESTIMATION: A COMP...ENSEMBLE REGRESSION MODELS FOR SOFTWARE DEVELOPMENT EFFORT ESTIMATION: A COMP...
ENSEMBLE REGRESSION MODELS FOR SOFTWARE DEVELOPMENT EFFORT ESTIMATION: A COMP...
 
Towards effective bug triage with software
Towards effective bug triage with softwareTowards effective bug triage with software
Towards effective bug triage with software
 

Plus de iaemedu

Tech transfer making it as a risk free approach in pharmaceutical and biotech in
Tech transfer making it as a risk free approach in pharmaceutical and biotech inTech transfer making it as a risk free approach in pharmaceutical and biotech in
Tech transfer making it as a risk free approach in pharmaceutical and biotech iniaemedu
 
Integration of feature sets with machine learning techniques
Integration of feature sets with machine learning techniquesIntegration of feature sets with machine learning techniques
Integration of feature sets with machine learning techniquesiaemedu
 
Effective broadcasting in mobile ad hoc networks using grid
Effective broadcasting in mobile ad hoc networks using gridEffective broadcasting in mobile ad hoc networks using grid
Effective broadcasting in mobile ad hoc networks using gridiaemedu
 
Effect of scenario environment on the performance of mane ts routing
Effect of scenario environment on the performance of mane ts routingEffect of scenario environment on the performance of mane ts routing
Effect of scenario environment on the performance of mane ts routingiaemedu
 
Adaptive job scheduling with load balancing for workflow application
Adaptive job scheduling with load balancing for workflow applicationAdaptive job scheduling with load balancing for workflow application
Adaptive job scheduling with load balancing for workflow applicationiaemedu
 
Survey on transaction reordering
Survey on transaction reorderingSurvey on transaction reordering
Survey on transaction reorderingiaemedu
 
Semantic web services and its challenges
Semantic web services and its challengesSemantic web services and its challenges
Semantic web services and its challengesiaemedu
 
Website based patent information searching mechanism
Website based patent information searching mechanismWebsite based patent information searching mechanism
Website based patent information searching mechanismiaemedu
 
Revisiting the experiment on detecting of replay and message modification
Revisiting the experiment on detecting of replay and message modificationRevisiting the experiment on detecting of replay and message modification
Revisiting the experiment on detecting of replay and message modificationiaemedu
 
Prediction of customer behavior using cma
Prediction of customer behavior using cmaPrediction of customer behavior using cma
Prediction of customer behavior using cmaiaemedu
 
Performance analysis of manet routing protocol in presence
Performance analysis of manet routing protocol in presencePerformance analysis of manet routing protocol in presence
Performance analysis of manet routing protocol in presenceiaemedu
 
Performance measurement of different requirements engineering
Performance measurement of different requirements engineeringPerformance measurement of different requirements engineering
Performance measurement of different requirements engineeringiaemedu
 
Mobile safety systems for automobiles
Mobile safety systems for automobilesMobile safety systems for automobiles
Mobile safety systems for automobilesiaemedu
 
Efficient text compression using special character replacement
Efficient text compression using special character replacementEfficient text compression using special character replacement
Efficient text compression using special character replacementiaemedu
 
Agile programming a new approach
Agile programming a new approachAgile programming a new approach
Agile programming a new approachiaemedu
 
Adaptive load balancing techniques in global scale grid environment
Adaptive load balancing techniques in global scale grid environmentAdaptive load balancing techniques in global scale grid environment
Adaptive load balancing techniques in global scale grid environmentiaemedu
 
A survey on the performance of job scheduling in workflow application
A survey on the performance of job scheduling in workflow applicationA survey on the performance of job scheduling in workflow application
A survey on the performance of job scheduling in workflow applicationiaemedu
 
A survey of mitigating routing misbehavior in mobile ad hoc networks
A survey of mitigating routing misbehavior in mobile ad hoc networksA survey of mitigating routing misbehavior in mobile ad hoc networks
A survey of mitigating routing misbehavior in mobile ad hoc networksiaemedu
 
A novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classifyA novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classifyiaemedu
 
A self recovery approach using halftone images for medical imagery
A self recovery approach using halftone images for medical imageryA self recovery approach using halftone images for medical imagery
A self recovery approach using halftone images for medical imageryiaemedu
 

Plus de iaemedu (20)

Tech transfer making it as a risk free approach in pharmaceutical and biotech in
Tech transfer making it as a risk free approach in pharmaceutical and biotech inTech transfer making it as a risk free approach in pharmaceutical and biotech in
Tech transfer making it as a risk free approach in pharmaceutical and biotech in
 
Integration of feature sets with machine learning techniques
Integration of feature sets with machine learning techniquesIntegration of feature sets with machine learning techniques
Integration of feature sets with machine learning techniques
 
Effective broadcasting in mobile ad hoc networks using grid
Effective broadcasting in mobile ad hoc networks using gridEffective broadcasting in mobile ad hoc networks using grid
Effective broadcasting in mobile ad hoc networks using grid
 
Effect of scenario environment on the performance of mane ts routing
Effect of scenario environment on the performance of mane ts routingEffect of scenario environment on the performance of mane ts routing
Effect of scenario environment on the performance of mane ts routing
 
Adaptive job scheduling with load balancing for workflow application
Adaptive job scheduling with load balancing for workflow applicationAdaptive job scheduling with load balancing for workflow application
Adaptive job scheduling with load balancing for workflow application
 
Survey on transaction reordering
Survey on transaction reorderingSurvey on transaction reordering
Survey on transaction reordering
 
Semantic web services and its challenges
Semantic web services and its challengesSemantic web services and its challenges
Semantic web services and its challenges
 
Website based patent information searching mechanism
Website based patent information searching mechanismWebsite based patent information searching mechanism
Website based patent information searching mechanism
 
Revisiting the experiment on detecting of replay and message modification
Revisiting the experiment on detecting of replay and message modificationRevisiting the experiment on detecting of replay and message modification
Revisiting the experiment on detecting of replay and message modification
 
Prediction of customer behavior using cma
Prediction of customer behavior using cmaPrediction of customer behavior using cma
Prediction of customer behavior using cma
 
Performance analysis of manet routing protocol in presence
Performance analysis of manet routing protocol in presencePerformance analysis of manet routing protocol in presence
Performance analysis of manet routing protocol in presence
 
Performance measurement of different requirements engineering
Performance measurement of different requirements engineeringPerformance measurement of different requirements engineering
Performance measurement of different requirements engineering
 
Mobile safety systems for automobiles
Mobile safety systems for automobilesMobile safety systems for automobiles
Mobile safety systems for automobiles
 
Efficient text compression using special character replacement
Efficient text compression using special character replacementEfficient text compression using special character replacement
Efficient text compression using special character replacement
 
Agile programming a new approach
Agile programming a new approachAgile programming a new approach
Agile programming a new approach
 
Adaptive load balancing techniques in global scale grid environment
Adaptive load balancing techniques in global scale grid environmentAdaptive load balancing techniques in global scale grid environment
Adaptive load balancing techniques in global scale grid environment
 
A survey on the performance of job scheduling in workflow application
A survey on the performance of job scheduling in workflow applicationA survey on the performance of job scheduling in workflow application
A survey on the performance of job scheduling in workflow application
 
A survey of mitigating routing misbehavior in mobile ad hoc networks
A survey of mitigating routing misbehavior in mobile ad hoc networksA survey of mitigating routing misbehavior in mobile ad hoc networks
A survey of mitigating routing misbehavior in mobile ad hoc networks
 
A novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classifyA novel approach for satellite imagery storage by classify
A novel approach for satellite imagery storage by classify
 
A self recovery approach using halftone images for medical imagery
A self recovery approach using halftone images for medical imageryA self recovery approach using halftone images for medical imagery
A self recovery approach using halftone images for medical imagery
 

Art of software defect association & correction using association

  • 1. International Journal of Computer and Technology (IJCET), ISSN 0976 – 6367(Print), International Journal of Computer Engineering Engineering ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME and Technology (IJCET), ISSN 0976 – 6367(Print) ISSN 0976 – 6375(Online) Volume 1 IJCET Number 1, May - June (2010), pp. 261-271 ©IAEME © IAEME, http://www.iaeme.com/ijcet.html ART OF SOFTWARE DEFECT ASSOCIATION & CORRECTION USING ASSOCIATION RULE MINING Nilamadhab Mishra, MCA, M.PHIL (C S), M.Tech (CSE) Asst.prof, Krupajal Group of Institution, Orissa E-mail : nilamadhab.mishra@rediffmail.com ABSTRACT: Much current software defect prediction work focuses on the number of defects remaining in a software system. In this paper, I focus about association rule mining based methods to predict defect associations and defect correction effort. This is to help developers detect software defects and assist project managers in allocating testing resources more effectively. Software process improvement method is based on the analysis of defect data. I applied the proposed methods to the SEL defect data consisting of more than 200 projects over more than 15 years. The method is classification of software defects using attributes which relate defects to specific process activities. The results show that, for defect association prediction, the accuracy is very high and the false-negative rate is very low. The comparison of defect correction effort prediction method with other types of methods—PART, C4.5, and Naı¨ve Bays show that accuracy can be improved. I also evaluate the impact of support and confidence levels on prediction accuracy, false negative rate, false-positive rate, and the number of rules. Thus higher support and confidence levels may not result in higher prediction accuracy and a sufficient number of rules must be a precondition for high prediction accuracy. Key words: Defect association, defect correction effort, defect isolation effort, defect association prediction, Software defect prediction 1. INTRODUCTION The success of a software system depends not only on cost and schedule, but also on quality. Among many software quality characteristics, residual defects have become the de facto industry standard [1, 2]. Therefore, the prediction of software defects, i.e., 261
  • 2. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME deviations from specifications or expectations which might lead to failures in operation, has been an important research topic in the field of software engineering for more than 30 years. Clearly, they are a proxy for reliability, but, unfortunately, reliability is extremely difficult to assess prior to full deployment. Current defect prediction work focuses on estimating the number of defects remaining in software systems with code metrics, inspection data, and process-quality data by statistical approaches, capture-recapture (CR) models, and detection profile methods (DPM)[4 ]. The prediction result, which is the number of defects remaining in a software system, can be used as an important measure for the software developer, and can be used to control the software process and gauge the likely delivered quality of a software system. In contrast, propose that the defects found during production are a manifestation of process deficiencies, so they present a case study of the use of a defect based method for software in-process improvement. In particular, they use an attribute-focusing (AF) method to discover associations among defect attributes such as defect type, source, and phase introduced, phase found, component, impact, etc [9]. By finding out the event that could have led to the associations, they identify a process problem and implement a corrective action. This can lead a project team to improve its process during development. I restrict my work to the predictions of defect (type) associations and corresponding defect correction effort. 1. For the given defect(s), what other defect(s) may co-occur? 2. In order to correct the defect(s), how much effort will be consumed? I use defect type data to predict software defect associations that are the relations among different defect types such as: If defects a and b occur, then defect c also will occur. This is formally written as a ^ b => c. The defect Associations can be used for three purposes [3, 4]. First, find as many related defects as possible to the detected defect(s) and, consequently, make more-effective corrections to the software. For example, consider the situation where we have classes of defect a, b, and c and suppose the rule a ^ b => c has been obtained from a historical data set, and the defects of class a and b have been detected occurring together, but no defect of class c has yet been discovered. The rule indicates that a defect of class c is likely to have occurred as well and indicates that we 262
  • 3. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME should check the corresponding software artifact to see whether or not such a defect really exists. If the result is positive, the search can be continued, if rule a ^ b ^ c =>d holds as well, we can do the same thing for defect d. Second, help evaluate reviewers’ results during an inspection. For example, if rule a ^ b => c holds but a reviewer has only found defects a and b, it is possible that he missed defect c. Thus, a recommendation might be that his/her work should be reinserted for completeness. Third, to assist managers in improving the software process through analysis of the reasons some defects Frequently occur together. If the analysis leads to the identification of a process problem, managers have to come up with a corrective action. At the same time, for each of the associated defects, we also predict the likely effort required to isolate and correct it. This can be used to help project managers improve control of project schedules. Both of our defect association prediction and defect correction effort prediction methods are based on the association rule mining method which was first explored by Agrawal. Association rule mining aims to discover the patterns of co-occurrences of the attributes in a database [5, 6]. However, it must be stressed that associations do not imply causality. An association rule is an expression A-- >C, where A (Antecedent) and C (Consequent) are sets of items. The meaning of such rules is quite intuitive: Given a database D of transactions, where each transaction T 2D is a set of items, A-->C expresses that whenever a transaction T contains A, then T also contains C with a specified confidence. The rule confidence is defined as the percentage of transactions containing C in addition to A with regard to the overall number of transactions containing A. The idea of mining association rules originates from the analysis of market-basket data where rules like “customers who buy products p1 and p2 will also buy product p3 with probability c percent” are extracted. Their direct applicability to business problems together with their inherent understandability make association rule mining a popular data mining method. 263
  • 4. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME 2. PAPER CONTENTS General method: The objective of the study is to discover software defect associations from historical software engineering data sets, and help determine whether or not a defect(s) is accompanied by other defect(s). If so, we attempt to determine what these defects are and how much effort might be expected to be used when we correct them. Finally, we aim to help detect software defects and effectively improve software control. For this purpose, first, we preprocessed the NASA SEL data set (see the following section for details), and obtained three data sets: the defect data set, the defect isolation effort data set, and the defect correction effort data set. Then, for each of these data sets, we randomly extracted five pairs of training and test data sets as the basis of the research. After that, we used association rule mining based methods to discover defect associations and the patterns between a defect and the corresponding defect isolation/correction effort. Finally, we predicted the defect(s) attached to the given defect(s) and the effort used to correct each of these defects. We also compared the results with alternative methods where applicable. Data Source and Data Extraction: The data we used is SEL Data which is a subset of the online database created by the NASA/GSFC Software Engineering Laboratory (SEL) for storage and retrieval of software engineering data for NASA Goddard Space Flight Center. The subset includes defect data of more than 200 projects completed over more than 15 years. The SEL Data is a database consisting of 15 tables that provide data on the projects’ software characteristics (overall and at a component level), changes and errors during all phases of development, and effort and computer resources used. In the SEL Data, the defects are divided into six types, Table 1 contains the details. In addition, the effort used to correct defects falls into four categories: One Hour or Less, One Hour to One Day, One Day to Three Days, and More Than Three Days. Analysis approach: I use the five-fold cross-validation method as the overall analysis approach. That is, for each D of the defect data set, the defect isolation effort data set, and the defect correction effort data set, the inducer is trained and tested a total of five times. I use the 264
  • 5. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME association rule mining method to learn rules from the training data sets. For defect association prediction, the rule learning is straightforward, while for defect correction effort prediction, it is more complicated because the consequent of a rule has to be defect correction effort [7, 8]. Considering the target of association rule mining is not predetermined and classification rule mining has only one predetermined target, the class, I integrate these two techniques to learn defect correction effort prediction rules by focusing on a special subset of association rules whose consequents are restricted to the special attribute, the effort. Once we obtain the rules, we rank them, and use them to predict the defect associations and defect correction effort with the corresponding test data sets. The predictions of defect associations and defect correction effort are both based on the length-first (in terms of the number of items in a rule) strategy. Association Rule Discovery: Both positive and negative association rules are to be discovered by the project. First, the frequent item sets of potential interest and infrequent item sets of potential interest are generated in a database. Then Extract positive rules of the form A -->B, and negative rules of the forms A-->¬B, ¬A-->B, and ¬A ¬B. Rule ranking Strategy: Before prediction, we rank the discovered rules according to the length-first strategy. The length-first strategy was used for two reasons. First, for the defect association prediction, the length-first strategy enables us to find out as many defects as possible that coincide with known defect(s), thus preventing errors due to incomplete discovery of defect associations. Second, for the defect correction effort prediction, the length-first strategy enables us to obtain more-accurate rules, thus improving the effort prediction accuracy. Specifically, the length-first rule-ranking strategy is as follows: 1. Rank rules according to their length. The longer the rules, the higher the priority. 2. If two rules have the same length, rank them according to their confidence values. The greater the confidence values, the higher the priority. The more-confident rules have more predictive power in terms of accuracy; thus, they should have higher priority. 265
  • 6. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME 3. If two rules have the same confidence values, rank them according to their support values. The higher the support values, the higher the priority. The rules with higher support value are more statistically Significant, so they should have higher priority. 4. If two rules have the same support value, rank them in alphabetical order. 2.1 GRAPHS, TABLES, AND PHOTOGRAPHS Defect Types: Here we are concentrated on some specified defects that are the most frequent defects that may occur in the Sel data. Defect Types: 266
  • 7. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME Rule Ranking Algorithm: Defect Isolation and Correction Effort Prediction: The constraint-based association rules mining method is used for the purpose of defect isolation and correction prediction [10]. The steps are 1. Compute the frequent item sets that occur together in the training data set at least as frequently as a predetermined min: Supp. The item sets mined must also contain the effort labels 2. Generate association rules from the frequent item sets, where the consequent of the rules is the effort. In addition to the min: Supp threshold, these rules must also satisfy a minimum confidence. 267
  • 8. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME Effort Prediction Rules: Defect association prediction procedure. We use the association rule mining method to learn rules from the training data sets. For defect association prediction, the rule learning is straightforward, while for defect correction effort prediction, it is more complicated because the consequent of a 268
  • 9. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME rule has to be defect correction effort. Considering the target of association rule mining is not predetermined and classification rule mining has only one predetermined target, the class, we integrate these two techniques to learn defect correction effort prediction rules by focusing on a special subset of association rules whose consequents are restricted to the special attribute, the effort. Once we obtain the rules, we rank them, and use them to predict the defect associations and defect correction effort with the corresponding test data sets. The predictions of defect associations and defect correction effort are both based on the length-first strategy. There is no work on software defect association prediction and there is no other method that can be used for this purpose. Therefore, I am unable to directly compare our defect association prediction method with other studies. As the defect correction effort is represented in the form of categorical values and there are some attributes to characterize it, it can be viewed as a classification problem. This allows us to compare our defect correction prediction method with three different types of methods. These methods are the Bayesian rule of conditional probability-based method, Naı¨ve Bayes [, the well- known trees-based method, C4.5, and the simple and effective rules-based method, PART. 3. CONCLUSION In this paper, I have given an application of association rule mining to predict software defect associations and defect correction effort with SEL defect data. This is important in order to help developers detect software defects and project managers improve software control and allocate their testing resources effectively. The ideas have been tested using the NASA SEL defect data set. From this, I extracted defect data and the corresponding defect isolation and correction effort data. I have also compared the defect correction effort prediction method with three other types of machine learning methods, namely, PART, C4.5, and Naı¨ve Bayes. The results show that for defect isolation effort, our accuracy is higher than for the other three methods b. Likewise, for defect correction effort prediction, the accuracy is higher than the other three methods. 269
  • 10. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME I also have explored the impact of support and confidence levels on prediction accuracy, false negative rate, false positive rate, and the number of rules as well. I found that higher support and confidence levels may not result in higher prediction accuracy, and a sufficient number of rules is a precondition for high prediction accuracy. While we do not wish to draw strong conclusions from a single data set study, i believe that my results suggest that association rule mining may be an attractive technique to the software engineering community due to its relative simplicity, transparency, and seeming effectiveness in constructing prediction systems. REFERENCES: [1]. Inderpal S. Bhandari, M. J. Halliday, E. Tarver, D. Brown, J. Chaar, and R. Chillarence. A Case Study of Software Process Improvement during Development. IEEE Trans. on Soft. Eng., 19(12), pp. 1157-1170, December 1993. [2] Inderpal S. Bhandari, M. J. Halliday, J. Chaar, R. Chillarence, K. Jones, J. S. Atkinson, C. Lepori-Costello, P. Y. Jasper, E. D. Tarver, C. C. Lewis, and M. Yonezawa. In-Process Improvement through Defect Data Interpretation. IBM Syst. Journal, 33(1), pp. 182-214, 1994. [3] Xindong Wu, Chengqi Zhang, Shichao Zhang,” Efficient Mining of both positive and negative association rules”, ACM Transactions on Information Systems, Vol. 22, No. 3, July 2004, Pages 381–405. [4] Qinbao Song, Martin Shepperd, Michelle Cartwright, Carolyn Mair, "Software Defect Association Mining and Defect Correction Effort Prediction," IEEE Transactions on Software Engineering, vol. 32, no. 2, pp. 69-82, Feb., 2006. [5] I.S. Bhandari, M.J. Halliday, J. Chaar, R. Chillarenge, K. Jones, J.S. Atkinson, C. Lepori-Costello, P.Y. Jasper, E.D. Tarver, C.C. Lewis, and M. Yonezawa, “In Process Improvement through Defect Data Interpretation,” IBM System J., vol. 33, no. 1, pp. 182-214, 1994. [6] L.C. Briand, K. El-Emam, B.G. Freimut, and O. Laitenberger, “A Comprehensive Evaluation of Capture-Recapture Models forEstimating Software Defect Content,” IEEE Trans. Software Eng., vol. 26, no. 6, pp. 518-540, June 2000. 270
  • 11. International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online) Volume 1, Number 1, May - June (2010), © IAEME [7] T. Compton and C. Withrow, “Prediction and Control of Ada Software Defects,” J. Systems and Software, vol. 12, pp. 199-207, 1990. [8] F. Brito e Abreu, Melo, W., "Evaluating the Impact of Object-Oriented Design on Software Quality", Proceedings of Third International Software Metrics Symposium, pp. 90- 99, 1996. [9] S. R. Chidamber and C. F. Kemerer, "A Metrics Suite for Object Oriented Design", IEEE Transactions on Software Engineering, 20(6), pp. 476-493, 1994. [10] D. ubrani , Murphy, G.C., "Hipikat: recommending pertinent software development artifacts", Proceedings of International Conference on Software Engineering, pp. 408- 418, 2003. 271