Abstract

ContextSoftware defect prediction is important for identification of defect-prone parts of a software. Defect prediction models can be developed using software metrics in combination with defect data for predicting defective classes. Various studies have been conducted to find the relationship between software metrics and defect proneness, but there are few studies that statistically determine the effectiveness of the results. ObjectiveThe main objectives of the study are (i) comparison of the machine-learning techniques using data sets obtained from popular open source software (ii) use of appropriate performance measures for measuring the performance of defect prediction models (iii) use of statistical tests for effective comparison of machine-learning techniques and (iv) validation of models over different releases of data sets. MethodIn this study we use object-oriented metrics for predicting defective classes using 18 machine-learning techniques. The proposed framework has been applied to seven application packages of well known, widely used Android operating system viz. Contact, MMS, Bluetooth, Email, Calendar, Gallery2 and Telephony. The results are validated using 10-fold and inter-release validation methods. The reliability and significance of the results are evaluated using statistical test and post-hoc analysis. ResultsThe results show that the area under the curve measure for Naïve Bayes, LogitBoost and Multilayer Perceptron is above 0.7 in most of the cases. The results also depict that the difference between the ML techniques is statistically significant. However, it is also proved that the Support Vector Machines based techniques such as Support Vector Machines and voted perceptron do not possess the predictive capability for predicting defects. ConclusionThe results confirm the predictive capability of various ML techniques for developing defect prediction models. The results also confirm the superiority of one ML technique over the other ML techniques. Thus, the software engineers can use the results obtained from this study in the early phases of the software development for identifying defect-prone classes of given software.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call