Empirical Investigation of Code and Process Metrics for Defect Prediction

Wenjing Han, Chung-Horng Lung, Samuel A Ajila

doi:10.1109/bigmm.2016.36

Publication Date: Apr 1, 2016

Citations: 5

Affiliation: Carleton University

Abstract

Data science is becoming increasingly important for software engineering problems. Software defect prediction is a critical area: it can help the development team allocate test resources efficiently and better understand the root causes of defects. It can also help explain why a component, or even a whole project, is failure-prone. This paper addresses binary classification, predicting whether a software component has a bug, using three widely used machine learning algorithms: Random Forest (RF), Neural Networks (NN), and Support Vector Machine (SVM). It investigates how these algorithms apply to the challenging problem of predicting defects in software components, combining code metrics and process metrics as defect indicators for the Eclipse environment and applying the three algorithms to a sample of weekly Eclipse features. Feature reduction using a General Linear Model (GLM) is also adopted to save computational time. The results confirm the predictive capability of just two features, NBD_max and Pre-defects, which yield results comparable to those obtained using all 61 features. Additionally, this paper evaluates the performance of the three algorithms; NN and RF turn out to have the best fit.
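To make the idea of GLM-based feature reduction concrete, the following is a minimal sketch and not the authors' procedure. It assumes that "feature reduction" can be approximated by fitting a one-variable logistic regression (one member of the GLM family) per metric and ranking metrics by log-likelihood. The data are synthetic and standardized, `noise_metric` is a hypothetical uninformative metric, and the helper names (`fit_logistic_1d`, `rank_features`, `make_synthetic`) are illustrative only. Only the names NBD_max and Pre-defects come from the abstract.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_1d(xs, ys, steps=500, lr=0.1):
    """Fit P(y=1) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def log_likelihood(xs, ys, w, b):
    """Bernoulli log-likelihood of the fitted one-variable model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(w * x + b), 1e-12), 1.0 - 1e-12)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def rank_features(columns, labels):
    """Return metric names sorted from best to worst univariate fit."""
    scores = {}
    for name, xs in columns.items():
        w, b = fit_logistic_1d(xs, labels)
        scores[name] = log_likelihood(xs, labels, w, b)
    return sorted(scores, key=scores.get, reverse=True)


def make_synthetic(n=300, seed=0):
    """Synthetic, standardized metrics; NOT the Eclipse data set."""
    rng = random.Random(seed)
    columns = {"NBD_max": [], "Pre-defects": [], "noise_metric": []}
    labels = []
    for _ in range(n):
        nbd, pre, noise = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # Assumption: defect-proneness depends on the two named metrics only.
        latent = 1.5 * nbd + 1.5 * pre + rng.gauss(0, 1)
        columns["NBD_max"].append(nbd)
        columns["Pre-defects"].append(pre)
        columns["noise_metric"].append(noise)
        labels.append(1 if latent > 0 else 0)
    return columns, labels


columns, labels = make_synthetic()
print(rank_features(columns, labels))
```

On data built this way, the uninformative metric should rank last, so a classifier trained only on the top-ranked metrics has less input to process. Whether such a reduced set matches the full feature set on real data is an empirical question, which is what the paper reports for the Eclipse sample.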

Full Text