Abstract

In defect prediction, a high false-positive rate (FPR) caused by class imbalance not only increases the workload of testing and development but also incurs unnecessary cost. Many defect models that address class imbalance have been proposed to improve the accuracy of defect prediction, but their ability to reduce FPR is unclear. To address these problems, we first propose a BayesNet with adjustable weights, called WBN, to reduce the FPR in software defect prediction; the algorithm is independent of data preprocessing techniques. WBN works by changing the sampling probability of misclassified instances when training the defect model, so that the BayesNet model focuses more on false-alarm instances. We then investigate the FPR of five mainstream defect models for handling class imbalance and use them as comparison models to test the validity of our method. The experimental results on eight open-source projects show that a) our WBN, in in-version defect prediction (IVDP) and cross-version defect prediction (CVDP), effectively reduces FPR with means of 0.384 and 0.322, respectively; b) compared with improved subclass discriminant analysis (ISDA), which has the lowest FPR among the control models, our WBN not only reduces the FPR but also maintains recall, with a mean of 0.797, whereas ISDA does not, with an average recall of only 0.397; c) our WBN, in CVDP, not only reduces FPR but is also significantly superior to the five control defect models and the baseline. We also find that the difference in class imbalance between the test set and the training set affects CVDP performance, and we recommend that practitioners use a dedicated selection technique to choose the most suitable dataset for CVDP from the defect data of historical versions.
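
The reweighting idea described above can be illustrated with a minimal Python sketch. It is not the authors' WBN implementation: the function names, the multiplicative `boost` factor, and the normalization step are assumptions made for illustration, and training of the BayesNet itself is omitted. The sketch only shows how FPR and recall are computed from binary predictions (1 = defective) and how the sampling probability of false-alarm instances could be raised before the next round of training.

```python
import random


def fpr_and_recall(y_true, y_pred):
    """Return (false-positive rate, recall) for binary labels, 1 = defective."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return fpr, recall


def reweight_false_alarms(weights, y_true, y_pred, boost=2.0):
    """Raise the weight of each false alarm, then normalize to sum to 1.

    `boost` is a hypothetical factor, not a value taken from the paper.
    """
    raised = [
        w * boost if (t == 0 and p == 1) else w
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    total = sum(raised)
    return [w / total for w in raised]


def resample(instances, weights, rng):
    """Draw a training sample of the same size, with replacement, by weight."""
    return rng.choices(instances, weights=weights, k=len(instances))


# Toy data: five instances, one false alarm at index 2.
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0]
fpr, recall = fpr_and_recall(y_true, y_pred)

uniform = [1 / len(y_true)] * len(y_true)
new_weights = reweight_false_alarms(uniform, y_true, y_pred)
sample = resample(list(range(len(y_true))), new_weights, random.Random(0))
```

In this toy case the false alarm at index 2 ends up with twice the sampling probability of every other instance, so a model retrained on the resampled data sees it more often.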

