Effort-Aware Defect Prediction (EADP) techniques sort software modules by defect density, aiming to find more bugs when a fixed number of Lines of Code (LOC) is tested. Existing EADP methods ignore the number of modules that must be inspected, which results in higher testing cost. We therefore propose MOOAC, a multi-objective effort-aware defect prediction approach based on NSGA-II, which aims to maximize the Proportion of found Bugs (PofB@20%) and minimize the Proportion of Modules Inspected (PMI@20%) when inspecting the top 20% of LOC. MOOAC first trains a random forest classification model. It then builds a logistic regression model and uses the NSGA-II algorithm to generate the model's coefficient vector by simultaneously maximizing PofB@20% and minimizing PMI@20%. In the prediction phase, MOOAC first applies the trained random forest classifier to decide whether each module is defective. The modules predicted defective are inspected first, ordered by the ratio of the defect probability predicted by the logistic regression model to LOC, which lets testers find more bugs while testing as few LOC as possible. If testing budget remains, the modules predicted clean are inspected afterwards; deferring them in this way reduces the Initial False Alarms (IFA). The results show that MOOAC achieves the best overall performance on PofB@20% and PMI@20%. In other words, MOOAC enables testers to identify more bugs per 1% of modules inspected.
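The inspection order and the two objectives described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `rank_modules` and `pofb_pmi`, the toy data, and two conventions are assumptions made here for illustration. First, the modules predicted clean are ordered by the same probability-to-LOC ratio, since the abstract does not specify their order. Second, a module counts as inspected only if it fits entirely within the 20% LOC budget.

```python
def rank_modules(pred_defective, prob, loc):
    """Return module indices in inspection order (hypothetical helper).

    Modules predicted defective come first, then those predicted clean.
    Within each group, modules are sorted by predicted probability / LOC,
    highest first. Ordering the clean group this way is an assumption.
    """
    def by_ratio(indices):
        return sorted(indices, key=lambda i: prob[i] / loc[i], reverse=True)

    defective = [i for i, flag in enumerate(pred_defective) if flag]
    clean = [i for i, flag in enumerate(pred_defective) if not flag]
    return by_ratio(defective) + by_ratio(clean)


def pofb_pmi(order, loc, bugs, budget=0.20):
    """Compute (PofB, PMI) for a given inspection order and LOC budget.

    Assumed cutoff rule: stop before the first module that would push
    the inspected LOC above budget * total LOC.
    """
    limit = budget * sum(loc)
    spent = found = inspected = 0
    for i in order:
        if spent + loc[i] > limit:
            break
        spent += loc[i]
        found += bugs[i]
        inspected += 1
    return found / sum(bugs), inspected / len(loc)


# Toy data for five modules (invented for illustration only).
loc = [100, 50, 200, 150, 500]                      # lines of code
bugs = [3, 1, 0, 1, 2]                              # actual bug counts
pred_defective = [True, True, False, True, False]   # classifier output
prob = [0.9, 0.6, 0.2, 0.3, 0.4]                    # regression output

order = rank_modules(pred_defective, prob, loc)
pofb, pmi = pofb_pmi(order, loc, bugs)
print(order, round(pofb, 3), pmi)
```

In this toy case the 20% budget is 200 LOC, so only modules 1 and 0 are inspected: 4 of the 7 bugs are found while inspecting 2 of the 5 modules. In MOOAC, NSGA-II would search for logistic regression coefficients that trade off these two values; that search is not shown here.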