Abstract

Software Defect Prediction (SDP) supports the delivery of reliable software by identifying defect-prone modules before the testing stage, which allows quality assurance resources to be used efficiently and effectively. Most predictive models are trained on historical data from the same project or a comparable project. These models perform satisfactorily because their training settings are similar to those of the projects under consideration. Their limitation is that they are effective only when adequate historical data are available for training. In reality, such historical data are minimal for some projects and absent for new projects. For projects that lack historical data, defect prediction can be accomplished by training prediction models on data from different projects. This process is known as Cross-Project Defect Prediction (CPDP). Software defect datasets also suffer from class imbalance, which further degrades model performance. In this research work, the authors propose a Multi-Objective Random Forest (MO-RF) algorithm combined with a data resampling technique to minimize the probability of false alarm, maximize the probability of detection, and overcome the class imbalance problem. The study also evaluates the performance of other prediction models. The proposed method shows a percentage improvement (in terms of AUC) of 2.78 and 3.46 over MONB and MONBNN, respectively.
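
The two objectives named above, probability of detection (PD) and probability of false alarm (PF), are conventionally computed from a confusion matrix as PD = TP / (TP + FN) and PF = FP / (FP + TN). The following is a minimal sketch of that computation only, not the authors' MO-RF implementation. The function names `confusion_counts` and `pd_pf`, the encoding of defective modules as label 1, and the toy labels are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN, treating label 1 as 'defective' (an assumption)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def pd_pf(y_true, y_pred):
    """Return (PD, PF): detection rate to maximize, false-alarm rate to minimize."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    # Guard against empty classes so the ratios are always defined.
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    return pd, pf


# Hypothetical labels for eight modules: three defective, five clean.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
pd, pf = pd_pf(y_true, y_pred)
print(f"PD={pd:.3f} PF={pf:.3f}")
```

With imbalanced data such as this toy example, a model that predicts every module as clean would reach a PF of 0 but also a PD of 0, which is why the two quantities are treated as separate objectives rather than folded into plain accuracy.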
