Abstract

Often, the prior defect data of the same project is unavailable; researchers thought whether the defect data of the other projects can be used for prediction. This made cross project defect prediction an open research issue. In this approach, the training data often suffers from class imbalance problem. Here, the work is directed on homogeneous cross-project defect prediction. A novel ensemble model that will perform in dual fold is proposed. Firstly, it will handle the class imbalance problem of the dataset. Secondly, it will perform the prediction of the target class. For handling the imbalance problem, the training dataset is divided into data frames. Each data frame will be balanced. An ensemble model using the maximum voting of all random forest classifiers is implemented. The proposed model shows better performance in comparison to the other baseline models. Wilcoxon signed rank test is performed for validation of the proposed model.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call