Abstract

Background: A defect prediction model is typically built using historical data from previous versions or releases of the same project. However, such historical data may not exist for newly developed projects. Alternatively, one can train a model using data obtained from external projects, an approach known as cross-project defect prediction (CPDP). In CPDP, it is still difficult to make effective use of external projects' data and to decide which particular project to use to train a model. Aim: To address this issue, we apply a bandit algorithm (BA) to CPDP in order to select the most suitable training project from a set of projects. Method: BA-based prediction iteratively reselects the project after each module is tested, based on the accuracy of the predictions. As baselines, we used simple CPDP methods, such as training a model with a randomly selected project. All models were built using logistic regression. Results: We evaluated our approach on two datasets (NASA and DAMB, with a total of 12 projects). On average, the BA-based defect prediction models achieved higher accuracy (AUC and F1 score) than the baselines. Conclusion: In this preliminary study, we demonstrate the feasibility of using BA in the context of CPDP. Our initial assessment shows that the use of BA for predicting defects in CPDP is promising and may outperform existing approaches.
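
The selection loop described under Method can be illustrated with a minimal sketch. It assumes an epsilon-greedy policy, a reward of 1 when the chosen model's prediction matches the test outcome, and an update of only the chosen arm; the abstract does not state which bandit policy or reward was used, so these are illustrative choices rather than the authors' implementation. The function name `epsilon_greedy_cpdp` and its parameters are hypothetical, and each entry in `models` stands in for a classifier (for example, a logistic regression model) already trained on one external project.

```python
import random


def epsilon_greedy_cpdp(models, modules, labels, epsilon=0.1, seed=0):
    """Pick one candidate model (arm) per module with epsilon-greedy.

    models  -- dict: project name -> callable(features) -> 0 or 1
               (assumed to be pre-trained on that external project)
    modules -- feature records, in the order the modules are tested
    labels  -- actual outcomes (1 = defective), revealed after each test
    """
    rng = random.Random(seed)
    names = list(models)
    pulls = {name: 0 for name in names}  # times each project was chosen
    hits = {name: 0 for name in names}   # correct predictions per project
    predictions = []

    for features, actual in zip(modules, labels):
        untried = [name for name in names if pulls[name] == 0]
        if untried:
            # Try every project once before comparing accuracies.
            chosen = untried[0]
        elif rng.random() < epsilon:
            # Explore: pick a project at random.
            chosen = rng.choice(names)
        else:
            # Exploit: pick the project with the best observed accuracy.
            chosen = max(names, key=lambda name: hits[name] / pulls[name])

        predicted = models[chosen](features)
        predictions.append((chosen, predicted))

        # The module is now tested, so its actual label is known
        # and the chosen project's record can be updated.
        pulls[chosen] += 1
        hits[chosen] += int(predicted == actual)

    return predictions, pulls, hits
```

In this sketch a project whose model keeps predicting correctly is selected more and more often, while a small `epsilon` leaves room to revisit the other projects. Other policies, such as UCB or Thompson sampling, would fit the same loop.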
