Abstract

Cross-Project Defect Prediction (CPDP) transfers knowledge from source software projects to a target software project. Previous research has shown that knowledge transferred from different source projects affects the target task to different degrees. Therefore, a fundamental challenge in CPDP is how to measure the amount of knowledge transferred from each source project to the target task. This article proposes a novel CPDP method called Multi-source defect prediction with Joint Wasserstein Distance and Ensemble Learning (MJWDEL), which learns transferred weights that evaluate the importance of each source project to the target task. First, the method applies Transfer Component Analysis (TCA) and Logistic Regression (LR) to train a sub-model for each source project and the target project. Second, it designs a joint Wasserstein distance to characterize the source-target relationship and uses this distance as the basis for computing the transferred weights of the sub-models. Finally, the transferred weights are used to reweight the sub-models according to their importance in knowledge transfer to the target task. We conducted experiments on 19 software projects from the PROMISE, NASA and AEEEM datasets. Compared with several state-of-the-art CPDP methods, the proposed method substantially improves CPDP performance in terms of four evaluation indicators (F-measure, Balance, G-measure and MCC).
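
The weighting idea described above can be illustrated with a minimal sketch. It is not the MJWDEL implementation: the abstract does not give the definition of the joint Wasserstein distance or the weight formula, so this example assumes a plain empirical 1-Wasserstein distance on a single feature, and a softmax over negative distances as the weighting rule. The function names, the toy data, and the sub-model probabilities are all hypothetical.

```python
import math


def wasserstein_1d(a, b):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    Assumption: a stand-in for the paper's joint Wasserstein distance,
    which the abstract does not define.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size samples")
    # For equal-size samples, W1 is the mean absolute gap between sorted values.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def transfer_weights(distances):
    """Map distances to weights summing to 1; smaller distance, larger weight.

    Assumption: softmax of negative distances, chosen only for illustration.
    """
    scores = [math.exp(-d) for d in distances]
    total = sum(scores)
    return [s / total for s in scores]


def ensemble_probability(sub_model_probs, weights):
    """Weighted average of the defect probabilities from the sub-models."""
    return sum(w * p for w, p in zip(weights, sub_model_probs))


# Toy one-feature data, made up for illustration only.
target = [1.0, 2.0, 3.0, 4.0]
sources = {
    "source_a": [1.5, 2.5, 3.5, 4.5],  # close to the target distribution
    "source_b": [6.0, 7.0, 8.0, 9.0],  # far from the target distribution
}

distances = [wasserstein_1d(s, target) for s in sources.values()]
weights = transfer_weights(distances)

# Hypothetical defect probabilities from the two sub-models for one module.
combined = ensemble_probability([0.8, 0.2], weights)
```

In this sketch the source whose feature distribution lies closer to the target receives nearly all of the weight, so its sub-model dominates the combined prediction. In the method described above, the sub-models would be LR classifiers trained on TCA-transformed features rather than fixed probabilities.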
