Abstract

Cross-project defect prediction (CPDP) refers to constructing a prediction model that predicts the instance labels of a target project by utilising labelled data from an external project. The main challenge for CPDP methods is the distribution difference between data from different projects. Transfer learning can transfer knowledge from the source domain to the target domain with the aim of minimising the difference between the domains. However, most existing methods reduce the distribution discrepancy in the original feature space, where the features are high-dimensional and non-linear, which makes it hard to reduce the distribution distance between different projects. Moreover, previous works mainly consider either the marginal or the conditional distribution difference. In this study, the authors propose a manifold embedded distribution adaptation (MDA) approach to narrow the distribution gap in a manifold feature subspace. MDA maps source and target project data to a manifold subspace and then performs joint distribution adaptation of the conditional and marginal distributions in that subspace. To evaluate the effectiveness of MDA, the authors perform extensive experiments on 20 public projects with three indicators. The experimental results show that MDA improves the average performance, but the improvement is not statistically significant in comparison to HYDRA (one of the baselines).
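
To make the idea of a joint marginal and conditional distribution gap concrete, here is a minimal Python sketch. It is not the authors' MDA implementation: it assumes the features have already been mapped to some subspace, uses a simple linear-kernel mean discrepancy as a stand-in distance, and takes pseudo labels for the target project as given. The function names and the weighting parameter `mu` are hypothetical and do not come from the paper.

```python
import numpy as np


def mean_discrepancy(a, b):
    """Squared Euclidean distance between the feature means of two samples.

    This equals the maximum mean discrepancy under a linear kernel and is
    used here only as an illustrative distribution distance.
    """
    return float(np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2))


def joint_discrepancy(xs, ys, xt, yt_pseudo, mu=0.5):
    """Weighted sum of marginal and class-conditional discrepancies.

    xs, ys     : source features and labels (labels are known)
    xt         : target features
    yt_pseudo  : assumed pseudo labels for the target (true labels unknown)
    mu         : hypothetical weight; 0 = marginal only, 1 = conditional only
    """
    marginal = mean_discrepancy(xs, xt)

    conditional = 0.0
    for c in np.unique(ys):
        s_c = xs[ys == c]
        t_c = xt[yt_pseudo == c]
        # Skip classes that are missing on either side.
        if len(s_c) > 0 and len(t_c) > 0:
            conditional += mean_discrepancy(s_c, t_c)

    return (1 - mu) * marginal + mu * conditional


# Toy data: two features, two classes (0 = clean, 1 = defective).
xs = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
ys = np.array([0, 0, 1, 1])
xt = xs + 1.0          # target project shifted by a constant offset
yt_pseudo = ys.copy()  # assumed pseudo labels

gap_same = joint_discrepancy(xs, ys, xs, ys)
gap_shifted = joint_discrepancy(xs, ys, xt, yt_pseudo)
```

In this toy setting the gap is zero when source and target coincide and grows once the target is shifted. An adaptation method in this spirit would look for a feature transformation that makes such a joint gap small.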
