Abstract

Heterogeneous defect prediction (HDP) transfers knowledge, namely the defect-proneness tendency of software metrics, from a source project to predict potential defects in a target project by matching metrics that have similar distributions across the two projects. However, the complex intrinsic structure of defect data makes it difficult for existing HDP models to capture and transfer the most informative software metrics, which severely hinders HDP performance. To address this issue, we propose a robust data-driven HDP model called IVKMP. First, we adopt a deep generative network, InfoGAN (Information Maximizing GANs), for data augmentation, simultaneously achieving class balance and generating sufficient defect instances. Second, we employ the multi-objective VaEA (Vector angle-based Evolutionary Algorithm) to select the smallest representative metric subset while minimizing error. Finally, we build a deep defect predictor for HDP on PCANet (Principal Component Analysis Network), a lightweight but effective deep network with binary hashing and block-wise histograms, to capture more semantically related and robust representations. We compare IVKMP with multiple state-of-the-art baseline models across 542 heterogeneous project pairs drawn from 26 software projects. Experimental results demonstrate the superiority and robustness of IVKMP.
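The metric-matching idea behind HDP can be illustrated with a minimal sketch. It is not the IVKMP pipeline: the abstract does not say which similarity measure is used, so the two-sample Kolmogorov-Smirnov statistic, the greedy pairing, the `max_gap` threshold, and the toy metric names and values are all assumptions made for illustration.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def match_metrics(source, target, max_gap=0.5):
    """Greedily pair each source metric with the most similar unused target metric.

    `max_gap` is a hypothetical cutoff: pairs whose distributions differ by more
    than this KS statistic are left unmatched.
    """
    candidates = sorted(
        (ks_statistic(s_vals, t_vals), s_name, t_name)
        for s_name, s_vals in source.items()
        for t_name, t_vals in target.items()
    )
    pairs, used_source, used_target = {}, set(), set()
    for gap, s_name, t_name in candidates:
        if gap <= max_gap and s_name not in used_source and t_name not in used_target:
            pairs[s_name] = t_name
            used_source.add(s_name)
            used_target.add(t_name)
    return pairs


# Hypothetical toy projects whose metric sets have different names.
source = {"loc": [10, 20, 30, 40, 50], "complexity": [1, 2, 2, 3, 4]}
target = {"lines": [12, 22, 28, 41, 55], "branches": [1, 1, 2, 3, 5]}
matched = match_metrics(source, target)
print(matched)
```

On this toy data the size-like metrics pair with each other and the complexity-like metrics pair with each other, because their value distributions overlap while the cross pairs do not. A defect model trained on the source project could then be applied to the target project through the matched columns.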
