Joint feature representation learning and progressive distribution matching for cross-project defect prediction

Quanyi Zou, Xiaowei Gu, Lu Lu, Shaojian Qiu, Zhanyu Yang

doi:10.1016/j.infsof.2021.106588

Abstract

Cross-Project Defect Prediction (CPDP) aims to leverage knowledge from label-rich source software projects to improve prediction in a label-poor target software project. Existing CPDP methods have two major flaws. First, they consider only the global feature representation and ignore the local relationships between instances of the same category from different projects, which results in ambiguous predictions near the decision boundary. Second, CPDP methods based on pseudo-labels assume that the conditional distribution can be matched in a single step, provided that the target-project instances are assigned correct pseudo-labels. However, because of the large gap between projects, the pseudo-labels deviate seriously from the real labels. To address these issues, this paper proposes a novel CPDP method named Joint Feature Representation with Double Marginalized Denoising Autoencoders (DMDA_JFR). The method consists of two parts: joint feature representation learning and progressive distribution matching. Two novel autoencoders jointly learn the global and local feature representations. To achieve progressive distribution matching, we introduce a repeated pseudo-labeling strategy, which allows the distributions to be matched after each stacked layer is learned rather than in one step. The effectiveness of the proposed method was evaluated through experiments on 10 open-source projects, comprising 29 software releases from the PROMISE repository. Overall, the experimental results show that the proposed method outperformed several state-of-the-art baseline CPDP methods. We conclude that (1) joint deep representations are promising for CPDP compared with methods that consider only the global feature representation, and (2) progressive distribution matching is more effective for adapting probability distributions in CPDP than existing pseudo-label-based CPDP methods.
• This paper proposes a novel CPDP method named Joint Feature Representation with Double Marginalized Denoising Autoencoders (DMDA-JFR).
• The method consists of two parts: joint feature representation learning and progressive distribution matching.
• Two novel autoencoders jointly learn the global and local feature representations.
• A repeated pseudo-labeling strategy progressively matches the distributions.
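
The idea of matching distributions after each stacked layer, rather than once at the end, can be illustrated with a minimal sketch. This is not the authors' DMDA_JFR implementation: it uses a standard closed-form marginalized denoising autoencoder layer, and two swapped-in simplifications that are assumptions of this example, namely a nearest-centroid rule to produce pseudo-labels and a class-wise mean shift as the conditional alignment step. The function names (`mda_layer`, `nearest_centroid_labels`, `progressive_match`) and all parameter values are hypothetical.

```python
import numpy as np


def mda_layer(X, corruption=0.5, reg=1e-5):
    """One marginalized denoising autoencoder layer, solved in closed form.

    X has shape (n_samples, n_features). Returns (W, H) with H = tanh([X, 1] W^T).
    """
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])      # append a constant bias column
    S = Xb.T @ Xb                             # scatter matrix, shape (d+1, d+1)
    q = np.full(d + 1, 1.0 - corruption)      # survival probability of each feature
    q[-1] = 1.0                               # the bias column is never corrupted
    Q = S * np.outer(q, q)                    # expected scatter of corrupted inputs
    np.fill_diagonal(Q, q * np.diag(S))
    P = S[:d, :] * q[None, :]                 # expected clean/corrupted cross term
    # W = P (Q + reg*I)^-1; Q is symmetric, so solve the transposed system.
    W = np.linalg.solve(Q + reg * np.eye(d + 1), P.T).T
    return W, np.tanh(Xb @ W.T)


def nearest_centroid_labels(H_src, y_src, H_tgt):
    """Assumed pseudo-labeler: assign each target row to the closest source class mean."""
    classes = np.unique(y_src)
    centroids = np.stack([H_src[y_src == c].mean(axis=0) for c in classes])
    dists = ((H_tgt[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


def progressive_match(X_src, y_src, X_tgt, n_layers=3, corruption=0.5):
    """Stack layers; after each one, align class means and refresh the pseudo-labels."""
    H_src, H_tgt = X_src, X_tgt
    y_tgt = nearest_centroid_labels(H_src, y_src, H_tgt)
    n_src = len(X_src)
    for _ in range(n_layers):
        # Learn the layer on source and target instances together.
        _, H = mda_layer(np.vstack([H_src, H_tgt]), corruption)
        H_src, H_tgt = H[:n_src], H[n_src:].copy()
        # Assumed alignment step: move each pseudo-labeled target class
        # so that its mean coincides with the matching source class mean.
        for c in np.unique(y_src):
            mask = y_tgt == c
            if mask.any():
                H_tgt[mask] += H_src[y_src == c].mean(axis=0) - H_tgt[mask].mean(axis=0)
        # Re-estimate the pseudo-labels on the newly aligned features.
        y_tgt = nearest_centroid_labels(H_src, y_src, H_tgt)
    return H_src, H_tgt, y_tgt
```

Because the pseudo-labels are recomputed after every layer, an early labeling mistake can be corrected later instead of being fixed by a single matching step. A defect classifier would then be trained on `H_src` and applied to `H_tgt`.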

Journal: Information and Software Technology
Publication Date: Apr 7, 2021
Citations: 22

Similar Papers

Simplify Your Neural Networks: An Empirical Study on Cross-Project Defect Prediction
Ruchika Malhotra ... Abuzar Ahmed Khan
14 Sep 2021

Improving Cross-Project Defect Prediction Methods with Data Simplification
Sousuke Amasaki ... Tomoyuki Yokogawa
01 Aug 2015

Multi-source Cross Project Defect Prediction with Joint Wasserstein Distance and Ensemble Learning
Quanyi Zou ... Zhanyu Yang
01 Oct 2021

MVSE: Effort-Aware Heterogeneous Defect Prediction via Multiple-View Spectral Embedding
Zhou Xu ... Yong Wang
01 Jul 2019
