Heterogeneous defect prediction with two-stage ensemble learning

Zhiqiang Li, Hongyu Zhang, Xiao-Yuan Jing, Xiaoke Zhu, Shi Ying, Baowen Xu

doi:10.1007/s10515-019-00259-1

Abstract

Heterogeneous defect prediction (HDP) refers to predicting defect-prone software modules in one project (the target) using heterogeneous data collected from other projects (the source). Several HDP methods have been proposed recently. However, these methods do not sufficiently account for two characteristics of defect data: (1) the data may be linearly inseparable, and (2) the data may be highly imbalanced. These two characteristics make it challenging to build an effective HDP model. In this paper, we propose a novel Two-Stage Ensemble Learning (TSEL) approach to HDP, which consists of two stages: an ensemble multi-kernel domain adaptation (EMDA) stage and an ensemble data sampling (EDS) stage. In the EMDA stage, we develop an Ensemble Multiple Kernel Correlation Alignment (EMKCA) predictor, which combines the advantages of multiple kernel learning and domain adaptation techniques. In the EDS stage, we employ the RESample with replacement (RES) technique to learn multiple different EMKCA predictors and combine them with an average ensemble. Together, these two stages create an ensemble of defect predictors. Extensive experiments on 30 public projects show that the proposed TSEL approach outperforms a range of competing methods. The improvements are 20.14–33.92% in AUC, 36.05–54.78% in f-measure, and 5.48–19.93% in balance.
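
The EDS stage described in the abstract (resample the training data with replacement, train one predictor per sample, then average their outputs) can be illustrated with a minimal sketch. This is not the authors' implementation: `CentroidScorer` is a hypothetical stand-in for the EMKCA predictor, the function names and the `n_members` and `seed` parameters are assumptions made for illustration, and redrawing a sample that contains only one class is also an assumption, since the abstract does not say how RES handles that case.

```python
import random
from statistics import fmean


def resample_with_replacement(rows, labels, rng):
    """Draw len(rows) examples uniformly with replacement."""
    n = len(rows)
    idx = [rng.randrange(n) for _ in range(n)]
    return [rows[i] for i in idx], [labels[i] for i in idx]


class CentroidScorer:
    """Hypothetical stand-in base predictor (NOT EMKCA).

    Scores a module by its relative distance to the class centroids;
    a score near 1 means "closer to the defective centroid".
    """

    def fit(self, rows, labels):
        pos = [r for r, y in zip(rows, labels) if y == 1]
        neg = [r for r, y in zip(rows, labels) if y == 0]
        self.pos = [fmean(col) for col in zip(*pos)]
        self.neg = [fmean(col) for col in zip(*neg)]
        return self

    def predict_score(self, row):
        d_pos = sum((a - b) ** 2 for a, b in zip(row, self.pos)) ** 0.5
        d_neg = sum((a - b) ** 2 for a, b in zip(row, self.neg)) ** 0.5
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total


def fit_ensemble(rows, labels, n_members=10, seed=0):
    """Train one base predictor per resampled copy of the data."""
    if len(set(labels)) < 2:
        raise ValueError("training data must contain both classes")
    rng = random.Random(seed)
    members = []
    while len(members) < n_members:
        xs, ys = resample_with_replacement(rows, labels, rng)
        if len(set(ys)) < 2:
            continue  # assumption: redraw samples that lost a class
        members.append(CentroidScorer().fit(xs, ys))
    return members


def predict_average(members, row):
    """Average ensemble: mean of the members' scores."""
    return fmean(m.predict_score(row) for m in members)


# Toy, made-up data: six clean modules (0) and two defective ones (1).
rows = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.0],
        [0.2, 0.2], [0.1, 0.1], [2.0, 2.1], [2.2, 1.9]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

members = fit_ensemble(rows, labels, n_members=10, seed=0)
score_defective = predict_average(members, [2.1, 2.0])
score_clean = predict_average(members, [0.1, 0.1])
```

Because each member sees a different resampled copy of the data, the averaged score depends less on any single draw. A variant that biases each sample toward a balanced class distribution would address the imbalance more directly, but it is not shown because the abstract does not specify it.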

Journal: Automated Software Engineering
Publication Date: Jun 4, 2019
Citations: 45

Similar Papers

Heterogeneous Defect Prediction Through Multiple Kernel Learning and Ensemble Learning
Zhiqiang Li ... Xiao-Yuan Jing
01 Sep 2017

Heterogeneous Defect Prediction through Correlation-Based Selection of Multiple Source Projects and Ensemble Learning
Eunseob Kim ... Jongmoon Baik
01 Dec 2021

Learning Target Predictive Function without Target Labels
Chun-Wei Seah ... Yew-Soon Ong
01 Dec 2012

Deep learning-based domain adaptation for a generalized detection of wear phenomena during blanking
Christian Kubik ... Peter Groche
Manufacturing Letters | VOL. 35
01 Aug 2023