Improving Cross-Company Defect Prediction with Data Filtering

Xiao Yu, Jin Liu, Xingyu Peng, Weiqiang Peng

doi:10.1142/s0218194017400046

Abstract

Defect prediction aims to estimate software reliability by learning from historical defect data. Cross-company defect prediction (CCDP) is a practical approach that trains a prediction model on one or more projects from a source company and then applies the model to a target company. Unfortunately, a large amount of irrelevant cross-company (CC) data usually makes it difficult to build a high-performance CCDP model. To address this issue, this paper proposes a data filtering method based on agglomerative clustering (DFAC) for CCDP. First, DFAC combines within-company (WC) instances with CC instances and uses an agglomerative clustering algorithm to group them. Second, DFAC selects the subclusters that contain at least one WC instance and collects the CC instances in the selected subclusters into a new CC dataset. Experimental results on 15 public PROMISE datasets show that, compared with existing data filtering methods, DFAC increases the pd value, reduces the pf value, and achieves a higher [Formula: see text]-measure value.
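The two DFAC steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the linkage criterion, the distance metric, or how the number of subclusters is chosen, so the average linkage, Euclidean distance, and the `n_clusters` parameter below are all assumptions. The function names and toy metric vectors are hypothetical, and class labels are omitted for brevity.

```python
from math import dist


def agglomerative_clusters(points, n_clusters):
    """Naive agglomerative clustering; returns clusters as lists of point indices.

    Assumption: average linkage with Euclidean distance, stopping once
    n_clusters (>= 1) clusters remain. The abstract specifies neither.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def dfac_filter(wc_instances, cc_instances, n_clusters):
    """Keep the CC instances that share a subcluster with at least one WC instance."""
    # Step 1: combine WC and CC instances and cluster them together.
    combined = list(wc_instances) + list(cc_instances)
    n_wc = len(wc_instances)
    kept = []
    for cluster in agglomerative_clusters(combined, n_clusters):
        # Step 2: select subclusters containing a WC instance (index < n_wc)
        # and collect their CC members.
        if any(idx < n_wc for idx in cluster):
            kept.extend(idx - n_wc for idx in cluster if idx >= n_wc)
    return [cc_instances[k] for k in sorted(kept)]


# Toy metric vectors (hypothetical values, not taken from the PROMISE datasets).
wc = [(1.0, 1.0), (1.2, 0.9)]
cc = [(1.1, 1.0), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9)]
filtered_cc = dfac_filter(wc, cc, n_clusters=2)
print(filtered_cc)  # [(1.1, 1.0), (0.9, 1.1)]
```

In this toy run the two CC instances near the WC data are kept, while the two distant ones fall into a subcluster with no WC instance and are discarded. The pairwise search is kept simple for readability and would be too slow for large datasets.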

Journal: International Journal of Software Engineering and Knowledge Engineering
Publication Date: Nov 1, 2017
Citations: 13

Similar Papers

A Multi-Source TrAdaBoost Approach for Cross-Company Defect Prediction
Xiao Yu ... Guoping Nie
01 Jul 2016

Object-Oriented Metrics for Defect Prediction
Satwinder Singh ... Rozy Singla
13 Jun 2018

Cross-company defect prediction via semi-supervised clustering-based data filtering and MSTrA-based transfer learning
Xiao Yu ... Yiheng Jian
Soft Computing | VOL. 22
08 Mar 2018

On the relative value of cross-company and within-company data for defect prediction
Burak Turhan ... Tim Menzies
Empirical Software Engineering | VOL. 14
07 Jan 2009
