Improving transfer learning for software cross-project defect prediction

Osayande P Omondiagbe, Stephen G Macdonell, Sherlock A Licorish

doi:10.1007/s10489-024-05459-1

Open Access

Journal: Applied Intelligence
Publication Date: Apr 1, 2024
Citations: 2
License type: CC BY 4.0

Abstract

Software cross-project defect prediction (CPDP) uses cross-project (CP) data to compensate for the lack of data needed to train well-performing software defect prediction (SDP) classifiers in the early stages of new software projects. Because the CP data (known as the source) may differ from the new project’s data (known as the target), it is difficult for CPDP classifiers to perform well. In particular, it is the mismatch between the source and target data distributions that creates this difficulty. Transfer learning-based CPDP classifiers are designed to minimize these distribution differences. The first transfer learning-based CPDP classifiers treated these differences equally, thereby degrading prediction performance. To address this, recent research has proposed the Weighted Balanced Distribution Adaptation (W-BDA) method, which leverages the importance of both distribution differences to improve classification performance. Although W-BDA has been shown to improve model performance in CPDP and to tackle class imbalance by balancing the class proportion of each domain, research to date has not considered model performance as the amount of target data increases. We provide the first investigation of the effects of increasing the target data when leveraging the importance of both distribution differences. We extend the initial W-BDA method and call this extension the W-BDA$^{+}$ method. To evaluate the effectiveness of W-BDA$^{+}$ for improving CPDP performance, we conduct eight experiments on 18 projects from four datasets, where data sampling was performed with different sampling methods. Data sampling was performed only on the baseline methods, and not on our proposed W-BDA$^{+}$ or the original W-BDA, because data sampling issues do not exist for these two methods. We evaluate our method using four complementary indicators (i.e., Balanced Accuracy, AUC, F-measure and G-Measure).
Our findings reveal an average improvement of 6%, 7.5%, 10% and 12% for these four indicators when W-BDA$^{+}$ is compared to the original W-BDA and five other baseline methods (for all four of the sampling methods used). Also, as the target-to-source ratio is increased with different sampling methods, we observe a decrease in performance for the original W-BDA, with our W-BDA$^{+}$ approach outperforming the original W-BDA in most cases. Our results highlight the importance of being aware of the effect of the increasing availability of target data in CPDP scenarios when using a method that can handle the class imbalance problem.
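
For readers unfamiliar with the four indicators named above, the following is a minimal Python sketch of how they are commonly computed for a binary defect classifier. It is not taken from the paper. The function names, the 0.5 decision threshold, and the example labels and scores are all hypothetical. The abstract does not define G-Measure, so the sketch assumes one common choice in the defect prediction literature: the harmonic mean of recall and specificity. AUC is computed here with the rank-based (Mann-Whitney) formulation.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def auc_from_scores(y_true, scores):
    """AUC as the fraction of (defective, clean) pairs ranked correctly.

    Ties between a defective and a clean score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def indicators(y_true, scores, threshold=0.5):
    """Compute the four indicators; the threshold is an assumed default."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    recall = safe_div(tp, tp + fn)        # probability of detection
    specificity = safe_div(tn, tn + fp)   # 1 - probability of false alarm
    precision = safe_div(tp, tp + fp)
    return {
        "balanced_accuracy": (recall + specificity) / 2,
        "auc": auc_from_scores(y_true, scores),
        "f_measure": safe_div(2 * precision * recall, precision + recall),
        # Assumed definition: harmonic mean of recall and specificity.
        "g_measure": safe_div(2 * recall * specificity, recall + specificity),
    }


if __name__ == "__main__":
    # Hypothetical imbalanced target project: 3 defective, 5 clean modules.
    labels = [1, 1, 0, 0, 0, 0, 1, 0]
    probs = [0.9, 0.4, 0.3, 0.6, 0.1, 0.2, 0.8, 0.05]
    for name, value in indicators(labels, probs).items():
        print(f"{name}: {value:.3f}")
```

Balanced Accuracy and G-Measure both combine recall with specificity, which is why they are less easily inflated by the majority (clean) class than plain accuracy; this is one reason such indicators are favoured on imbalanced defect datasets.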
