Cross-Project Defect Prediction Using Transfer Learning with Long Short-Term Memory Networks

Hongwei Tao, Xiaoxu Niu, Lianyou Fu, Songtao Shang, Haoran Chen, Yang Xian, Qiaoling Cao

doi:10.1049/2024/5550801

Abstract

As the number of software projects grows, within-project defect prediction (WPDP) can no longer meet demand, and cross-project defect prediction (CPDP) is playing an increasingly significant role in software engineering. Classic CPDP methods mainly rely on metric features to predict defects. However, these approaches fail to consider the rich semantic information in source code, which usually captures the relationship between software defects and their context. Because traditional methods cannot exploit this information, their performance is often unsatisfactory. In this paper, a transfer long short-term memory (TLSTM) network model is proposed, which extracts transferable semantic features by adding a transfer learning algorithm to the long short-term memory (LSTM) network. The traditional metric features and the semantic features are then combined for CPDP. First, abstract syntax trees (ASTs) are generated from the source code. Second, the AST node contents are converted into integer vectors that serve as inputs to the TLSTM model, which extracts the semantic features of the program. In parallel, transferable metric features are extracted by transfer component analysis (TCA). Finally, the semantic features and metric features are combined and fed into a logistic regression (LR) classifier for training. According to results on several open-source projects from the PROMISE repository, the presented TLSTM model performs better on the F-measure than other machine learning and deep learning models. The TLSTM model built with a single feature type achieves improvements of 0.7% and 2.1% on Log4j-1.2 and Xalan-2.7, respectively. The model trained on the combined features is called transfer long short-term memory for defect prediction (DPTLSTM); it achieves improvements of 2.9% and 5% on Synapse-1.2 and Xerces-1.4.4, respectively. Both results demonstrate the superiority of the proposed model on the CPDP task. This is because LSTM networks capture long-term dependencies in sequence data and extract features that contain source code structure and context information. It can be concluded that: (1) the TLSTM model has the advantage of preserving information, which lets it better retain the semantic features related to software defects; (2) compared with a CPDP model trained on traditional metric features alone, combining semantic features and metric features effectively improves model performance.
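The first two steps of the pipeline (generating an AST and converting node contents into integer vectors) can be illustrated with a minimal sketch. It is not the authors' implementation: the projects named in the abstract are Java, whereas this sketch uses Python's standard `ast` module as a stand-in parser. The choice of which node types to keep, the helper names `node_tokens`, `build_vocab`, and `encode`, and the zero-padding scheme are all assumptions made for illustration. The LSTM, TCA, and LR stages are not shown.

```python
import ast

# Assumption: only a few node kinds are kept as tokens (definitions,
# simple calls, control flow). The abstract does not list the node types.
CONTROL_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.Return)


def node_tokens(source: str) -> list[str]:
    """Return tokens for selected AST nodes in depth-first order."""
    tokens: list[str] = []

    def visit(node: ast.AST) -> None:
        if isinstance(node, ast.FunctionDef):
            tokens.append("def:" + node.name)
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            tokens.append("call:" + node.func.id)
        elif isinstance(node, CONTROL_NODES):
            tokens.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


def build_vocab(token_lists: list[list[str]]) -> dict[str, int]:
    """Assign each distinct token an integer id, starting at 1.

    Id 0 is reserved for padding.
    """
    vocab: dict[str, int] = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def encode(tokens: list[str], vocab: dict[str, int], length: int) -> list[int]:
    """Map tokens to ids, drop unknown tokens, then truncate or zero-pad."""
    ids = [vocab[tok] for tok in tokens if tok in vocab][:length]
    return ids + [0] * (length - len(ids))


# Hypothetical input file used only for illustration.
SAMPLE = """
def total(xs):
    s = 0
    for x in xs:
        if x > 0:
            s += abs(x)
    return s
"""

sample_tokens = node_tokens(SAMPLE)
sample_vocab = build_vocab([sample_tokens])
sample_vector = encode(sample_tokens, sample_vocab, length=8)
```

In a cross-project setting, the vocabulary would presumably be built over files from both the source and target projects so that the same token maps to the same integer in each; the fixed-length vectors would then be passed to an embedding layer in front of the LSTM.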

Journal: IET Software
Publication Date: Mar 18, 2024
Citations: 1
License type: CC BY 4.0

Similar Papers

Seml: A Semantic LSTM Model for Software Defect Prediction
Hongliang Liang ... Zhuosi Xie
IEEE Access | VOL. 7
01 Jan 2019

Deep Semantic Feature Learning for Software Defect Prediction
Song Wang ... Jaechang Nam
IEEE Transactions on Software Engineering | VOL. 46
01 Dec 2020

Cross-Project Defect Prediction Based on Domain Adaptation and LSTM Optimization
Khadija Javed ... Mudasir Ahmad Wani
Algorithms | VOL. 17
24 Apr 2024

Performance Evaluation Model of Corporate Financial Sustainability based on Swarm Algorithm
Lingjie Chang
Scalable Computing: Practice and Experience | VOL. 25
16 Jun 2024