Cross-Project Defect Prediction Considering Multiple Data Distribution Simultaneously

Yu Zhao, Xiaoying Chen, Qiao Yu, Yi Zhu

doi:10.3390/sym14020401

Open Access

Abstract

Software testing is currently the main method for finding software defects, and symmetric testing and related methods are widely used, but these testing methods consume substantial resources. Software defect prediction methods help allocate testing resources sensibly by predicting the defect proneness of software modules. Cross-project defect prediction methods are particularly advantageous when datasets for the target project are missing. However, most cross-project defect prediction methods assume a single source project and a single target project. As the number of public datasets continues to grow, so do the number of available source projects and the amount of defect information. This paper therefore explores the problems that arise when multiple source projects are used for defect prediction. There are two. First, in practice it is not possible to know in advance which source project will yield the best prediction performance when used to build the model. Second, building the model from an inappropriate source project can lower prediction performance. To address these problems, which were observed experimentally, this paper proposes a multi-source cross-project defect prediction method, MSCPDP. Experimental results on the AEEEM and PROMISE datasets show that MSCPDP effectively solves both problems and outperforms most current state-of-the-art cross-project defect prediction methods on F1 and AUC. Compared with six cross-project defect prediction methods, the F1 median is improved by 3.51%, 3.92%, 36.06%, 0.49%, 17.05%, and 9.49%, and the AUC median changes by −3.42%, 8.78%, 0.96%, −2.21%, −7.94%, and 5.13%.
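
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how F1 and AUC are commonly computed for a binary defect predictor, using only the Python standard library. It is not the authors' evaluation code. The function names, the toy labels and scores, and the `relative_median_gain` helper are all assumptions introduced here for illustration; in particular, the abstract does not state whether its percentage figures are relative or absolute differences, so the relative form shown is only one possible reading.

```python
from statistics import median


def f1_score(y_true, y_pred):
    """F1 for the defective class (label 1): harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no defective module was correctly flagged
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """AUC as the fraction of (defective, clean) pairs in which the defective
    module receives the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one defective and one clean module")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def relative_median_gain(ours, baseline):
    """Hypothetical helper: relative change of the median, in percent.
    Assumption: the source does not say how its percentages were derived."""
    base = median(baseline)
    return 100.0 * (median(ours) - base) / base


# Toy data (made up for illustration): 1 = defective module, 0 = clean module.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_predictions = [1, 0, 0, 1, 1, 0]
toy_scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]

toy_f1 = f1_score(toy_labels, toy_predictions)
toy_auc = auc(toy_labels, toy_scores)
```

F1 depends on a hard defective/clean decision, while AUC depends only on how the predicted scores rank the modules, which is why a method can gain on one metric and lose on the other.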

Highlights

At present, the mainstream method for finding code defects in software modules is still software testing, for example, the symmetrical test method [1,2].
This paper proposes a new solution: simultaneously using the knowledge of multiple source projects related to the target project to construct a defect prediction model.
This paper first explores the practical significance and existing problems of multi-source cross-project defect prediction. To address those problems, it proposes a multi-source cross-project defect prediction method, MSCPDP, which handles the differences in data distribution between multiple source projects and the target project when data from multiple source projects are used at the same time.

Summary

Introduction

The mainstream method for finding code defects in software modules is still software testing, for example, the symmetrical test method [1,2]. This type of method relies mainly on automatic or semi-automatic generation of a large number of test cases to exercise code blocks in the hope of finding software defects. It is very effective, but it wastes resources. With software defect prediction, the program modules that may be defective are instead tested in a targeted manner according to the predicted results.

Journal: Symmetry
Publication Date: Feb 17, 2022
Citations: 9
License type: CC BY 4.0

Similar Papers

Improving Cross-Project Defect Prediction Methods with Data Simplification
Sousuke Amasaki ... Tomoyuki Yokogawa
01 Aug 2015

Revisiting Supervised and Unsupervised Methods for Effort-Aware Cross-Project Defect Prediction
Chao Ni ... Qing Gu
IEEE Transactions on Software Engineering | VOL. 48
01 Mar 2022

A Hybrid Multiple Models Transfer Approach for Cross-Project Software Defect Prediction
Shenggang Zhang ... Yue Yan
International Journal of Software Engineering and Knowledge Engineering | VOL. 33
19 Dec 2022

Joint feature representation learning and progressive distribution matching for cross-project defect prediction
Quanyi Zou ... Xiaowei Gu
Information and Software Technology | VOL. 137
07 Apr 2021
