Abstract

Software testing is currently the main method for finding software defects, and symmetric testing and related methods are widely used, but these testing methods waste a lot of resources. Software defect prediction methods can allocate testing resources reasonably by predicting the defect tendency of software modules. Cross-project defect prediction methods have a large advantage when data for the target project are missing. However, most cross-project defect prediction methods are designed for a single source project and a single target project. As the number of public datasets continues to grow, the number of available source projects and the amount of defect information are increasing. This paper therefore explores the problems that arise when multiple source projects are available for defect prediction. There are two problems. First, in practice it is not possible to know in advance which source project will yield the best prediction performance when used to build the model. Second, if an inappropriate source project is used to build the model, prediction performance can degrade. Based on the problems found in the experiments, the paper proposes a multi-source-based cross-project defect prediction method, MSCPDP. Experimental results on the AEEEM and PROMISE datasets show that the proposed MSCPDP method effectively solves the above two problems and outperforms most current state-of-the-art cross-project defect prediction methods on F1 and AUC. Compared with six cross-project defect prediction methods, the median F1 is improved by 3.51%, 3.92%, 36.06%, 0.49%, 17.05%, and 9.49%, and the median AUC changes by −3.42%, 8.78%, 0.96%, −2.21%, −7.94%, and 5.13%.

Highlights

  • At present, software testing, for example the symmetrical test method [1,2], remains the mainstream way to find code defects in software modules

  • This paper proposes a new solution, i.e., simultaneously using the knowledge of multiple source projects related to the target project to construct a defect prediction model

  • This paper first explores the practical significance and existing problems of cross-project defect prediction based on multiple sources. To address these problems, we propose a multi-source-based cross-project defect prediction method, MSCPDP, which handles the differences in data distribution between multiple source projects and the target project when data from multiple source projects are used at the same time

Summary

Introduction

The mainstream method for finding code defects in software modules is still software testing, for example, the symmetrical test method [1,2]. Methods of this type rely mainly on automatically or semi-automatically generating a large number of test cases to exercise code blocks in the hope of finding software defects. They are very effective, but they are wasteful of resources. Software defect prediction instead predicts the defect tendency of software modules, so that those program modules that may be defective are tested in a targeted manner according to the predicted results.
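To make the idea of predicting which modules may be defective more concrete, the following is a minimal sketch of a multi-source, cross-project setup. It is not the MSCPDP method. It assumes two hypothetical source projects (`proj_a`, `proj_b`) and one target project, each described by two illustrative module metrics (lines of code and cyclomatic complexity), with invented toy values. Each project is standardized on its own, as one simple assumed way to reduce scale differences between projects, the source projects are pooled, and a nearest-centroid classifier stands in for whatever learner one would actually use. The result is scored with F1, one of the two measures reported in the abstract.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each metric column within a single project."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def centroid(rows):
    """Mean vector of a list of feature vectors."""
    return [mean(c) for c in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_nearest_centroid(X, y):
    """Return (clean centroid, defective centroid) from labeled modules."""
    clean = [x for x, label in zip(X, y) if label == 0]
    defective = [x for x, label in zip(X, y) if label == 1]
    return centroid(clean), centroid(defective)


def predict(model, X):
    """Label a module 1 (defect-prone) if it is closer to the defective centroid."""
    clean_c, defect_c = model
    return [1 if sq_dist(x, defect_c) < sq_dist(x, clean_c) else 0 for x in X]


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the defective class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: [lines of code, cyclomatic complexity], label 1 = defective.
sources = {
    "proj_a": ([[100, 5], [900, 40], [150, 7], [1200, 55]], [0, 1, 0, 1]),
    "proj_b": ([[20, 2], [30, 3], [200, 20], [250, 22]], [0, 0, 1, 1]),
}
target_X = [[50, 4], [60, 5], [500, 30], [700, 45]]
target_y = [0, 0, 1, 1]  # used only for evaluation, never for training

# Pool all source projects after standardizing each one separately.
pooled_X, pooled_y = [], []
for X, y in sources.values():
    pooled_X.extend(standardize(X))
    pooled_y.extend(y)

model = train_nearest_centroid(pooled_X, pooled_y)
predictions = predict(model, standardize(target_X))
f1 = f1_score(target_y, predictions)
print(predictions, f1)
```

Modules predicted as 1 would be the ones tested first. Pooling sidesteps the question of which single source project to pick, but it does nothing sophisticated about distribution differences between projects, which is the part the paper's method is meant to address.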
