Abstract

<p>In software development practice, the project under development is often brand new. Defect prediction for such a project must rely on data collected from other, similar projects (the source projects) to build a prediction model, which is then applied to the project under development (the target project). However, a model built from source-project data often fails to achieve the desired prediction performance on the target project, mainly because the data distributions of the source and target projects differ substantially. This difference shows up both in the feature distributions of the projects and in the individual instances. To address this problem, a cross-project defect prediction method is proposed that works at both the feature level and the instance level. The method first aligns the feature distributions of the two projects using the available target-project data and the source-project data. It then selects the labeled source-project instances that are similar to the unlabeled target-project instances, and finally builds a defect prediction model from the selected source-project instances. Cross-project defect prediction experiments were carried out on the ReLink and PROMISE datasets. Compared with the classic instance-based cross-project defect prediction method, the proposed method achieves significant improvements in F-measure and AUC; compared with within-project defect prediction, it achieves comparable performance.</p>
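
The three steps named in the abstract (align features, select similar source instances, train on the selection) can be illustrated with a minimal sketch. The abstract does not specify the alignment technique, the similarity measure, or the classifier, so everything below is an assumption chosen for brevity: per-project z-score standardization stands in for feature alignment, Euclidean k-nearest-neighbour matching stands in for instance selection, and a nearest-centroid classifier stands in for the prediction model. All function names and the toy metric values are hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_align(rows):
    """Assumed alignment step: standardize each feature column within one project."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # fall back to 1.0 for constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def select_similar(source_x, target_x, k=2):
    """Assumed selection step: keep each source instance that is among the
    k nearest (Euclidean) neighbours of at least one target instance."""
    chosen = set()
    for t in target_x:
        ranked = sorted(range(len(source_x)),
                        key=lambda i: math.dist(source_x[i], t))
        chosen.update(ranked[:k])
    return sorted(chosen)


def fit_centroids(x, y):
    """Assumed model: one mean feature vector per class label."""
    centroids = {}
    for label in set(y):
        members = [r for r, lab in zip(x, y) if lab == label]
        centroids[label] = [mean(c) for c in zip(*members)]
    return centroids


def predict(centroids, x):
    """Assign each instance the label of its closest centroid."""
    return [min(centroids, key=lambda lab: math.dist(centroids[lab], r))
            for r in x]


# Toy data (hypothetical): two metrics per module; 1 = defective, 0 = clean.
# The target project records the same metrics on a different scale.
source_x = [[100, 2], [120, 3], [900, 20], [1000, 25], [500, 10], [110, 2]]
source_y = [0, 0, 1, 1, 1, 0]
target_x = [[10, 1], [95, 9], [12, 1], [90, 8]]  # unlabeled

# Step 1: align feature distributions of the two projects.
src = zscore_align(source_x)
tgt = zscore_align(target_x)

# Step 2: select labeled source instances similar to the target instances.
selected = select_similar(src, tgt, k=2)

# Step 3: build the model from the selected source instances only.
model = fit_centroids([src[i] for i in selected],
                      [source_y[i] for i in selected])
predictions = predict(model, tgt)
print(selected, predictions)
```

Standardizing each project separately is what lets the differently scaled target metrics be compared with the source metrics at all; without that step the nearest-neighbour selection would be dominated by the scale gap rather than by instance similarity.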

