Improving Cross-Project Software Defect Prediction Method Through Transformation and Feature Selection Approach

Yahaya Zakariyau Bala, Khaironi Yatim Sharif, Pathiah Abdul Samat, Noridayu Manshor

doi:10.1109/access.2022.3231456

Open Access

https://doi.org/10.1109/access.2022.3231456

Journal: IEEE Access	Publication Date: Jan 1, 2023
Citations: 5	License type: CC BY-NC-ND 4.0

Affiliation: Universiti Putra Malaysia, Federal University Kashere

Abstract

In traditional software defect prediction, the historical record (dataset) of a single project is partitioned into training and testing data. When the project to be predicted is new and has no such record, this approach cannot be applied. An alternative is cross-project defect prediction, in which the historical record of one project (the source) is used to predict the defect status of another project (the target). Cross-project defect prediction thus removes the traditional method's dependence on historical records from the same project. However, its performance is relatively low because of distribution differences between the source and target projects. Furthermore, the software defect datasets used for cross-project defect prediction are high-dimensional, and some of their features are irrelevant and contribute to low performance. To address these two issues, this study proposes a transformation and feature selection approach that reduces both the distribution difference and the feature dimensionality in cross-project defect prediction. A comparative experiment was conducted on the publicly available AEEEM datasets. Analysis of the results shows that the proposed approach, in conjunction with random forest as the classification model, outperformed four other state-of-the-art cross-project defect prediction methods on the commonly used performance evaluation metric F1-score.
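
As a rough illustration of two ideas named in the abstract, the sketch below shows how a transformation can bring source and target metrics onto a comparable scale, and how the F1-score is computed from predictions. The abstract does not say which transformation the authors use, so the log-plus-standardization step here is an assumption chosen for illustration, not the paper's method. The function names and the toy metric values are hypothetical, and the feature selection and random forest steps are omitted.

```python
import math
from statistics import mean, pstdev


def log_standardize(rows):
    """Apply log1p to every metric, then z-score each column within one project.

    Assumed transformation for illustration only; the abstract does not
    specify the one used in the paper.
    """
    logged = [[math.log1p(v) for v in row] for row in rows]
    cols = list(zip(*logged))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # fall back to 1.0 for constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in logged]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the defective class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy metrics (e.g. lines of code, complexity); not AEEEM data.
source = [[10, 1], [200, 8], [50, 3], [400, 12]]
target = [[1000, 20], [20000, 90], [5000, 40], [40000, 150]]

# Each project is transformed on its own, so both end up centred at zero
# with unit spread even though their raw scales differ by orders of magnitude.
src_t = log_standardize(source)
tgt_t = log_standardize(target)
```

In a full cross-project experiment, a classifier would be trained on the transformed source rows and evaluated with `f1_score` on predictions for the transformed target rows.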

Full Text