Abstract
The primary goal of software defect prediction (SDP) is to predict the software defects for a specific software using historical data or data from past releases of software projects. The existing state of arts on SDP primarily discusses two prediction scenarios: Within Project Defect Prediction (WPDP) and Cross Project Defect Prediction (CPDP). The prediction model belongs to the WPDP scenario, which means that the model is trained and tested on different parts of the same dataset or trained on the dataset belonging to the previous version of the same project. While in the CPDP scenario, training and testing occur on different software project datasets. Due to the unavailability of historical datasets or prior releases of software defect datasets, CPDP is more useful in real-life scenarios. So, CPDP analysis is a very challenging issue in the SDP domain. Sometimes, machine learning (ML) models perform poorly due to inadequate training in the CPDP scenario. To support better CPDP performance, we must carefully build an ML model focusing on lower training error and overfitting issues. To address these issues, we have proposed a cross-project data preprocessing method to correlate the metrics of different project datasets, namely Unique Selection of Matched Metrics (USMM), using the KS test and Hungarian method. To further improve the CPDP performance, we have also used the dropout regularized deep learning (DRDL) model. We have deployed 34 software defect datasets to validate the DRDL model and USMM method. The experimental results demonstrate that the DRDL model using the USMM method (DRDL-USMM) is a promising model to enhance the prediction accuracy, and an improvement in the range of 3.3% to 8.5% as compared to the existing works in the CPDP scenario has been found.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: ACM Transactions on Management Information Systems
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.