Abstract
In software fault prediction systems, there are many hindrances for detecting faulty modules, such as missing values or samples, data redundancy, irrelevance features, and correlation. Many researchers have built a software bug prediction (SBP) model, which classify faulty and non-faulty module which are associated with software metrics. Till now very few works has been done which addresses the class imbalance problem in SBP. The main objective of this paper is to reveal the favorable result by feature selection and machine learning methods to detect defective and non-defective software modules. We propose a rudimentary classification based framework Bug Prediction using Deep representation and Ensemble learning (BPDET) techniques for SBP. It combinedly applies by ensemble learning (EL) and deep representation(DR). The software metrics which are used for SBP are mostly conventional. Staked denoising auto-encoder (SDA) is used for the deep representation of software metrics, which is a robust feature learning method. Propose model is mainly divided into two stages: deep learning stage and two layers of EL stage (TEL). The extraction of the feature from SDA in the very first step of the model then applied TEL in the second stage. TEL is also dealing with the class imbalance problem. The experiment mainly performed NASA (12) datasets, to reveal the efficiency of DR, SDA, and TEL. The performance is analyzed in terms of Mathew co-relation coefficient (MCC), the area under the curve (AUC), precision-recall area (PRC), F-measure and Time. Out of 12 dataset MCC values over 11 datasets, ROC values over 6 datasets, PRC values overall 12 datasets and F-measure over 8 datasets surpass the existing state of the art bug prediction methods. We have tested BPDET using Wilcoxon rank sum test which rejects the null hypothesis at α = 0.025. We have also tested the stability of the model over 5, 8, 10, 12, and 15 fold cross-validation and got similar results. Finally, we conclude that BPDET is a stable and outperformed on most of the datasets compared with EL and another state of the art techniques.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.