A software defect prediction method using binary gray wolf optimizer and machine learning algorithms

Hao Wang, Bahman Arasteh, Keyvan Arasteh, Farhad Soleimanian Gharehchopogh, Alireza Rouhi

doi:10.1016/j.compeleceng.2024.109336

Abstract

Context: Software defect prediction means finding defect-prone modules before the testing process, which reduces testing cost and time. Machine learning methods can provide developers with valuable models for classifying faulty software modules.

Problem: An inherent problem of this classification is the large number of features in the training dataset, which reduces the accuracy and precision of the classification results. Selecting the effective features of the training dataset is an NP-hard problem that can be addressed using heuristic algorithms.

Method: In this study, a binary version of the gray wolf optimizer (bGWO) was developed to select the most effective features of the training dataset. Selecting the most influential features can increase the precision and accuracy of the software module classifiers.

Contribution: The main contributions of this study are a binary version of the gray wolf optimization algorithm that selects the effective features and an effective defect predictor built on it. To evaluate the effectiveness of the proposed method, five standard real-world datasets were used for the training and testing stages of the classifier.

Results: The results indicate that, among the 21 features of the training datasets, the basic complexity, sum of operators and operands, lines of code, number of lines containing code and comments, and sum of operands have the greatest effect in predicting software defects. Combining the bGWO method with machine learning algorithms considerably increased the accuracy, precision, recall, and F1 scores.
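As an illustration of the feature-selection idea described in the Method, the following is a minimal sketch of a binary gray wolf optimizer over feature masks. It is not the authors' implementation. It assumes a commonly used sigmoid-transfer binarization, and the names `bgwo_select`, `toy_fitness`, and `INFORMATIVE`, as well as the steepness constant 10 and the population settings, are hypothetical choices made for this example. In the study the fitness would come from a classifier's error on the selected features; here a toy fitness stands in so the sketch stays self-contained.

```python
import math
import random


def bgwo_select(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Search binary feature masks that minimize `fitness` (sketch, not the paper's code)."""
    rng = random.Random(seed)
    # Each wolf is a binary mask: 1 keeps the feature, 0 drops it.
    wolves = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_wolves)]
    best_mask, best_score = None, float("inf")

    for it in range(n_iters):
        ranked = sorted(wolves, key=fitness)
        alpha, beta, delta = ranked[0], ranked[1], ranked[2]  # three leading wolves
        alpha_score = fitness(alpha)
        if alpha_score < best_score:
            best_mask, best_score = list(alpha), alpha_score

        a = 2.0 * (1 - it / n_iters)  # control parameter, decreases linearly from 2
        updated = []
        for wolf in wolves:
            new_wolf = []
            for d in range(n_features):
                x = 0.0
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - wolf[d])
                    x += leader[d] - A * D
                x /= 3.0  # average of the three leader-guided positions
                # Assumed sigmoid transfer: map the continuous position to a bit probability.
                prob = 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))
                new_wolf.append(1 if rng.random() < prob else 0)
            updated.append(new_wolf)
        wolves = updated

    # Also consider the final population before returning.
    final = min(wolves, key=fitness)
    final_score = fitness(final)
    if final_score < best_score:
        best_mask, best_score = list(final), final_score
    return best_mask, best_score


# Hypothetical stand-in for classifier error: pretend five feature indices matter.
INFORMATIVE = {0, 3, 7, 12, 18}


def toy_fitness(mask):
    if sum(mask) == 0:
        return 1.0  # an empty feature set is treated as the worst case
    missed = sum(1 for i in INFORMATIVE if mask[i] == 0)
    # Penalize missing informative features heavily and extra features lightly.
    return missed / len(INFORMATIVE) + 0.01 * sum(mask) / len(mask)


mask, score = bgwo_select(toy_fitness, n_features=21)
print("selected feature indices:", [i for i, bit in enumerate(mask) if bit])
print("fitness:", score)
```

To use this with real data, `toy_fitness` would be replaced by a function that trains a classifier on the columns where the mask is 1 and returns its validation error, optionally with a small penalty on the number of selected features.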

Full Text