Abstract

Objectives: To propose a suitable imbalanced-data classification model that splits the data set into two new data sets, and to test the resulting imbalanced data sets with prediction models. Methods: Imbalanced defect data sets from the PROMISE library are used for the performance evaluation. The results show that three existing prediction classifiers, K-Nearest Neighbor (KNN), Naive Bayes (NB), and Back Propagation (BPN), are highly sensitive to classification imbalance, while Support Vector Machine (SVM) and Extreme Learning Machine (ELM) are more stable. Findings: Applying the SVM and ELM machine learning models improves defect prediction performance, scoring 29% higher than KNN and 19% higher than NB and BPN. Novelty: According to the findings of a comprehensive study, the proposed classification imbalance impact analysis method outperforms existing ones. It transforms the original imbalanced data set into a new data set with an increasing imbalance rate, on which different prediction models can be selected and evaluated.

Keywords: Software Fault Prediction Model; Imbalance Problem Classification; Artificial Intelligence; Smart Debugging; Unbalanced Classification

Highlights

  • Defect prediction is essential to software quality and reliability, and it is a major area of comparative research in modern software engineering

  • The performance of K-Nearest Neighbor (KNN), Naive Bayes (NB), and Back Propagation (BPN)[7] was compared with that of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) on PROMISE data sets, using the Weka tool on the Windows platform, to evaluate the performance stability of the different prediction models

  • The study proposes a new model that uses machine learning techniques such as SVM and ELM to classify imbalanced data from the PROMISE library for software defect prediction
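The idea of deriving data sets with an increasing imbalance rate can be illustrated with a minimal sketch. This is not the authors' procedure (the study used Weka); it assumes one simple strategy, random undersampling of the minority (defective) class, and the function names `imbalance_ratio` and `increase_imbalance` are hypothetical helpers introduced only for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Return the majority-class count divided by the minority-class count."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")
    return max(counts.values()) / min(counts.values())


def increase_imbalance(rows, labels, minority_label, target_ratio, seed=0):
    """Randomly drop minority rows so majority/minority reaches target_ratio.

    Assumption: every majority row is kept, and at least one minority row
    survives, so very large targets may not be reached exactly.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority_label]
    majority_idx = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority_idx or not majority_idx:
        raise ValueError("both classes must be present")
    # Number of minority rows to keep for the requested ratio.
    keep_n = max(1, min(len(minority_idx), int(len(majority_idx) / target_ratio)))
    kept = sorted(majority_idx + rng.sample(minority_idx, keep_n))
    return [rows[i] for i in kept], [labels[i] for i in kept]


if __name__ == "__main__":
    # Toy data: 20 defective modules (label 1) and 80 clean modules (label 0).
    rows = [[i] for i in range(100)]
    labels = [1] * 20 + [0] * 80
    for ratio in (4, 8, 16):
        _, y = increase_imbalance(rows, labels, minority_label=1, target_ratio=ratio)
        print(ratio, imbalance_ratio(y))
```

Each derived data set could then be passed to the classifiers under comparison, so that the change in a chosen performance measure across ratios indicates how stable each model is as the imbalance grows.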

Summary

Introduction

Defect prediction is essential to software quality and reliability, and it is a major area of comparative research in modern software engineering. With the continuous development of machine learning and data mining, numerous defect prediction models have been introduced to address the class imbalance problem. Classifiers are built to reduce errors and increase accuracy. Classification imbalance has gradually become a dominant research topic in software engineering. It refers to an uneven distribution of sample sizes across categories: in a binary classification problem, classification imbalance arises when the sample sizes of the two categories differ greatly. Classification imbalance problems are common and need to be addressed to

