Impact of Unbalanced Classification on the Performance of Software Defect Prediction Models

K J Eldho

doi:10.17485/ijst/v15i6.2193

Abstract

Objectives: To propose a suitable imbalanced-data classification model that splits the dataset into two new datasets, and to test the resulting imbalanced dataset with the prediction models. Methods: The imbalanced defect datasets are taken from the PROMISE library and used for the performance evaluation. The results show that the performance of three existing prediction classifiers, K-Nearest Neighbor (KNN), Naive Bayes (NB), and Back Propagation (BPN), is highly sensitive to classification imbalance, while Support Vector Machine (SVM) and Extreme Learning Machine (ELM) are more stable. Findings: The applied SVM and ELM machine learning models improve defect prediction performance, recording 29% more than KNN and 19% more than NB and BPN. Novelty: According to the findings of a comprehensive study, the proposed classification imbalance impact analysis method outperforms the existing ones: it transforms the original imbalanced dataset into a new dataset with an increasing imbalance rate, so that different prediction models can be selected and evaluated on the new dataset. Keywords: Software Fault Prediction Model; Imbalance Problem Classification; Artificial Intelligence; Smart Debugging; Unbalanced Classification
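
As an illustration of the idea of deriving datasets with an increasing imbalance rate, the following is a minimal sketch and not the author's procedure. It assumes that the imbalance rate is the ratio of majority-class to minority-class samples and that a higher rate is obtained by randomly undersampling the minority (defective) class; the abstract states neither. The function names, the toy data, and the chosen rates are hypothetical.

```python
import random

def imbalance_rate(labels):
    """Ratio of majority-class to minority-class samples (binary labels 0/1)."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    minority, majority = sorted((positives, negatives))
    if minority == 0:
        raise ValueError("both classes must be present")
    return majority / minority

def with_imbalance_rate(rows, labels, target_rate, seed=0):
    """Undersample the minority class so that majority/minority is about target_rate.

    Assumption: all majority samples are kept and only minority samples are dropped.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = max(1, round(len(majority) / target_rate))
    if keep > len(minority):
        raise ValueError("target_rate is lower than the original imbalance rate")
    chosen = sorted(majority + rng.sample(minority, keep))
    return [rows[i] for i in chosen], [labels[i] for i in chosen]

# Toy data (hypothetical): 60 non-defective modules (0) and 20 defective modules (1).
rows = [[i] for i in range(80)]
labels = [0] * 60 + [1] * 20

for rate in (3, 6, 12):
    _, new_labels = with_imbalance_rate(rows, labels, rate)
    print(rate, len(new_labels), imbalance_rate(new_labels))
```

Each derived dataset could then be passed to the classifiers under comparison, so that the change in their scores can be tracked as the imbalance rate grows.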

Highlights

Defect prediction is essential to software quality and reliability, and it is one of the major comparative research areas in modern software engineering
The performance analysis compared K-Nearest Neighbor (KNN), Naive Bayes (NB), and Back Propagation (BPN)[7] with Support Vector Machine (SVM) and Extreme Learning Machine (ELM), using PROMISE datasets in the Weka tool on the Windows platform, to evaluate the performance stability of the different prediction models
The study proposed a new model that uses machine learning techniques such as SVM and ELM to classify the imbalanced data in the PROMISE library and to evaluate software defect prediction

Summary

Introduction

Defect prediction is essential to software quality and reliability, and it is one of the major comparative research areas in modern software engineering. With the continuous development of machine learning and data mining, numerous defect prediction models have been introduced for the class imbalance problem. Classifiers are created to eliminate errors and increase accuracy. Classification imbalance has gradually become a dominant research hotspot in software engineering. Unbalanced classification refers to the phenomenon in which the sample sizes of different categories are unevenly distributed. In a binary classification problem, the classification imbalance problem appears when the sample sizes of the two categories differ greatly. Classification imbalance problems are common and need to be addressed to

Journal: Indian Journal of Science and Technology
Publication Date: Feb 15, 2022
Citations: 1
License type: cc-by
