The Comparison of Imbalanced Data Handling Method in Software Defect Prediction

Khadijah Khadijah, Priyo Sidik Sasongko

doi:10.22219/kinetik.v5i3.1049

Abstract

Software testing is a crucial process in the software development life cycle that affects software quality. However, testing is tedious and resource-consuming. It can be conducted more efficiently by focusing on the software modules that are prone to defects; therefore, automated software defect prediction is needed. This research implemented the Extreme Learning Machine (ELM) as the classification algorithm because of its simple training process and good generalization performance. Aside from the classification algorithm, the most important problem to be addressed is the imbalance between samples of the positive class (prone to defects) and the negative class. Such imbalance can bias the performance of the classifier. Therefore, this research compared two approaches to handling the imbalance problem: SMOTE (a resampling method) and weighted-ELM (an algorithm-level method). The results of experiments using 10-fold cross-validation on the NASA MDP dataset show that including imbalance handling when building a software defect prediction model increases the specificity and g-mean of the model. When the imbalance ratio is not very small, SMOTE is better than weighted-ELM. Otherwise, weighted-ELM is better than SMOTE in terms of sensitivity and g-mean, but worse in terms of specificity and accuracy.
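To make the algorithm-level approach and the reported metrics concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single sigmoid hidden layer with random weights, a ridge-regularized closed-form solution for the output weights, and a per-sample weight equal to one over the size of the sample's class; the abstract does not specify any of these details. The function names (`hidden_layer`, `train_elm`, `predict_elm`, `sensitivity_specificity_gmean`) and the parameters `n_hidden` and `C` are hypothetical. The g-mean is computed with its usual definition, the square root of sensitivity times specificity.

```python
import numpy as np


def hidden_layer(X, A, b):
    """Sigmoid outputs of a random hidden layer that is never trained."""
    return 1.0 / (1.0 + np.exp(-(X @ A + b)))


def train_elm(X, y, n_hidden=20, C=1.0, weighted=True, seed=0):
    """Fit ELM output weights in closed form; y holds integer labels 0 or 1.

    Assumed weighting scheme: each sample gets weight 1 / (size of its class),
    so the minority (defect-prone) class is not swamped by the majority class.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = hidden_layer(X, A, b)                    # shape (n_samples, n_hidden)
    t = np.where(y == 1, 1.0, -1.0)              # targets in {-1, +1}

    if weighted:
        class_counts = np.bincount(y, minlength=2)
        w = 1.0 / class_counts[y]
    else:
        w = np.ones(len(y))                      # plain (unweighted) ELM

    HtW = H.T * w  # equals H^T W for a diagonal W, without building W
    # Solve (I / C + H^T W H) beta = H^T W t for the output weights beta.
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ t)
    return A, b, beta


def predict_elm(model, X):
    """Return 1 (defect-prone) where the output score is positive, else 0."""
    A, b, beta = model
    return (hidden_layer(X, A, b) @ beta > 0).astype(int)


def sensitivity_specificity_gmean(y_true, y_pred):
    """Compute the three metrics; assumes both classes occur in y_true."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, np.sqrt(sensitivity * specificity)
```

Setting `weighted=False` gives an unweighted baseline under the same assumptions, which is the kind of comparison the abstract describes. Because the g-mean multiplies the two class-wise rates, a model that ignores the minority class scores near zero even when its accuracy is high.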


Journal: Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control
Publication Date: Aug 15, 2020
Citations: 1
License type: CC BY-NC 4.0

Similar Papers

WR-ELM: Weighted Regularization Extreme Learning Machine for Imbalance Learning in Software Fault Prediction
Pravas Ranjan Bal ... Sandeep Kumar
IEEE Transactions on Reliability | VOL. 69
15 Jun 2020

Software defect prediction: A multi-criteria decision-making approach
A.O Balogun ... A.O Bajeh
Nigerian Journal of Technological Research | VOL. 15
30 Apr 2020

Unbalanced data processing for software defect prediction
Yang Qu ... Zhenming Li
28 Oct 2022

Software Defect Prediction: An ML Approach-Based Comprehensive Study
Kunal Anand ... Ajay Kumar Jena
28 Oct 2022
