Tool to handle imbalancing problem in software defect prediction using oversampling methods

Ruchika Malhotra, Shine Kamal

doi:10.1109/icacci.2017.8125957

Abstract

Data imbalance is becoming a common problem in fields such as defect prediction, change prediction, oil spill detection and medical diagnosis. Various methods have been developed to handle imbalanced datasets in order to improve the accuracy of prediction models. Many studies have addressed defect prediction on imbalanced datasets, but most of them use the SMOTE oversampling method to handle the imbalance. Many other oversampling methods can deal with the imbalance problem and remain unexplored, particularly in the field of software defect prediction. This study develops a tool that implements three of those unexplored oversampling methods, namely ADASYN, SPIDER and Safe-Level-SMOTE. Furthermore, we analyze their performance in comparison with the traditional method, SMOTE. The performance of the oversampling methods is evaluated by applying three machine learning techniques for defect prediction using object-oriented metrics. The results are evaluated on two open source defect datasets. The analysis showed that prediction error decreased and the performance of the machine learning techniques improved when the datasets were balanced using the three oversampling methods. Further, all three methods outperformed SMOTE, and the SPIDER oversampling method performed best in the majority of cases.
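To make the baseline concrete, the following is a minimal sketch of SMOTE-style oversampling, the traditional method the abstract compares against. It is not the authors' tool and does not implement ADASYN, SPIDER or Safe-Level-SMOTE. It is a simplified variant that picks base samples at random rather than iterating over every minority sample, and the function name `smote`, its parameters and the toy metric values are assumptions made for illustration.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by SMOTE-style interpolation.

    Simplified illustration: each synthetic row lies on the line segment
    between a randomly chosen minority row and one of its k nearest
    minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        neighbour = minority[rng.choice(others[:k])]
        # gap in [0, 1) places the new row between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: each row is a defective module described by two
# object-oriented metric values (invented numbers, not from the study).
defective = [[10.0, 3.0], [12.0, 4.0], [11.0, 5.0], [14.0, 3.5]]
new_rows = smote(defective, n_new=6, k=2)
balanced_minority = defective + new_rows
print(len(balanced_minority))  # 4 original rows + 6 synthetic rows = 10
```

Because every synthetic row is an interpolation between two existing minority rows, the new rows stay inside the range of the observed metric values. Methods such as ADASYN and Safe-Level-SMOTE differ mainly in how they choose where to generate samples, which this sketch does not attempt to show.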

