WOTBoost: Weighted Oversampling Technique in Boosting for imbalanced learning

Wenhao Zhang, Arash Naeim, Ramin Ramezani

doi:10.1109/bigdata47090.2019.9006091

Open Access

Publication Date: Dec 1, 2019

Citations: 54

Affiliation: University of California, Los Angeles

Abstract

Machine learning classifiers often stumble over imbalanced datasets, where classes are not equally represented. The resulting bias towards the majority class may lead to low accuracy in labeling the minority class. Imbalanced learning is prevalent in many real-world applications, such as medical research, network intrusion detection, and fraud detection in credit card transactions. Many approaches have been proposed to tackle this challenging problem. For example, Synthetic Minority Over-sampling TEchnique (SMOTE) and ADAptive SYNthetic sampling approach (ADASYN) use oversampling to balance skewed datasets. In this paper, we propose a novel method that combines a Weighted Oversampling Technique with an ensemble Boosting method (WOTBoost) to improve the classification accuracy of minority data without sacrificing the accuracy of the majority class. WOTBoost adjusts its oversampling strategy at each round of boosting to synthesize more targeted minority data samples. The adjustment is enforced using a weighted distribution. We compare WOTBoost extensively with four other classification models (i.e., decision tree, SMOTE + decision tree, ADASYN + decision tree, SMOTEBoost) on 18 publicly accessible imbalanced datasets. WOTBoost achieves the best G-mean on 6 datasets and the highest AUC score on 7 datasets.
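The two ideas the abstract relies on, the G-mean metric and oversampling steered by a weighted distribution, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the algorithm's details, so the function names (g_mean, weighted_oversample), the use of a single nearest neighbor rather than k neighbors, and the assumption that the weights come from a boosting round are all hypothetical choices made for illustration. G-mean is taken in its usual sense, the geometric mean of sensitivity and specificity.

```python
import math
import random


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity (majority recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def weighted_oversample(minority, weights, n_new, rng):
    """SMOTE-style interpolation with seeds drawn in proportion to `weights`.

    Assumption (not from the paper): `weights` holds one non-negative value per
    minority sample, e.g. the boosting weights of the current round, so that
    heavily weighted samples seed more synthetic points.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    seeds = rng.choices(range(len(minority)), weights=weights, k=n_new)
    synthetic = []
    for i in seeds:
        x = minority[i]
        # Simplification: use the single nearest minority neighbor
        # (SMOTE picks at random among the k nearest).
        j = min(
            (k for k in range(len(minority)) if k != i),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(x, minority[k])),
        )
        gap = rng.random()  # position along the segment from x to its neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(x, minority[j])])
    return synthetic


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 0, 0]
    print(g_mean(y_true, [1, 0, 0, 0, 0, 1]))  # sensitivity 0.5, specificity 0.75
    print(g_mean(y_true, [0, 0, 0, 0, 0, 0]))  # always predicting the majority class

    minority = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
    print(weighted_oversample(minority, [0.1, 0.1, 0.8], 4, random.Random(0)))
```

A classifier that always predicts the majority class scores a G-mean of 0 here even though its plain accuracy is 4/6, which is why G-mean is a common choice for evaluating imbalanced learning.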

Similar Papers

Generative Adversarial Neural Networks based Oversampling Technique for Imbalanced Credit Card Dataset
Said El Kafhali ... Mohammed Tayebi
01 Dec 2022

A Novel Weighted Ensemble Method to Overcome the Impact of Under-fitting and Over-fitting on the Classification Accuracy of the Imbalanced Data Sets
Ghulam Fatima ... Sana Saeed
Pakistan Journal of Statistics and Operation Research
03 Jun 2021

Data oversampling and imbalanced datasets: an investigation of performance for machine learning and feature engineering
Muhammad Mujahid ... Imran Ashraf
Journal of Big Data | VOL. 11
17 Jun 2024

Evaluation of Oversampling Strategies in Machine Learning for Space Debris Detection
Mahmoud Khalil ... Panos Liatsis
01 Dec 2019
