Minority-prediction-probability-based oversampling technique for imbalanced learning

Zhen Wei, Li Zhang, Lei Zhao

doi:10.1016/j.ins.2022.11.148

Abstract

In this study, we propose an oversampling method for imbalanced learning called the minority-predictive-probability-based synthetic minority oversampling technique (MPP-SMOTE). First, MPP-SMOTE removes noisy samples from the minority class. It then divides the remaining minority samples into two types, hard-to-learn and easy-to-learn, according to each sample's predicted probability of belonging to the minority class. The two types are handled separately in a divide-and-conquer strategy: for each type, we calculate the probability that a sample is selected as the basis for a new synthetic sample. For hard-to-learn samples, the selection probability takes into account the relative density of the sample in both the majority and minority classes; for easy-to-learn samples, it takes into account the relative density in the minority class only. Finally, according to the types and selection probabilities, MPP-SMOTE selects samples and generates synthetic samples from them, using a different sample-generation scheme for each type. Experimental results show that the proposed method outperforms other oversampling methods on three imbalanced-learning metrics with three common classifiers. In particular, when a support vector machine classifier is applied, MPP-SMOTE improves the area under the curve (AUC) by 1.44%.
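
The pipeline described in the abstract (noise removal, a hard/easy split by predicted minority probability, weighted seed selection, synthetic generation) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not give the probability estimator, the relative-density formulas, or the two generation schemes, so every concrete choice below is an assumption. The function names, the k-nearest-neighbour probability estimate, the 0.5 threshold, the placeholder selection weights, and the plain SMOTE-style interpolation toward a random other minority sample are all hypothetical stand-ins.

```python
import numpy as np

def minority_probability(X, y, k=5, minority_label=1):
    """Assumed estimator: fraction of a minority sample's k nearest
    neighbours (self excluded) that are also minority."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    min_idx = np.flatnonzero(y == minority_label)
    # Distances from every minority sample to every sample.
    dist = np.linalg.norm(X[min_idx, None, :] - X[None, :, :], axis=2)
    dist[np.arange(len(min_idx)), min_idx] = np.inf  # exclude self
    nn = np.argsort(dist, axis=1)[:, :k]
    return min_idx, (y[nn] == minority_label).mean(axis=1)

def oversample_sketch(X, y, n_new, k=5, threshold=0.5,
                      minority_label=1, seed=0):
    """Return (synthetic samples, kept minority indices, hard-to-learn mask)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    min_idx, prob = minority_probability(X, y, k, minority_label)

    # Assumed noise rule: drop minority samples with no minority neighbours.
    keep = prob > 0.0
    min_idx, prob = min_idx[keep], prob[keep]
    if len(min_idx) < 2:
        raise ValueError("need at least two minority samples after noise removal")

    # Assumed split: low predicted minority probability means hard-to-learn.
    hard = prob < threshold

    # Placeholder selection weights (NOT the paper's relative-density formulas):
    # samples with lower minority probability are picked more often.
    weights = (1.0 - prob) + 0.1
    weights /= weights.sum()

    # Pick seed samples, then a different kept minority sample as partner.
    seeds = rng.choice(len(min_idx), size=n_new, p=weights)
    offsets = rng.integers(1, len(min_idx), size=n_new)
    partners = (seeds + offsets) % len(min_idx)

    # SMOTE-style interpolation on the segment between seed and partner.
    P = X[min_idx]
    gap = rng.random((n_new, 1))
    synthetic = P[seeds] + gap * (P[partners] - P[seeds])
    return synthetic, min_idx, hard
```

Because each synthetic point is a convex combination of two retained minority samples, it always lies inside their bounding box. A faithful implementation would replace the placeholder weights with the density-based selection probabilities and use separate generation schemes for the two sample types, as the abstract describes.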

Journal: Information Sciences
Publication Date: Dec 6, 2022
Citations: 12

Similar Papers

DOSS: Dual Over Sampling Strategy for Imbalanced Data Classification
Qiushi Wang ... Jihoon Hong
01 Oct 2018

LAD-SMOTE: A New Oversampling Method Based on Locally Adaptive Distance
Haoyang Wang ... He Huang
01 Nov 2018

LVQ-SMOTE: Learning Vector Quantization based Synthetic Minority Over-sampling Technique for biomedical data
Munehiro Nakamura ... Yusuke Kajiwara
BioData Mining | VOL. 6
02 Oct 2013

CDBH: A clustering and density-based hybrid approach for imbalanced data classification
Behzad Mirzaei ... Hossein Nezamabadi-Pour
Expert Systems with Applications | VOL. 164
28 Sep 2020
