An Undersampling Method Approaching the Ideal Classification Boundary for Imbalance Problems

Wensheng Zhou, Chen Liu, Lei Jiang, Peng Yuan

doi:10.3390/app14135421

Abstract

Data imbalance is common in practical machine learning classification tasks and, if not handled properly, can bias classification results towards the majority class. Undersampling in the borderline area is an effective remedy; however, it is difficult to identify an area that fits the classification boundary. In this paper, we present a novel undersampling framework that first clusters the majority-class samples and then segments the boundary area according to the resulting clusters. Random sampling within the borderline area of these segments yields a shape that better fits the classification boundary. In addition, we hypothesize that, when ensemble learning combines multiple classifiers obtained via sampling, there exists an optimal number of classifiers to integrate. After this hypothesis passes a hypothesis test, we apply the improved ensemble algorithm to the newly developed method. The experimental results show that the proposed method works well.
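
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea (cluster the majority class, then sample randomly near the boundary within each cluster), not the authors' method. The function names `kmeans` and `borderline_undersample`, the parameters `k` and `border_frac`, the use of distance to the nearest minority sample as the borderline criterion, and the size-proportional quota per cluster are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns a cluster index for every point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def borderline_undersample(majority, minority, k=3, border_frac=0.5, seed=0):
    """Hypothetical sketch: cluster the majority class, then sample randomly
    from the part of each cluster that lies closest to the minority class."""
    rng = random.Random(seed)
    labels = kmeans(majority, k, rng)
    target = len(minority)  # assumption: aim for a roughly balanced result
    kept = []
    for c in range(k):
        cluster = [p for p, l in zip(majority, labels) if l == c]
        if not cluster:
            continue
        # Assumed borderline criterion: distance to the nearest minority sample.
        cluster.sort(key=lambda p: min(math.dist(p, q) for q in minority))
        border = cluster[: max(1, int(len(cluster) * border_frac))]
        # Assumed quota: each cluster contributes in proportion to its size.
        quota = max(1, round(target * len(cluster) / len(majority)))
        kept.extend(rng.sample(border, min(quota, len(border))))
    return kept
```

Calling `borderline_undersample(majority, minority)` with two lists of coordinate tuples returns a reduced list of majority samples. Repeating the call with different `seed` values would give the multiple sampled training sets that an ensemble of classifiers could be trained on; how many classifiers to combine is the question the abstract's hypothesis addresses, and this sketch does not attempt to answer it.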

Journal: Applied Sciences
Publication Date: Jun 22, 2024
License type: CC BY 4.0

Similar Papers

Ensemble Learning Based on Active Example Selection for Solving Imbalanced Data Problem in Biomedical Data
Min Su Lee ... Sangyoon Oh
01 Nov 2009

Ensemble vs. Data Sampling: Which Option Is Best Suited to Improve Classification Performance of Imbalanced Bioinformatics Data?
Taghi M Khoshgoftaar ... Amri Napolitano
01 Nov 2015

Imbalanced Deep Learning by Minority Class Incremental Rectification
Qi Dong ... Xiatian Zhu
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 41
03 May 2018

CHARACTERIZATION OF MORTALITY PREDICTION: AN ENSEMBLE LEARNING ANALYSIS USING THE MIMIC-III DATASET
Anıl Burcu Özyurt Seri̇m
Journal of Scientific Reports-A
30 Sep 2023
