Abstract

In pattern classification, the class imbalance problem arises when the number of observations in some classes differs significantly from that in others, which biases the learning of classifiers. One possible solution is to re-balance the training set by over-sampling the minority class. However, over-sampling tends to push the classification boundary toward the majority class, so recall increases while precision decreases. To avoid this situation and better handle class imbalance, this paper proposes a new over-sampling method, Subspace-based Minority Over-Sampling (SMO). The approach assumes that each category of samples is formed by common and unique characteristics, and that these characteristics can be extracted in a subspace. To obtain balanced data, the common part is over-sampled to depict the minority class more accurately, and the unique part is expanded by generative methods. The balanced data are then obtained by restoring the generated subspace products to the original space. Experimental results demonstrate that SMO can model complex data distributions and outperforms both classical and recently proposed over-sampling algorithms. In addition, SMO can generate simple images: its generation results on MNIST are clearly identifiable by both human and machine vision.
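
The sketch below illustrates the general idea described in the abstract (project minority samples into a subspace, synthesize new points there, and restore them to the original space). It is not the authors' SMO implementation: the function name `subspace_minority_oversample`, the use of PCA as the subspace, and the Gaussian jitter used as the generative step are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

def subspace_minority_oversample(X, y, minority_label, n_components=5, random_state=0):
    """Illustrative subspace-based over-sampling (NOT the paper's SMO algorithm).

    Sketch of the idea from the abstract: extract a subspace, generate new
    minority samples inside it, then map them back to the original space.
    """
    rng = np.random.default_rng(random_state)
    X_min = X[y == minority_label]
    n_new = int(np.sum(y != minority_label)) - len(X_min)  # samples needed for balance
    if n_new <= 0:
        return X, y

    # Fit a low-dimensional subspace on the minority class
    # (a simple stand-in for the paper's common/unique decomposition).
    pca = PCA(n_components=min(n_components, X_min.shape[1], len(X_min)))
    Z_min = pca.fit_transform(X_min)

    # Generate new subspace points by jittering existing projections
    # (a crude generative stand-in; the paper uses generative methods).
    idx = rng.integers(0, len(Z_min), size=n_new)
    noise = rng.normal(scale=0.05 * Z_min.std(axis=0), size=(n_new, Z_min.shape[1]))
    Z_new = Z_min[idx] + noise

    # Restore the generated subspace points to the original feature space.
    X_new = pca.inverse_transform(Z_new)

    X_bal = np.vstack([X, X_new])
    y_bal = np.concatenate([y, np.full(n_new, minority_label)])
    return X_bal, y_bal
```

Used as a drop-in preprocessing step, such a routine would return a re-balanced `(X_bal, y_bal)` that can be fed to any standard classifier.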
