An extension of Synthetic Minority Oversampling Technique based on Kalman filter for imbalanced datasets

Thejas G.S, Yashas Hariprasad, S.S Iyengar, N.R Sunitha, Prajwal Badrinath, Shasank Chennupati

doi:10.1016/j.mlwa.2022.100267

Abstract

More often than not, data collected in real time is imbalanced, i.e., the samples belonging to one class significantly outnumber those of the others. This imbalance degrades the performance of the predictor. One of the most notable algorithms for handling such imbalance, by fabricating synthetic data, is the “Synthetic Minority Oversampling Technique (SMOTE)”. However, data imbalance is not solely responsible for poor classifier performance. Some research works have demonstrated that noisy samples can contribute significantly to misclassification. Also, handling large data is computationally expensive, so data reduction is imperative. In this work, we put forth a novel extension of SMOTE that integrates it with the Kalman filter. The proposed method, Kalman-SMOTE (KSMOTE), filters out the noisy samples from the dataset obtained after SMOTE, which includes both the raw data and the synthetically generated samples, thereby reducing the size of the dataset. Our model is validated on a wide range of datasets. An experimental analysis of the results shows that our model outperforms the presently available techniques.
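The SMOTE step mentioned in the abstract can be illustrated with a minimal sketch of the standard interpolation idea: pick a minority sample, pick one of its nearest minority neighbours, and place a synthetic point at a random position on the segment between them. The function name `smote`, its parameters, and the toy data are assumptions made for illustration and are not taken from the paper. The Kalman filtering stage of KSMOTE is not specified in the abstract, so it is not sketched here.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Illustrative sketch of plain SMOTE only; names and defaults are hypothetical.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Sort the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class with two features (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class. That same property means noisy minority samples can spawn noisy synthetic ones, which is the motivation the abstract gives for a filtering step after SMOTE.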

Journal: Machine Learning with Applications
Publication Date: Jan 31, 2022
Citations: 2
License type: cc-by

Similar Papers

SMOTE-LOF for noise identification in imbalanced data classification
Asniar ... Kridanto Surendro
Journal of King Saud University - Computer and Information Sciences | VOL. 34
09 Feb 2021

SMOTE-kTLNN: A hybrid re-sampling method based on SMOTE and a two-layer nearest neighbor classifier
Pengfei Sun ... Zhaohui Xu
Expert Systems with Applications | VOL. 238
29 Sep 2023

Automated semiconductor wafer defect classification dealing with imbalanced data
Po-Hsuan Lee ... Zhe Wang
20 Mar 2020

SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary
Alberto Fernandez ... Salvador Garcia
Journal of Artificial Intelligence Research | VOL. 61
20 Apr 2018
