Classification of imbalanced medical data: An empirical study of machine learning approaches

Shikha Mundra, Ankit Mundra, Abha Kiran Rajpoot, Mandeep Kaur, Supriya Khaitan, Shounak Vijay, Mayank Kumar Goyal, Punit Gupta

doi:10.3233/jifs-219294

Abstract

The health of thousands of patients around the world is affected by factors such as age, body mass index, cholesterol level, albumin level, and several others. Predicting health outcomes from these factors in time can serve as an early warning. Recent growth in machine learning algorithms inspired us to build a predictive model for better healthcare facilities. Our work focuses on the problem of noisy and imbalanced datasets, in which the majority class is favored over the minority class, leading to false predictions. We experimented with two publicly available imbalanced medical datasets of different sizes, both with a binary class label: MIT’s GOSSIS death dataset and the PIMA Indians Diabetes Dataset. We investigated three oversampling techniques (Synthetic Minority Oversampler, Random Oversampler, and Adaptive Synthetic Sampler) and two undersampling techniques (Random Undersampler and Near Miss), each paired with three data reduction and cleaning methods: Tomek Links, One Sided Selection, and Edited Nearest Neighbors. We found that the combination of Adaptive Synthetic Sampler with One Sided Selection performs better on the large dataset, while the combination of Random Oversampler with Tomek Links performs better on the small dataset. We also observed that oversampling gives quite promising results compared with undersampling, specifically when applied with machine learning classifiers, as these classifiers are data-hungry algorithms.
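To make the pairing of a resampler with a cleaning method concrete, here is a minimal pure-Python sketch of one combination named in the abstract, Random Oversampler followed by Tomek Links. It is not the authors' implementation: the toy data, the function names, the order of the two steps, and the choice to drop only the majority-class member of each Tomek link are all assumptions made for illustration, and it assumes a binary label.

```python
import random
from collections import Counter

# Hypothetical toy data: label 0 is the majority class, label 1 the minority.
# The point (5.1, 5.0) is a noisy majority sample sitting next to the minority.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (2, 1), (5.1, 5.0), (5, 5), (6, 5)]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1]


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_neighbor(i, X):
    """Index of the closest other point to X[i] (ties go to the lowest index)."""
    others = (j for j in range(len(X)) if j != i)
    return min(others, key=lambda j: squared_distance(X[i], X[j]))


def remove_tomek_links(X, y, majority_label):
    """Drop the majority-class member of every Tomek link.

    A Tomek link is a pair of points with different labels that are
    each other's nearest neighbor.
    """
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    drop = set()
    for i, j in enumerate(nn):
        if nn[j] == i and y[i] != y[j]:
            drop.add(i if y[i] == majority_label else j)
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


# Assumed pipeline order: balance the classes first, then clean the boundary.
X_bal, y_bal = random_oversample(X, y, seed=0)
X_clean, y_clean = remove_tomek_links(X_bal, y_bal, majority_label=0)

print("original class counts:   ", dict(Counter(y)))
print("after oversampling:      ", dict(Counter(y_bal)))
print("after Tomek link removal:", dict(Counter(y_clean)))
```

Note that duplicated minority points become each other's nearest neighbors, so oversampling first can hide a Tomek link that would be detected on the raw data. In practice a library such as imbalanced-learn, which provides `RandomOverSampler` and `TomekLinks`, would likely replace these hand-written helpers; the abstract does not say which tooling the authors used.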


Journal: Journal of Intelligent & Fuzzy Systems	Publication Date: Jun 9, 2022
Citations: 5

Similar Papers

Hybrid Data-Level Techniques for Class Imbalance Problem
Anjana Gosain ... Deepika Singh
02 Aug 2020

Improved Hybrid Bag-Boost Ensemble With K-Means-SMOTE–ENN Technique for Handling Noisy Class Imbalanced Data
Arjun Puri ... Manoj Kumar Gupta
The Computer Journal | VOL. 65
06 May 2021

Comparative analysis of resampling algorithms in the prediction of stroke diseases
Dauda Sani Abdullahi ... Dr Muhammad Sirajo Aliyu
UMYU Scientifica | VOL. 2
30 Mar 2023

Enhancing Machine Learning Models Through PCA, SMOTE-ENN, and Stochastic Weighted Averaging
Youngjin Han ... Inwhee Joe
Applied Sciences | VOL. 14
25 Oct 2024
