Comparative analysis of resampling algorithms in the prediction of stroke diseases

Dauda Sani Abdullahi, Usman Musa Abdullahi, Dr Muhammad Sirajo Aliyu

doi:10.56919/usci.2123.011

Open Access

https://doi.org/10.56919/usci.2123.011

Journal: UMYU Scientifica
Publication Date: Mar 30, 2023
License type: CC BY-NC 4.0

Affiliation: Federal University Kashere

Abstract

Stroke is a leading cause of death globally. Early prediction of the disease could save many lives, but most clinical datasets, including the stroke dataset, are imbalanced, which biases predictive algorithms towards the majority class. The objective of this research is to compare different data resampling algorithms on the stroke dataset in order to improve the predictive performance of machine learning models. This paper considers five (5) resampling algorithms, namely Random Oversampling (ROS), Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic sampling (ADASYN), and two hybrid techniques, SMOTE with Edited Nearest Neighbor (SMOTE-ENN) and SMOTE with Tomek Links (SMOTE-TOMEK). The resampled data were used to train six (6) machine learning classifiers, namely Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and XGBoost (XGB). The hybrid technique SMOTE-ENN improved the machine learning classifiers the most, followed by SMOTE, while the combination of SMOTE and XGB performed best, with an accuracy of 97.99%, a G-mean score of 0.99, and an AUC-ROC score of 0.99. Resampling algorithms balanced the dataset and enhanced the predictive power of the machine learning algorithms. Therefore, we recommend resampling the stroke dataset when predicting stroke disease rather than modeling on the imbalanced dataset.
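
To illustrate why the abstract reports G-mean alongside accuracy, the following is a minimal sketch in plain Python, not the authors' code. It assumes a toy dataset invented for illustration (8 majority rows, 2 minority rows); the helper names `random_oversample` and `g_mean` are hypothetical. It shows the simplest of the five techniques, random oversampling, and how a classifier that always predicts the majority class can score high accuracy yet a G-mean of zero.

```python
import math
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (recall on the positive class)
    and specificity (recall on the negative class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data (an assumption for illustration): 8 "no stroke", 2 "stroke".
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

# Balance the classes by duplicating minority rows.
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)

# A degenerate classifier that always predicts the majority class.
always_negative = [0] * len(y)
accuracy = sum(t == p for t, p in zip(y, always_negative)) / len(y)
score = g_mean(y, always_negative)
```

On this toy data the majority-only predictor is correct on every majority row and misses every minority row, so its accuracy looks respectable while its G-mean collapses. SMOTE, ADASYN, and the hybrid variants differ from this sketch in that they synthesize new minority samples or clean noisy ones rather than duplicating rows; in practice a library such as imbalanced-learn would typically be used for those.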

Similar Papers


Data oversampling and imbalanced datasets: an investigation of performance for machine learning and feature engineering
Muhammad Mujahid ... Imran Ashraf
Journal of Big Data | VOL. 11
17 Jun 2024

Addressing the Big Data Multi-class Imbalance Problem with Oversampling and Deep Learning Neural Networks
V M González-Barcenas ... R M Valdovinos
01 Jan 2019

Performance Comparison of Data Sampling Techniques to Handle Imbalanced Class on Prediction of Compound-Protein Interaction
Akhmad Rezki Purnajaya ... Medria Kusuma Dewi Hardhienata
Biogenesis: Jurnal Ilmiah Biologi | VOL. 8
30 Jun 2020

A machine learning and explainable artificial intelligence triage-prediction system for COVID-19
Varada Vivek Khanna ... Rajagopala Chadaga P
Decision Analytics Journal | VOL. 7
06 May 2023
