Improving the Prediction of Heart Failure Patients’ Survival Using SMOTE and Effective Data Mining Techniques

Abid Ishaq, Muhammad Umer, Seyedali Mirjalili, Saleem Ullah, Saima Sadiq, Vaibhav Rupapara, Michele Nappi

doi:10.1109/access.2021.3064084

Abstract

Cardiovascular disease is a major cause of mortality and morbidity worldwide. In clinical data analytics, predicting the survival of heart disease patients is a major challenge. Data mining transforms the large amounts of raw data generated by the health industry into useful information that supports informed decisions. Various studies have shown that significant features play a key role in improving the performance of machine learning models. This study analyzes heart failure survival using a dataset of 299 patients admitted to hospital. The aim is to find significant features and effective data mining techniques that can boost the accuracy of survival prediction for cardiovascular patients. To predict patient survival, this study employs nine classification models: Decision Tree (DT), Adaptive Boosting classifier (AdaBoost), Logistic Regression (LR), Stochastic Gradient Descent classifier (SGD), Random Forest (RF), Gradient Boosting classifier (GBM), Extra Tree Classifier (ETC), Gaussian Naive Bayes classifier (G-NB) and Support Vector Machine (SVM). The class imbalance problem is handled with the Synthetic Minority Oversampling Technique (SMOTE). Furthermore, the machine learning models are trained on the highest-ranked features selected by RF. The results are compared with those obtained by the same algorithms using the full set of features. Experimental results demonstrate that ETC outperforms the other models, achieving an accuracy of 0.9262 with SMOTE in predicting heart patients’ survival.

Highlights

According to the WHO, heart diseases are a leading cause of death worldwide [1].
One group of authors used the bagging C4.5 ensemble learning approach for cardiovascular disease (CVD) prediction, achieving 68.96% accuracy for the diagnosis of stenosis in the Right Coronary Artery (RCA), 61.46% in the Left Circumflex (LCX), and 79.54% in the Left Anterior Descending (LAD) artery. Another group of researchers improved on these results with a Support Vector Machine (SVM) model, achieving 80.50% accuracy for RCA, 86.14% for LAD and 83.17% for LCX [32].
Results showed that tree-based algorithms outperformed the other models when trained on nine features identified by Random Forest (RF), with class imbalance handled by the Synthetic Minority Oversampling Technique (SMOTE).

Summary

INTRODUCTION

According to the WHO, heart diseases are a leading cause of death worldwide [1]. Cardiovascular disease (CVD) is difficult to identify because many factors contribute to it, such as high blood pressure, cholesterol level, diabetes, and abnormal pulse rate [2]. Different classification algorithms are used to predict CVD in patients and to predict death due to heart attack [14]. Even though the aforementioned researchers showed interesting results by applying standard statistical techniques, such methods are inefficient for large-scale datasets, leaving room for other machine learning algorithms. This motivated our attempt to help healthcare professionals by developing machine learning techniques for predicting the survival of CVD patients. The performance of tree-based, regression-based, and statistical-based models is compared, using SMOTE, in predicting the survival of heart patients.
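To make the oversampling step concrete, the following is a minimal pure-Python sketch of the core idea behind SMOTE: each synthetic minority sample is a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is an illustration only, not the authors' implementation. The function names `smote` and `balance`, the default `k=5`, the binary-label assumption, and the toy data are all assumptions made for this example; in practice a maintained library implementation would normally be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    For each synthetic point: pick a minority sample, pick one of its k
    nearest minority neighbours, and take a random point on the segment
    between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Other minority samples, sorted by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes have equal size.

    Assumes a binary problem (hypothetical helper for this sketch).
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (made up for illustration): four majority and two minority samples.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [5.0, 5.0], [5.5, 5.2]]
y = [0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X, y, minority_label=1)
print(y_bal.count(0), y_bal.count(1))  # 4 4
```

In a typical workflow, oversampling of this kind is applied to the training split only, so that synthetic points do not leak into the data used for evaluation.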

RELATED WORK

ANALYSIS AND DISCUSSION OF RESULTS

EXPERIMENTAL DESIGN

Findings

CONCLUSION


Journal: IEEE Access
Publication Date: Jan 1, 2021
Citations: 257
License type: CC BY 4.0
