A predictive analysis framework of heart disease using machine learning approaches

Shourav Molla,Md Ariful Islam Arif,Imran Mahmud,Shahed Hossain,F M Javed Mehedi Shamrat,Umme Umaima,Raisul Islam Rafi

doi:10.11591/eei.v11i5.3942

Abstract

Heart diseaseis among the leading causes for death globally. Thus, early identification and treatment are indispensable to prevent the disease. In this work, we propose a framework based on machine learning algorithms to tackle such problems through the identification of risk variables associated to this disease. To ensure the success of our proposed model, influential data pre-processing and data transformation strategies are used to generate accurate data for the training model that utilizes the five most popular datasets (Hungarian, Stat log, Switzerland, Long Beach VA, and Cleveland) from UCI. The univariate feature selection technique is applied to identify essential features and during the training phase, classifiers, namely extreme gradient boosting (XGBoost), support vector machine (SVM), random forest (RF), gradient boosting (GB), and decision tree (DT), are deployed. Subsequently, various performance evaluations are measured to demonstrate accurate predictions using the introduced algorithms. The inclusion of Univariate results indicated that the DT classifier achieves a comparatively higher accuracy of around 97.75% than others. Thus, a machine learning approach is recognize, that can predict heart disease with high accuracy. Furthermore, the 10 attributes chosen are used to analyze the model's outcomes explainability, indicating which attributes are more significant in the model's outcome.

Full Text