Abstract

Heart disease is among the leading causes of death globally. Early identification and treatment are therefore essential. In this work, we propose a framework based on machine learning algorithms that addresses this problem by identifying the risk variables associated with the disease. To support the proposed model, effective data pre-processing and data transformation strategies are used to generate accurate training data from the five most popular heart disease datasets from UCI (Hungarian, Statlog, Switzerland, Long Beach VA, and Cleveland). A univariate feature selection technique is applied to identify essential features, and during the training phase five classifiers are deployed: extreme gradient boosting (XGBoost), support vector machine (SVM), random forest (RF), gradient boosting (GB), and decision tree (DT). Several performance metrics are then measured to assess the predictions of these algorithms. With univariate feature selection included, the DT classifier achieved the highest accuracy among the classifiers, around 97.75%. Thus, we identify a machine learning approach that can predict heart disease with high accuracy. Furthermore, the 10 selected attributes are used to analyze the explainability of the model's outcomes, indicating which attributes contribute most to its predictions.
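To illustrate what univariate feature selection does, the following is a minimal sketch, not the authors' implementation. It scores each feature independently against the class label and keeps the top k. The ANOVA F-score used here is an assumption, since the abstract does not name the scoring function; the helper names (`anova_f_score`, `select_k_best`) and the toy records are hypothetical and are not taken from the UCI datasets.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one feature (assumes >= 2 classes)."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand_mean = mean(values)
    n, k = len(values), len(groups)
    # Variation between class means vs. variation inside each class.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(rows, labels, feature_names, k):
    """Score every feature on its own and keep the k highest-scoring names."""
    scores = {
        name: anova_f_score([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Made-up toy records for illustration only: [age, chol, noise]
FEATURES = ["age", "chol", "noise"]
ROWS = [
    [40, 180, 5], [45, 190, 1], [50, 200, 4],
    [60, 260, 2], [65, 270, 5], [70, 280, 3],
]
LABELS = [0, 0, 0, 1, 1, 1]  # 1 = disease present (hypothetical labels)

selected, scores = select_k_best(ROWS, LABELS, FEATURES, k=2)
print(selected)
```

In practice, a library routine such as scikit-learn's `SelectKBest` is commonly used for this step, and the selected columns would then be passed to a classifier such as a decision tree.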
