Abstract

Heart disease is one of the leading causes of death worldwide. Biomedical instruments and hospital information systems hold massive quantities of clinical data, so understanding the data related to heart disease is important for improving prediction accuracy. This article experimentally evaluates the performance of models built with classification algorithms on relevant features chosen by various feature selection approaches. Ten feature selection techniques (ANOVA, Chi-square, mutual information, ReliefF, forward feature selection, backward feature selection, exhaustive feature selection, recursive feature elimination, Lasso regression, and Ridge regression) and six classification approaches (decision tree, random forest, support vector machine, K-nearest neighbor, logistic regression, and Gaussian naive Bayes) were applied to the Cleveland heart disease dataset. The feature subset selected by backward feature selection achieved the highest classification accuracy of 88.52%, with a precision of 91.30%, sensitivity of 80.76%, and F-measure of 85.71%, using the decision tree classifier.
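The four measures reported above follow from the standard confusion-matrix definitions, which can be sketched in a few lines of Python. The function name and the counts below are hypothetical: the source does not give a confusion matrix, and the counts were chosen only so that the formulas yield values close to the reported ones (sensitivity rounds to 80.77% here, while the source reports 80.76%).

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, sensitivity (recall), and F-measure
    from binary confusion-matrix counts (standard definitions)."""
    total = tp + fp + fn + tn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("counts must give non-zero denominators")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only (not taken from the article).
metrics = classification_metrics(tp=21, fp=2, fn=5, tn=33)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Because the F-measure is the harmonic mean of precision and sensitivity, it sits between the two and is pulled toward the lower one, which is why the reported F-measure falls closer to the sensitivity than to the precision.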

Highlights

  • As the information era advances, computer-aided systems generate massive amounts of raw data, which has become a new center of power

  • We develop many predictive models, including feature selection, and evaluate them against various performance measures, including accuracy, precision, and recall, to identify the most successful ones that might be utilized for heart disease prediction and benefit the medical community

  • Python was used as the programming language in this comparative analysis to build the analytical model in a Jupyter (Anaconda) Notebook. This benefits dataset exploration and allows effective pattern identification


Summary

Introduction

As the information era advances, computer-aided systems generate massive amounts of raw data, which has become a new center of power. Data mining methods make it possible to determine efficiently, at an early stage, whether patients are at increased risk of heart disease, and to reduce the costs of diagnosis and treatment. In this respect, researchers have investigated feature selection approaches and various classifiers on several heart disease datasets, including the Statlog, Cleveland, Hungary, VA Long Beach, and Switzerland datasets from the UCI Machine Learning Repository, as well as the Z-Alizadeh Sani datasets. The purpose of this study is to determine the effect of several feature selection algorithms, classified as filter, wrapper, and embedded techniques, on improving the prediction of heart disease. We develop many predictive models, including feature selection, and evaluate them against various performance measures, including accuracy, precision, and recall, to identify the most successful ones that might be utilized for heart disease prediction and benefit the medical community.
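To make the wrapper category concrete, here is a minimal sketch of greedy backward feature selection in plain Python. It is not the authors' implementation: the function name, the `score` callback, and the toy weights are all assumptions made for illustration. In a study like this one, `score` would typically be something like the cross-validated accuracy of a classifier trained on the candidate subset.

```python
from typing import Callable, List, Sequence


def backward_feature_selection(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    n_keep: int,
) -> List[str]:
    """Greedy backward elimination: start from all features and repeatedly
    drop the one whose removal leaves the best-scoring subset."""
    if not 1 <= n_keep <= len(features):
        raise ValueError("n_keep must be between 1 and len(features)")
    selected = list(features)
    while len(selected) > n_keep:
        # Build every subset obtained by removing exactly one feature,
        # then keep the subset with the highest score.
        candidates = [[f for f in selected if f != drop] for drop in selected]
        selected = max(candidates, key=score)
    return selected


# Toy scoring function (hypothetical): each feature has a made-up usefulness
# weight, and a subset's score is the sum of its weights. A real wrapper
# method would train and evaluate a classifier here instead.
toy_weights = {"F1": 0.10, "F2": 0.40, "F3": 0.05, "F4": 0.30, "F5": 0.20}


def toy_score(subset: List[str]) -> float:
    return sum(toy_weights[f] for f in subset)


kept = backward_feature_selection(list(toy_weights), toy_score, n_keep=3)
print(kept)
```

Each round evaluates one candidate subset per remaining feature, so the cost grows roughly quadratically with the number of features. Filter methods avoid this cost by scoring features without training a model. For a library version, scikit-learn provides `SequentialFeatureSelector` with `direction="backward"`; the source does not say which implementation was used.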

Literature Survey
Embedded Methods
Results and Discussion
Conclusion and Future
