Abstract
Heart disease is recognized as one of the leading causes of death worldwide. Biomedical instruments and hospital information systems hold massive quantities of clinical data, so understanding the data related to heart disease is essential for improving prediction accuracy. This article presents an experimental evaluation of models built with classification algorithms on relevant features selected by various feature selection approaches. Ten feature selection techniques (ANOVA, Chi-square, mutual information, ReliefF, forward feature selection, backward feature selection, exhaustive feature selection, recursive feature elimination, Lasso regression, and Ridge regression) and six classification approaches (decision tree, random forest, support vector machine, K-nearest neighbor, logistic regression, and Gaussian naive Bayes) were applied to the Cleveland heart disease dataset. The feature subset selected by the backward feature selection technique achieved the highest classification accuracy of 88.52%, with a precision of 91.30%, sensitivity of 80.76%, and F-measure of 85.71%, using the decision tree classifier.
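The four reported measures are all derived from the same confusion matrix, and a short sketch can make the relationship concrete. The counts below (21 true positives, 2 false positives, 5 false negatives, 33 true negatives) are an assumption: they are not given in the abstract and were chosen only because they reproduce the reported percentages. The function name `classification_metrics` is likewise illustrative.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, sensitivity (recall), and F-measure
    from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


# ASSUMPTION: hypothetical counts, not taken from the article. They are
# merely consistent with the percentages reported in the abstract.
metrics = classification_metrics(tp=21, fp=2, fn=5, tn=33)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts, sensitivity prints as 80.77% under standard rounding, whereas the abstract reports 80.76%, which would be consistent with truncation rather than rounding.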
Highlights
With the advance of the information era, computer-aided systems generate massive amounts of raw data, making data a new center of power
We develop many predictive models, including feature selection, and evaluate them against various performance measures, including accuracy, precision, and recall, to identify the most successful ones that might be utilized for heart disease prediction and benefit the medical community
Python was used as the programming language in this comparative analysis to build the analytical model in a Jupyter Notebook (Anaconda). This benefits dataset exploration and allows effective pattern identification
Summary
With the advance of the information era, computer-aided systems generate massive amounts of raw data, making data a new center of power. Data mining methods make it possible to determine efficiently, at an early stage, whether patients are at increased risk of heart disease, and they help reduce the costs of diagnosis and treatment. In this respect, researchers have investigated feature selection approaches and various classifiers on several heart disease datasets, including the Statlog, Cleveland, Hungary, VA Long Beach, and Switzerland datasets from the UCI Machine Learning Repository, as well as the Z-Alizadeh Sani dataset. The purpose of this study is to determine the effect of several feature selection algorithms, classified as filter, wrapper, and embedded techniques, on improving the prediction of heart disease. We develop many predictive models, including feature selection, and evaluate them against various performance measures, including accuracy, precision, and recall, to identify the most successful ones that might be utilized for heart disease prediction and benefit the medical community.
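As an illustration of the wrapper family, the following is a minimal sketch of greedy backward feature selection. It is not the article's implementation: the function `backward_feature_selection`, the feature names `f0` to `f5`, and the scoring function `toy_score` are all hypothetical. In a real experiment the score would be something like the cross-validated accuracy of a classifier trained on the candidate subset.

```python
from typing import Callable, List, Sequence, Tuple


def backward_feature_selection(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    min_features: int = 1,
) -> Tuple[List[str], float]:
    """Greedy backward elimination.

    Start from the full feature set, repeatedly drop the single feature
    whose removal gives the highest score, and return the best-scoring
    subset seen along the way (ties favor the smaller subset).
    """
    current = list(features)
    best_subset, best_score = list(current), score(current)

    while len(current) > min_features:
        # Evaluate every subset obtained by removing exactly one feature.
        candidates = [[f for f in current if f != drop] for drop in current]
        scored = [(score(subset), subset) for subset in candidates]
        top_score, top_subset = max(scored, key=lambda pair: pair[0])

        current = top_subset
        if top_score >= best_score:
            best_subset, best_score = list(top_subset), top_score

    return best_subset, best_score


# ASSUMPTION: a toy stand-in for model accuracy. Three made-up
# "informative" features raise the score; every other feature lowers it.
INFORMATIVE = {"f0", "f2", "f5"}


def toy_score(subset: List[str]) -> float:
    useful = sum(1 for f in subset if f in INFORMATIVE)
    noise = len(subset) - useful
    return 0.5 + 0.1 * useful - 0.02 * noise


selected, selected_score = backward_feature_selection(
    ["f0", "f1", "f2", "f3", "f4", "f5"], toy_score
)
print(selected, round(selected_score, 2))
```

Because each step retrains and scores one model per remaining feature, wrapper methods of this kind are more expensive than filter methods such as ANOVA or Chi-square, which rank features without fitting the classifier. In practice a library routine, for example scikit-learn's `SequentialFeatureSelector` with `direction="backward"`, would typically replace this hand-written loop.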
More From: Applied Computational Intelligence and Soft Computing