Abstract

Heart disease is among the deadliest chronic diseases, and diagnosing it requires physicians to evaluate a variety of clinical and metabolic symptoms before making life-saving decisions. An intelligent computer-aided heart disease prediction method can serve as a decision support system for physicians, in contrast to imprecise, stand-alone manual identification. The aim of this study is therefore to propose an ensemble feature selection and machine learning-based approach to predicting heart disease from patients’ clinical features. For this purpose, the Cleveland heart disease dataset is visualized and pre-processed. Four feature selection approaches (Pearson, PCA, Chi-2, and RFE) are combined through an ensemble methodology with a maximum voting approach to extract the most significant features and generate a dataset with reduced attributes. The dataset, with and without feature reduction, is then used for heart disease prediction by applying ten machine learning classification models: four conventional classifiers, five bagging and boosting ensemble classifiers, and one artificial neural network. A comparative analysis across multiple performance parameters reveals that reducing the number of features improves the performance of each model, and that the ensemble classifiers outperform the other types of classifiers. The best performance is achieved by the extreme gradient (XG) boosting ensemble classifier, which reaches 94.1% accuracy on the dataset with reduced clinical features.
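The maximum voting step described above can be illustrated with a minimal sketch. It assumes each of the four selectors has already produced its own list of chosen features, and it keeps a feature only when enough selectors agree. The function name `vote_features`, the threshold of three votes out of four, and the example selections are assumptions made for illustration; the abstract does not state the threshold, and the lists below are placeholders rather than results from the study. Only the column names follow the usual Cleveland dataset attributes.

```python
from collections import Counter


def vote_features(selections, min_votes):
    """Return features chosen by at least `min_votes` selectors.

    `selections` maps a selector name to the list of features it picked.
    """
    votes = Counter()
    for chosen in selections.values():
        # A selector casts at most one vote per feature.
        votes.update(set(chosen))
    # Deterministic order: most votes first, ties broken alphabetically.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, count in ranked if count >= min_votes]


# Placeholder selections (hypothetical, NOT taken from the study).
selections = {
    "pearson": ["cp", "thalach", "oldpeak", "ca"],
    "pca": ["cp", "thalach", "chol", "thal"],
    "chi2": ["cp", "oldpeak", "ca", "thal"],
    "rfe": ["cp", "ca", "thal", "sex"],
}

# Assumed threshold: a feature needs votes from 3 of the 4 selectors.
selected = vote_features(selections, min_votes=3)
print(selected)
```

Raising `min_votes` yields a smaller, more conservative feature set, while lowering it approaches the union of all selectors' choices. The reduced dataset is then formed by keeping only the selected columns.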
