Abstract

Heart disease is one of the most common health problems, and it has been increasing at a faster rate in recent years. Health institutes generate large volumes of data every year, and the central challenge is to exploit the hidden patterns in this data for effective and accurate prediction. This research focuses on the development of supervised machine learning models for predicting heart disease. We used the publicly available Cleveland, Switzerland, Hungarian and VA Long Beach heart disease datasets from the University of California, Irvine (UCI) data repository. Several preprocessing steps, such as handling missing and null values and removing duplicate entries, were applied to these datasets to make them suitable for developing effective models. The correlation between the feature set and the target variable was studied. The prediction models were developed using Logistic Regression, Decision Trees, Naive Bayes, K-nearest neighbors, and the ensemble methods AdaBoost and XGBoost. The predictive performance of the developed models was compared using accuracy, precision, recall, F1-score, Cohen's kappa and Area Under the Curve (AUC) score. K-nearest neighbors was the best model for the Cleveland dataset with 86.81% accuracy. AdaBoost gave the highest accuracy, 98%, for the Switzerland dataset. Bernoulli Naive Bayes predicted heart disease for the Hungarian dataset with 84.26% accuracy. XGBoost performed best on the VA Long Beach dataset with 82.20% accuracy. The results of the study support the applicability of machine learning techniques to heart disease prediction.
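To make the comparison measures concrete, the following is a minimal sketch of how accuracy, precision, recall, F1-score and Cohen's kappa can be computed from a binary confusion matrix. It is an illustration under stated assumptions, not the code used in the study: the names `ConfusionCounts` and `binary_metrics` are hypothetical, and the counts in the usage example are invented for demonstration and are not results from any of the four datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix counts (hypothetical helper type)."""
    tp: int  # disease present, predicted present
    fp: int  # disease absent, predicted present
    fn: int  # disease present, predicted absent
    tn: int  # disease absent, predicted absent


def binary_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, recall, F1 and Cohen's kappa."""
    n = c.tp + c.fp + c.fn + c.tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (c.tp + c.tn) / n
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa compares observed agreement (accuracy) with the
    # agreement expected by chance from the row and column marginals.
    p_pos = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_neg = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    p_chance = p_pos + p_neg
    kappa = ((accuracy - p_chance) / (1 - p_chance)
             if p_chance != 1 else 0.0)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Invented counts for illustration only, not taken from the paper.
    example = ConfusionCounts(tp=40, fp=10, fn=5, tn=45)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Cohen's kappa is a useful companion to accuracy here because it discounts the agreement a classifier would reach by chance, which matters when the classes in a dataset are imbalanced. The AUC score is omitted from the sketch because it requires predicted scores rather than confusion matrix counts.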
