Abstract
Coronary Artery Disease (CAD) is the leading cause of mortality worldwide. It is a complex heart disease associated with numerous risk factors and a variety of symptoms. During the past decade, CAD has undergone a remarkable evolution. The purpose of this research is to build a prototype system using different Machine Learning algorithms (models) and to compare their performance in order to identify a suitable model. This paper explores three of the most commonly used Machine Learning algorithms: Logistic Regression, Support Vector Machine and Artificial Neural Network. A clinical dataset was used to conduct the research. Performance was evaluated using the Confusion Matrix, Stratified K-Fold Cross-Validation, Accuracy, the ROC curve and AUC. The accuracy and AUC scores were validated using the K-Fold Cross-Validation technique. The dataset contains class imbalance, so the SMOTE algorithm was used to balance it, and the performance analysis was carried out on both the original and the balanced data. The results show that the accuracy scores of all the models increased when they were trained on the balanced dataset. Overall, Artificial Neural Network has the highest accuracy, whereas Logistic Regression is the least accurate of the trained algorithms.
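As a rough illustration of the evaluation measures named in the abstract, the sketch below computes a confusion matrix, accuracy and AUC for a binary classifier in plain Python. The labels, scores and the 0.5 decision threshold are made-up toy values, not data or settings from the study, and the function names are hypothetical. AUC is computed here from its rank interpretation (the fraction of positive/negative pairs in which the positive case receives the higher score), which is one standard way to obtain it.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = CAD, 0 = Normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): four CAD patients, two Normal patients.
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.6, 0.2]          # model-predicted probability of CAD
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(tp, fp, fn, tn)  # 3 1 1 1
print(round(acc, 3))   # 0.667
print(auc)             # 0.875
```

Accuracy depends on the chosen threshold, whereas AUC depends only on how the scores rank the cases, which is one reason the two measures are often reported together.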
Highlights
Coronary Artery Disease is the number one cause of death worldwide. Of the 56.9 million deaths reported around the world in 2016, more than 54% were due to the top 10 causes of death, among which Ischaemic Heart Disease (Coronary Artery Disease) and Stroke were the biggest killers; they have remained the top causes of death globally for the last 15 years [1]. To function properly, the Heart requires a supply of blood, and the Heart muscles receive blood from the Coronary Arteries
This paper explores three of the most commonly used Machine Learning algorithms: Logistic Regression, Support Vector Machine and Artificial Neural Network
Statistical Analysis of the dataset showed that it does not contain any missing values. The Exploratory Data Analysis made it evident that there is a class imbalance in the dataset, as patients with Coronary Artery Disease (CAD) outnumber Normal patients. To solve this issue, the Synthetic Minority Oversampling Technique (SMOTE) algorithm was applied to balance the dataset
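The core idea of SMOTE can be sketched in a few lines: each synthetic minority-class record is placed at a random point on the line segment between an existing minority record and one of its nearest minority-class neighbours. The following is a simplified, standard-library-only illustration of that interpolation step, not the implementation used in the study; the function name, the two-feature toy records and the choice of `k=2` are all assumptions made for the example.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified sketch of the SMOTE idea for numeric feature tuples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority-class neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy records with two numeric features each.
cad = [(5.0, 6.0), (5.5, 6.2), (6.0, 5.8), (5.2, 6.5), (6.1, 6.1), (5.8, 5.5)]  # majority
normal = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5)]                                    # minority

new_points = smote_oversample(normal, n_new=len(cad) - len(normal), k=2)
balanced_normal = normal + new_points
print(len(cad), len(balanced_normal))  # 6 6
```

Because every synthetic record is a convex combination of two real minority records, it always lies between them in feature space. In practice, oversampling is normally applied only to the training portion of each cross-validation fold so that synthetic records do not leak into the evaluation data.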
Summary
To function properly, the Heart requires a supply of blood, and the Heart muscles receive blood from the Coronary Arteries. Coronary Artery Disease is the blockage or narrowing of the Coronary Arteries caused by hardening or clogging of these arteries due to the build-up of cholesterol or fatty deposits, called plaque, on the arteries' inner walls. The plaque can restrict the flow of blood by clogging the artery or by causing abnormal artery tone or function. Without a proper supply of blood, the heart becomes starved of oxygen and vital nutrients, resulting in chest pain. If the blood supply to a portion of the Heart muscle is entirely cut off, or if the energy requirements of the heart exceed what the blood supply can deliver, the result is a heart attack [2].
More From: Journal of Data Analysis and Information Processing