Abstract

This study evaluated the effectiveness of three machine learning (ML) algorithms, Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbour (KNN), for the early diagnosis of heart disease. A heart disease dataset obtained from the kaggle.com data repository, comprising 303 data points with 13 features and a target variable, was used. The data were preprocessed by shuffling and dimension reduction, with the reduced dimension chosen so that 85.03% of the original information is retained. The preprocessed dataset was partitioned into a training set (70%) and a testing set (30%). The ML algorithms were then trained and tested for the diagnosis of cardiovascular disease (CVD). Training performance was evaluated using k-fold cross-validation with 10 folds. The k-fold accuracies were 0.837662 for KNN, 0.834091 for RF, and 0.814935 for SVM. On the test set, KNN achieved an accuracy of 0.8, SVM 0.7889, and RF 0.7667. KNN was the best model in both training and test performance and is recommended for the early diagnosis of CVD.
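
The workflow summarised in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: it assumes scikit-learn, assumes that the dimension reduction is PCA applied after standardisation (the abstract does not name the method), uses default hyperparameters, and substitutes a synthetic 303 x 13 dataset for the Kaggle data, so its accuracies will not match the reported figures.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in with the same shape as the dataset (303 rows, 13 features).
# Assumption: replace with the real heart disease data to reproduce the study.
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=6, random_state=0
)

# Shuffle and split into 70% training and 30% testing data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, shuffle=True, stratify=y, random_state=0
)

# The three classifiers compared in the abstract, with default settings
# (assumption: the abstract does not report hyperparameters).
models = {
    "KNN": KNeighborsClassifier(),
    "RF": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
}

results = {}
for name, clf in models.items():
    # Assumption: PCA keeping about 85% of the variance stands in for the
    # unspecified dimension reduction step; a float n_components selects the
    # smallest number of components reaching that explained-variance ratio.
    pipe = make_pipeline(
        StandardScaler(),
        PCA(n_components=0.85, svd_solver="full"),
        clf,
    )
    # 10-fold cross-validation accuracy on the training set.
    cv_acc = cross_val_score(
        pipe, X_train, y_train, cv=10, scoring="accuracy"
    ).mean()
    # Accuracy on the held-out 30% test set.
    test_acc = pipe.fit(X_train, y_train).score(X_test, y_test)
    results[name] = (cv_acc, test_acc)
    print(f"{name}: 10-fold CV accuracy={cv_acc:.4f}, test accuracy={test_acc:.4f}")
```

Placing the scaler and PCA inside the pipeline means they are refit on each training fold, which keeps information from the validation fold out of the preprocessing step. Whether the original study fitted the dimension reduction before or after splitting is not stated in the abstract.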
