Exploring Machine Learning Classifiers for Medical Datasets

Ranyah Taha, Nabil Hewahi, Sara Alshakrani

doi:10.1109/icdabi53623.2021.9655862

Abstract

This paper compares classification algorithms to identify the Machine Learning (ML) classifiers best suited to medical datasets in support of Clinical Decision Support Systems (CDSS), which can help physicians make earlier and more accurate diagnoses. Many studies evaluate ML classifiers on a single medical dataset, but this is unsatisfactory because it does not reflect a classifier's capability across different medical datasets. In this paper, eight ML classifiers were evaluated on five medical datasets gathered from the UCI Machine Learning Repository. The Random Forest (RF) classifier obtained the highest average accuracy, 83.822%. Logistic Regression (LR), Decision Tree (DT), and Gradient Boosting showed similar performance, with 81.666%, 81.364%, and 81.034%, respectively, while AdaBoost and Gaussian Naïve Bayes (GNB) obtained 79.504% and 78.126%. The two classifiers with the lowest average accuracy were K-Nearest Neighbor (KNN), with 68.208%, and Artificial Neural Network (ANN), with 55.49%.
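The evaluation described in the abstract (eight classifiers, several datasets, one average accuracy per classifier) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code. It assumes scikit-learn, default hyperparameters, and 5-fold cross-validation, none of which the abstract specifies. Because the abstract does not name the five UCI datasets, the Breast Cancer Wisconsin dataset bundled with scikit-learn stands in as a single example. The helper names `make_classifiers` and `average_accuracy` are hypothetical.

```python
from statistics import mean

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier


def make_classifiers(seed=0):
    """Build fresh instances of the eight classifier families named in the abstract.

    Hyperparameters are assumptions (mostly scikit-learn defaults).
    """
    return {
        "RF": RandomForestClassifier(random_state=seed),
        "LR": LogisticRegression(max_iter=5000),
        "DT": DecisionTreeClassifier(random_state=seed),
        "GradientBoosting": GradientBoostingClassifier(random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
        "GNB": GaussianNB(),
        "KNN": KNeighborsClassifier(),
        "ANN": MLPClassifier(max_iter=500, random_state=seed),
    }


def average_accuracy(datasets, cv=5):
    """Return each classifier's mean accuracy (in %) averaged over all datasets.

    `datasets` maps a dataset name to an (X, y) pair. Accuracy on each dataset
    is estimated with k-fold cross-validation (an assumed protocol).
    """
    per_classifier = {name: [] for name in make_classifiers()}
    for X, y in datasets.values():
        for name, clf in make_classifiers().items():
            scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
            per_classifier[name].append(100 * scores.mean())
    return {name: mean(values) for name, values in per_classifier.items()}


# Stand-in data: one UCI-derived dataset shipped with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
datasets = {"breast_cancer": (X, y)}

averages = average_accuracy(datasets)
for name, acc in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}%")
```

Adding more `(X, y)` pairs to `datasets` makes each reported figure an average over several datasets, which is the kind of figure the abstract reports. The numbers this sketch prints are not expected to match the paper's results.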

Full Text