Machine Learning Methods for Diabetes Prevalence Classification in Saudi Arabia

Entissar S Almutairi,Maysam F Abbod

doi:10.3390/modelling4010004

Entissar S Almutairi, Maysam F Abbod

Open Access

https://doi.org/10.3390/modelling4010004

Copy DOI

Journal: Modelling	Publication Date: Jan 25, 2023
Citations: 7	License type: CC BY 4.0

Affiliation: Brunel University London

Abstract

Machine learning algorithms have been widely used in public health for predicting or diagnosing epidemiological chronic diseases, such as diabetes mellitus, which is classified as an epi-demic due to its high rates of global prevalence. Machine learning techniques are useful for the processes of description, prediction, and evaluation of various diseases, including diabetes. This study investigates the ability of different classification methods to classify diabetes prevalence rates and the predicted trends in the disease according to associated behavioural risk factors (smoking, obesity, and inactivity) in Saudi Arabia. Classification models for diabetes prevalence were developed using different machine learning algorithms, including linear discriminant (LD), support vector machine (SVM), K -nearest neighbour (KNN), and neural network pattern recognition (NPR). Four kernel functions of SVM and two types of KNN algorithms were used, namely linear SVM, Gaussian SVM, quadratic SVM, cubic SVM, fine KNN, and weighted KNN. The performance evaluation in terms of the accuracy of each developed model was determined, and the developed classifiers were compared using the Classification Learner App in MATLAB, according to prediction speed and training time. The experimental results on the predictive performance analysis of the classification models showed that weighted KNN performed well in the prediction of diabetes prevalence rate, with the highest average accuracy of 94.5% and less training time than the other classification methods, for both men and women datasets.

Full Text