Abstract

Diabetes mellitus (DM) is one of the most prevalent diseases in the world today. It is characterized by high blood glucose levels, caused either by inadequate production of insulin or by the body's cells not responding to the insulin produced. Data mining and machine learning techniques can be extremely useful for the classification of DM, given the need to shift from traditional methods, which use sharp needles to draw blood, towards non-invasive methods. The objective of this study is to perform DM classification using various machine learning algorithms. In this paper, individual classifiers are evaluated: Support Vector Machine, Naïve Bayes, Bayes Net, Decision Stump, k-Nearest Neighbors, Logistic Regression, Multilayer Perceptron and Decision Tree. Ensemble methods are also studied: bagging, boosting, the Random Forest ensemble algorithm, and hybrid classifiers that combine Random Forest with other base classifiers. The proposed DM classification model is selected on the basis of its accuracy and overall performance. The hybrid Random Forest-Bayes Net ensemble was found to be the best DM classification model, with an accuracy of 83.91% and an AUC of 0.904 on the Pima Indian Diabetes Dataset (PIDD).
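To make the two reported metrics concrete, the following is a minimal sketch of how accuracy and AUC can be computed for a hybrid of two classifiers. It assumes the hybrid averages the predicted probabilities of its two members (soft voting); the abstract does not state how the models are combined. The function names `soft_vote`, `accuracy` and `auc`, the 0.5 decision threshold, and the six toy samples are all hypothetical and are not taken from the PIDD or from the study's results.

```python
from typing import Sequence, List


def soft_vote(probs_a: Sequence[float], probs_b: Sequence[float]) -> List[float]:
    """Average two models' predicted probabilities (assumed combination rule)."""
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]


def accuracy(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(preds, y_true))
    return correct / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive sample is
    scored higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and probabilities, invented for illustration only.
y_true = [1, 0, 1, 0, 1, 0]
model_a = [0.9, 0.4, 0.6, 0.7, 0.3, 0.2]  # stand-in for one hybrid member
model_b = [0.7, 0.2, 0.8, 0.3, 0.6, 0.4]  # stand-in for the other member

combined = soft_vote(model_a, model_b)
print("accuracy:", accuracy(y_true, combined))
print("AUC:", auc(y_true, combined))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two figures are reported side by side.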

