Diabetes Prediction: A Study of Various Classification based Data Mining Techniques

Sipra Sahoo,Bharat Jyoti Ranjan Sahoo,Smita Rath,Arup Kumar Mohanty,Tushar Mitra

doi:10.47893/ijcsi.2022.1191

Abstract

Data Mining is an integral part of KDD (Knowledge Discovery in Databases) process. It deals with discovering unknown patterns and knowledge hidden in data. Classification is a pivotal data mining technique with a very wide range of applications. Now a day’s diabetic has become a major disease which has almost crippled people across the globe. It is a medical condition that causes the metabolism to become dysfunctional and increases the blood sugar level in the body and it becomes a major concern for medical practitioner and people at large. An early diagnosis is the starting point for living well with diabetes. Classification Analysis on diabetic dataset is a part of this diagnosis process which can help to detect a diabetic patient from non-diabetic. In this paper classification algorithms are applied on the Pima Indian Diabetic Database which is collected from UCI Machine Learning Laboratory. Various classification algorithms which are Naïve Bayes Classifier, Logistic Regression, Decision Tree Classifier, Random Forest Classifier, Support Vector Classifier and XGBoost Classifier are analyzed and compared based on the accuracy delivered by the models.

Full Text