Abstract

Aim: This study compares the performance of three machine learning classification methods, Logistic Regression (LR), Artificial Neural Networks (ANN), and Decision Tree, in the diagnosis of Type 2 Diabetes Mellitus (DM) and identifies the most successful one. It also uses these models to examine the risk factors affecting Type 2 DM.

Method: The data were collected from patients who visited the Diabetes and Thyroid polyclinic of the Department of Internal Medicine at the Inonu University Faculty of Medicine Turgut Ozal Medical Center. Missing values were imputed with the k-Nearest Neighbor algorithm. Accuracy, sensitivity, specificity, precision, F1-score, AUC, and classification error were used as performance evaluation criteria. The parameters of the ANN model were optimized with an evolutionary algorithm. Missing value imputation, modeling, and parameter optimization were carried out in RapidMiner Studio Free version 8.1.

Results: Of the three methods applied to the diagnosis of Type 2 DM, the ANN gave the best classification performance, with accuracy 98.94%, sensitivity 100%, specificity 97.73%, precision 98.04%, F1-score 99.01%, AUC 0.978, and classification error 1.06%. For the ANN model, the importance values of the independent variables were: gender (0.017), long-term drug use (0.009), family history (0.013), concomitant disease (0.017), cortisone use (0.008), stress factor (0.016), high blood pressure (0.008), smoking (0.006), high cholesterol (0.053), heart disease (0.024), exercise status (0.023), carbohydrate use (0.040), alcohol consumption (0.007), vegetable use (0.020), meat use (0.007), age (0.046), weight (0.083), height (0.049), starting age (0.024), daily bread consumption (0.066), LDL (0.084), HDL (0.083), total cholesterol (0.020), triglyceride (0.031), and fasting blood sugar (0.244).

Conclusion: According to the performance criteria obtained from the three classification models used to predict Type 2 DM, the ANN model gave the best classification performance. According to the ANN model, the three most important risk factors that may cause Type 2 DM were fasting blood glucose, LDL, and HDL, in that order.
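For readers who want to see how the performance criteria named in the abstract relate to one another, the following is a minimal sketch that computes them from confusion-matrix counts using the standard definitions. The abstract does not report a confusion matrix, so the counts below (50 true positives, 0 false negatives, 1 false positive, 43 true negatives) are hypothetical: they are simply one set of counts whose metrics round to the percentages reported for the ANN. The function name `classification_metrics` is likewise illustrative and is not part of the study's RapidMiner workflow.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1_score": f1,
        "classification_error": 1 - accuracy,
    }


# Hypothetical counts (an assumption, not data from the study), chosen
# only because they are consistent with the percentages in the abstract.
metrics = classification_metrics(tp=50, fn=0, fp=1, tn=43)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

The sketch also shows that classification error is the complement of accuracy, which is why the reported 1.06 and 98.94% sum to 100%. AUC is not included because it depends on the model's ranked scores rather than on a single confusion matrix. The function does not guard against empty classes, so a zero denominator would raise an error.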
