Early Diagnosis of Diabetes Mellitus Using Data Mining and Classification Techniques

Younes Kiani, Seyed Ataaldin Mahmoudinejad Dezfuli, Seyed Vafaaldin Mahmoudinejad Dezfuli, Seyedeh Razieh Mahmoudinejad Dezfuli

doi:10.5812/jjcdc.94173

Open Access

Abstract

Background: The World Health Organization projects that diabetes will be the seventh leading cause of death by 2030. It is a severe disease that, if not treated thoroughly and promptly, can lead to critical complications, including death. Diabetes is therefore a priority in medical research, which typically generates large volumes of data. Data mining plays a critical role in diabetes research because it is regarded as one of the most effective ways to extract knowledge from large amounts of diabetes-related data.

Objectives: This study developed an ensemble system based on three classification methods (weighted k-nearest neighbor, simple decision tree, and logistic regression) to detect diabetes mellitus in humans.

Methods: The proposed ensemble algorithm combines the votes of the individual classifiers to reach the final result. The voting mechanism takes each classifier's prediction as an input to the ensemble and outputs their statistical mode, that is, the majority vote.

Results: The decision tree, weighted k-nearest neighbor, logistic regression, and the ensemble method achieved accuracies of 77.00%, 77.30%, 79.30%, and 80.60%, respectively.

Conclusions: The proposed method shows an acceptable improvement in accuracy over the other methods. This supports the idea that hybrid approaches are more effective than simple classification methods that use each classifier on its own.
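
The voting mechanism described under Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `majority_vote`, the 0/1 label encoding, and the sample predictions are assumptions made purely for illustration. It only shows how the statistical mode of three classifiers' predictions yields a majority vote for each sample.

```python
from statistics import mode


def majority_vote(predictions_per_classifier):
    """Return one label per sample: the statistical mode of the classifiers' votes."""
    # zip(*...) regroups the lists so each tuple holds every classifier's vote for one sample.
    return [mode(votes) for votes in zip(*predictions_per_classifier)]


# Hypothetical predictions for four samples (1 = diabetic, 0 = non-diabetic).
tree_pred = [1, 0, 1, 0]     # simple decision tree
knn_pred = [1, 1, 0, 0]      # weighted k-nearest neighbor
logreg_pred = [0, 1, 1, 0]   # logistic regression

ensemble_pred = majority_vote([tree_pred, knn_pred, logreg_pred])
print(ensemble_pred)  # [1, 1, 1, 0]
```

With three binary classifiers a tie cannot occur, so the mode is always well defined. An even number of classifiers or more than two classes would need an explicit tie-breaking rule.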

Full Text