Diabetes is one of the most dangerous chronic diseases and can lead to other serious complications. In Indonesia, the most common microvascular complications of diabetes are retinopathy, nephropathy, and neuropathy. To prevent these complications from manifesting, it is crucial to apply data mining techniques that extract knowledge about the risk factors for each complication. The goal of this research is to construct a prediction model for the three major diabetes complications in Indonesia and to identify the significant features correlated with them. In this research, the diabetes risk factors are narrowed to seven features: age, gender, BMI, family history of diabetes, blood pressure, duration of diabetes, and blood glucose level. The Naive Bayes Tree and C4.5 decision tree-based classification techniques and the k-means clustering technique were used to analyze the dataset. After this analysis, we evaluated the performance of each technique and identified the features and sub-features that act as risk factors for each disease. The results show that the most influential risk factor for retinopathy is being a female patient with a hypertensive crisis. For nephropathy, the most prominent risk factor is a duration of diabetes of more than 4 years. Neuropathy is dominated by female patients with a BMI of more than 25. Family history of diabetes shows no distinct significant correlation with these complications. The overall accuracy of the proposed model is 68%, so it could be used as an alternative method to help predict diabetes complications at an early stage.
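As an illustration of how a C4.5-style tree ranks candidate risk factors, the sketch below computes the gain ratio, the split criterion C4.5 uses, for two categorical features. It is a minimal sketch and not the authors' implementation: the function names, the field names (`gender`, `bmi_over_25`, `neuropathy`), and all eight records are invented for illustration and do not come from the study's dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, feature, target):
    """Gain ratio of a categorical feature with respect to the target label."""
    base = entropy([r[target] for r in rows])

    # Group the target labels by the value of the candidate feature.
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])

    total = len(rows)
    # Weighted entropy of the target after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Split information: entropy of the feature's own value distribution.
    split_info = entropy([r[feature] for r in rows])

    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records, invented for this sketch only.
rows = [
    {"gender": "F", "bmi_over_25": "yes", "neuropathy": "yes"},
    {"gender": "F", "bmi_over_25": "yes", "neuropathy": "yes"},
    {"gender": "F", "bmi_over_25": "no", "neuropathy": "no"},
    {"gender": "F", "bmi_over_25": "yes", "neuropathy": "yes"},
    {"gender": "M", "bmi_over_25": "no", "neuropathy": "no"},
    {"gender": "M", "bmi_over_25": "yes", "neuropathy": "yes"},
    {"gender": "M", "bmi_over_25": "no", "neuropathy": "no"},
    {"gender": "M", "bmi_over_25": "no", "neuropathy": "no"},
]

for feature in ("gender", "bmi_over_25"):
    print(feature, round(gain_ratio(rows, feature, "neuropathy"), 4))
```

A tree builder in this style would split first on the feature with the highest gain ratio and then repeat the procedure within each branch. Continuous features such as age or blood glucose level would first need a threshold, which this sketch leaves out.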