Abstract

Diabetes is a chronic disease that often goes undetected and can progress quickly. It can trigger other chronic conditions such as kidney failure and heart disease. Early detection is necessary so that patients can be treated before the disease becomes more severe. Various health examinations can detect diabetes, but they require a medical expert and cannot be carried out by just anyone. In addition, examination costs are often unaffordable. This research aims to apply data mining methods, specifically k-Nearest Neighbor (KNN), to the early detection of diabetes based on disease symptoms and patient clinical data. KNN is used to classify patient symptoms and clinical data into two classes, diabetes and non-diabetes, by calculating the Euclidean distance between test data and training data. The results show that a lower k-value yields higher accuracy. However, accuracy at low k-values alone is not sufficient to draw conclusions about the performance of KNN for early diabetes detection. High accuracy at low k-values may indicate overfitting, where the model does not generalize well: with a low k-value the model sees only one or a few neighbors, so the overall pattern of the data is not captured. A k-value that is too high, in turn, risks underfitting: the model becomes too general, which makes it unreliable. To address these issues, this research used k-fold cross-validation, which helps guard against overfitting in the constructed KNN model. The study used 10-fold and 20-fold cross-validation and examined the accuracy obtained at each iteration for every combination of k-value and number of folds. Accuracy increased with the number of folds; conversely, it decreased as the k-value in KNN increased. The KNN method applied in this research achieves an accuracy of 98.2692%, with higher precision than recall. These findings suggest that KNN can be an effective and efficient tool for early diabetes detection.
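The procedure summarized above (KNN with Euclidean distance, evaluated by k-fold cross-validation over several k-values) can be illustrated with a minimal sketch. This is not the study's code or data: the two-feature synthetic dataset, the function names (`euclidean`, `knn_predict`, `cross_val_accuracy`, `make_toy_data`), and the candidate k-values are all assumptions chosen for illustration. Only the 10-fold and 20-fold settings come from the abstract.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, n_folds, seed=0):
    """Mean accuracy of KNN over n_folds held-out folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices round-robin into n_folds disjoint folds.
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i]
                      for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


def make_toy_data(n_per_class=40, seed=1):
    """Hypothetical stand-in data: two Gaussian clusters, labels 0 and 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, (0.0, 0.0)), (1, (4.0, 4.0))):
        for _ in range(n_per_class):
            X.append((rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for n_folds in (10, 20):          # fold counts named in the abstract
        for k in (1, 3, 5, 7):        # illustrative k-values (assumption)
            acc = cross_val_accuracy(X, y, k, n_folds)
            print(f"folds={n_folds:2d}  k={k}  mean accuracy={acc:.4f}")
```

Comparing the mean held-out accuracy across k-values and fold counts, rather than the accuracy of a single train/test split, is what allows a high score at a low k-value to be checked for overfitting. On real clinical data the features would normally be scaled first, since Euclidean distance is sensitive to differing units.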
