Abstract
Diabetes is a chronic disease that arises from excess sugar intake and insufficient physical activity, resulting in a buildup of sugar in the blood. According to a report from the International Diabetes Federation (IDF), Indonesia ranks fifth among countries with the largest number of people with diabetes. A key reason is that many people with diabetes are unaware of their condition, which makes early detection necessary. The purpose of this research is to improve the performance of a Hard Voting Classifier model that combines the Decision Tree, Random Forest, and XGBoost algorithms, using the ADASYN oversampling technique to handle class imbalance in diabetes prediction. The study uses patient data comprising 1000 records and 14 features from the Medical City Hospital laboratory, Iraq. Without ADASYN, the model achieves an accuracy of 99.0%, precision of 99.1%, recall of 99.0%, and F1-score of 98.98%. With ADASYN as the oversampling technique, it achieves an accuracy of 99.8%, precision of 99.8%, recall of 99.8%, and F1-score of 99.8%. These results indicate that ADASYN improves the performance of the Hard Voting Classifier model, which produces highly accurate diabetes predictions.
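To illustrate what hard voting means, the following minimal Python sketch takes the class labels predicted by several models and returns the majority label for each sample. It is not the implementation used in the study: the function name `hard_vote`, the binary 0/1 labels, the tie-breaking rule, and the example predictions are all assumptions made for illustration, and the three lists merely stand in for the outputs of a Decision Tree, a Random Forest, and an XGBoost model.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label per sample across several models.

    predictions: a list with one entry per model; each entry is a list of
    predicted labels, one per sample. Ties are broken in favor of the
    earliest model in the list (an assumption of this sketch).
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    # zip(*predictions) groups the votes of all models for each sample
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # pick the first label (in model order) that reached the top count
        winner = next(label for label in sample_votes if counts[label] == top)
        voted.append(winner)
    return voted


# Hypothetical predictions for four patients (1 = diabetic, 0 = not diabetic)
tree_preds = [1, 0, 1, 0]
forest_preds = [1, 1, 0, 0]
xgb_preds = [0, 1, 1, 0]

ensemble_preds = hard_vote([tree_preds, forest_preds, xgb_preds])
print(ensemble_preds)  # [1, 1, 1, 0]
```

With three voters and two classes a tie cannot occur, so the tie-breaking rule only matters for other configurations. In a sketch like this, an oversampling step such as ADASYN would be applied to the training data before the individual models are fitted, not to the predictions being combined.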