Abstract

Diabetes is a condition in which the pancreas does not produce enough insulin to meet the body's needs, causing blood sugar concentration to rise. This study aims to find the best classification performance on a diabetes dataset using the LightGBM method. The dataset consists of 768 rows and 9 columns, with a binary target (0 or 1). SMOTE resampling is applied to address class imbalance, and hyperparameter optimization is performed with GridSearchCV and RandomSearchCV. Models are evaluated using a confusion matrix and the metrics accuracy, recall, precision and f1-score. Several tests were conducted. In the hyperparameter optimization tests, LightGBM showed good performance with both GridSearchCV and RandomSearchCV. In the tests that apply data resampling, LightGBM with GridSearchCV optimization achieved the highest accuracy, 84%, while LightGBM with RandomSearchCV optimization reached 82%.
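
As an illustration of the evaluation step, the four reported metrics can all be derived from the counts in a binary confusion matrix. The following is a minimal sketch in plain Python, not the study's own code: the function names and the example labels are hypothetical, and class 1 is assumed to be the positive (diabetic) class.

```python
from typing import Dict, List


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Dict[str, int]:
    """Count TP, FP, TN and FN for binary labels, treating 1 as positive."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Derive accuracy, precision, recall and f1-score from the confusion counts."""
    c = confusion_counts(y_true, y_pred)
    tp, fp, tn, fn = c["tp"], c["fp"], c["tn"], c["fn"]
    total = tp + fp + tn + fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for illustration only; these are not data from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
```

On an imbalanced dataset, accuracy alone can look good for a model that rarely predicts the minority class, which is one reason to report recall, precision and f1-score alongside it.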
