Abstract

Background and objective: Diabetic retinopathy (DR) is a global health concern among diabetic patients. The objective of this study was to propose an explainable machine learning (ML)-based system for predicting the risk of DR.

Materials and methods: This study used publicly available cross-sectional data from a Chinese cohort of 6374 respondents. We employed Boruta and least absolute shrinkage and selection operator (LASSO)-based feature selection methods to identify the common predictors of DR. Using the identified predictors, we trained and optimized four widely applicable models (artificial neural network, support vector machine, random forest, and extreme gradient boosting (XGBoost)) to predict patients with DR. Moreover, Shapley additive explanations (SHAP) were adopted to show the contribution of each predictor to the prediction.

Results: Combining the Boruta and LASSO methods revealed that community, TCTG, HDLC, BUN, FPG, HbA1c, weight, and duration were the most important predictors of DR. The XGBoost-based model outperformed the other models, with an accuracy of 90.01%, precision of 91.80%, recall of 97.91%, F1 score of 94.86%, and AUC of 0.850. Moreover, the SHAP method showed that HbA1c, community, FPG, TCTG, duration, and UA1b were the influential predictors of DR.

Conclusion: The proposed integrated system will be helpful as a tool for selecting significant predictors and for identifying patients who are at high risk of DR at an early stage in China.
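As a rough illustration of two steps mentioned above, the following sketch shows how predictors common to two feature-selection methods might be obtained, and how accuracy, precision, recall, and F1 score follow from confusion-matrix counts. It is not the study's code. The function names, the example feature lists (including `age`), and the counts are hypothetical and chosen only for illustration.

```python
def common_predictors(boruta_selected, lasso_selected):
    """Return features chosen by both selectors, keeping the first list's order."""
    lasso = set(lasso_selected)
    return [name for name in boruta_selected if name in lasso]


def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical selections, not the study's actual Boruta or LASSO outputs.
boruta = ["HbA1c", "FPG", "duration", "weight", "age"]
lasso = ["FPG", "HbA1c", "BUN", "duration"]
shared = common_predictors(boruta, lasso)

# Hypothetical confusion-matrix counts, not the study's results.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=45)

print(shared)
print(metrics)
```

Taking the intersection keeps only predictors that both methods agree on, which is one plausible reading of "common predictors" here.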

