Diabetic retinopathy (DR) is the leading cause of blindness among adults in the U.S. Because DR is asymptomatic in its early stages, diabetic patients often do not consider the uncomfortable ophthalmic exams necessary. Moreover, existing DR diagnosis approaches rely mainly on fundus images, which require access to ophthalmologists and special equipment that are typically unavailable in rural areas. Machine-learning-based predictive models could help primary care physicians identify patients at high risk of DR and confidently recommend ophthalmic exams. However, existing DR prediction models require a large number of independent variables, which makes them difficult to use in a clinical setting. In this study, we designed a novel “Progressive Ablation Feature Selection method with XGBoost”, which reduced the number of predictors from 25 to 9 and produced a more user-friendly DR prediction model without sacrificing accuracy, achieving an Area Under the Curve of 96.61%. The study suggests that diabetic retinopathy is most closely associated with creatinine, followed by neuropathy, hematocrit, BUN, nephropathy, albumin, race, calcium, and sodium. We provide insight into each selected feature and its medical association with DR. The results of this work will help physicians use a small set of readily available variables to identify high-risk diabetic patients who are prone to developing DR, so that they can intervene in time to prevent vision loss.

• More than 2.3 million diabetic patients across over 10 million visits analyzed.
• A novel progressive ablation feature selection method.
• Medical analysis of the selected features.
• Diabetic retinopathy is closely associated with creatinine, followed by neuropathy, hematocrit, BUN, nephropathy, albumin, race, calcium, sodium, and anion gap blood.
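The abstract does not spell out how the progressive ablation procedure works, so the following is only a minimal sketch of a generic greedy backward ablation loop, not the authors' algorithm. The function `progressive_ablation`, its `tolerance` parameter, the `toy_auc` scorer, and the weights in `TOY_WEIGHTS` are all hypothetical. In a real setting, the `evaluate` callback would presumably train a model such as an XGBoost classifier on the given feature subset and return its validation AUC.

```python
from typing import Callable, List, Sequence


def progressive_ablation(
    features: Sequence[str],
    evaluate: Callable[[List[str]], float],
    tolerance: float = 0.005,
    min_features: int = 1,
) -> List[str]:
    """Greedy backward ablation: repeatedly drop the feature whose removal
    costs the least score, while staying within `tolerance` of the baseline."""
    selected = list(features)
    baseline = evaluate(selected)  # score with every candidate feature
    while len(selected) > min_features:
        # Score each subset obtained by removing exactly one feature.
        trials = [
            (evaluate([f for f in selected if f != dropped]), dropped)
            for dropped in selected
        ]
        best_score, dropped = max(trials)
        if best_score < baseline - tolerance:
            break  # every further removal would cost too much
        selected.remove(dropped)
    return selected


# Toy stand-in for "train a model and return validation AUC".
# The weights are made up for illustration and are not results from the study.
TOY_WEIGHTS = {
    "creatinine": 0.20,
    "hematocrit": 0.10,
    "albumin": 0.05,
    "noise_a": 0.001,
    "noise_b": 0.0,
}


def toy_auc(subset: List[str]) -> float:
    return 0.5 + sum(TOY_WEIGHTS[f] for f in subset)


kept = progressive_ablation(list(TOY_WEIGHTS), toy_auc, tolerance=0.005)
print(kept)
```

With this toy scorer, the two uninformative features are removed first and the loop stops once dropping any remaining feature would lower the score by more than the tolerance. Measuring the tolerance against the full-feature baseline, rather than against the previous round, is one way to keep many small losses from adding up unnoticed.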