Prediction Models for Glaucoma in a Multicenter Electronic Health Records Consortium: The Sight Outcomes Research Collaborative

Sophia Y Wang, Anurag Shrivastava, Rohith Ravindranath, Sejal Amin, Dustin French, Arsham Sheybani, Paul A Edwards, Jenna Patnaik, Joshua D Stein, Brian L Vanderbeek, Paul Bryar, Fasika Woreta, Sophia Y Wang, Brian Mcmillian, Divya Srikumaran, Joshua Stein, Lindsey Delott, Brian C Stagg, Suzann Pershing, Barbara Wirostko, Baseer Ahmad, Judy Kim, Anne M Lynch, Jeffrey S Schultz, Wuqaas Munir, Saleha Munir

doi:10.1016/j.xops.2023.100445

Abstract

Objective: Advances in artificial intelligence have enabled the development of predictive models for glaucoma. However, most work is single-center, and uncertainty exists regarding the generalizability of such models. The purpose of this study was to build and evaluate machine learning (ML) approaches to predict glaucoma progression requiring surgery using data from a large multicenter consortium of electronic health records (EHR).
Design: Cohort study.
Participants: 36,548 patients with glaucoma, as identified by ICD codes, from six academic eye centers participating in the Sight OUtcomes Research Collaborative (SOURCE).
Methods: We developed ML models to predict whether patients with glaucoma would progress to glaucoma surgery (identified by CPT codes) in the coming year, using three modeling approaches: 1) penalized logistic regression (lasso, ridge, and elastic net); 2) tree-based models (random forest, gradient boosted machines, and XGBoost); and 3) deep learning models. Model input features included demographics, diagnosis codes, medications, and clinical information (intraocular pressure, visual acuity, refractive status, and central corneal thickness) available from structured EHR data. One site was reserved as an “external site” test set (N=1,550); of the patients from the remaining sites, 10% each were randomly selected for the development and test sets, with the remaining 27,999 reserved for model training.
Main Outcome and Measures: Evaluation metrics included area under the receiver operating characteristic curve (AUROC) on the test set and the external site.
Results: 6,019 (16.5%) of 36,548 patients underwent glaucoma surgery. Overall, the AUROC ranged from 0.735 to 0.771 on the random test set and from 0.706 to 0.754 on the external test site, with the XGBoost and random forest models performing best, respectively. The greatest decrease in performance from the random test set to the external test site was observed for the penalized regression models.
Conclusions: ML models developed using structured EHR data can reasonably predict whether patients with glaucoma will need surgery, with reasonable generalizability to an external site. Additional research is needed to investigate the impact of protected class characteristics such as race or gender on model performance and fairness.
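
The data split and evaluation metric described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`split_by_site`, `auroc`), the `(patient_id, site)` tuple layout, and the fixed random seed are assumptions made for illustration only. The sketch holds out one site entirely as the external test set, randomly assigns 10% of the remaining patients to each of the development and test sets, and computes AUROC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case.

```python
import random


def split_by_site(patients, external_site, frac=0.10, seed=0):
    """Split (patient_id, site) tuples into train/dev/test/external lists.

    Hypothetical helper: every patient from `external_site` goes to the
    external test set; of the remaining patients, `frac` each go to the
    development and test sets, and the rest are used for training.
    """
    external = [p for p in patients if p[1] == external_site]
    internal = [p for p in patients if p[1] != external_site]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(internal)
    n_holdout = round(frac * len(internal))
    dev = internal[:n_holdout]
    test = internal[n_holdout:2 * n_holdout]
    train = internal[2 * n_holdout:]
    return train, dev, test, external


def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    Ties between a positive and a negative score count as half a win.
    Quadratic in the number of cases, so suitable only for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Synthetic patients spread evenly over six hypothetical sites.
    patients = [(i, f"site{i % 6}") for i in range(600)]
    train, dev, test, external = split_by_site(patients, external_site="site5")
    print(len(train), len(dev), len(test), len(external))
    # Toy labels (1 = surgery within the year) and model scores.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice, the same AUROC function would be applied once to the random test set and once to the external site, so that the two values can be compared to gauge how well a model generalizes.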

Full Text