Abstract

Increased availability of health data has led to growing use of machine learning approaches to advance healthcare research. However, traditional analysis methods remain widespread due to their ease of use, direct interpretation, and predictive utility. This study compares a random forest (RF) algorithm with a Cox proportional-hazards (CPH) model for predicting time to development of diabetic macular edema (DME) among patients with non-proliferative diabetic retinopathy (NPDR). US claims were used to create a longitudinal cohort of adult patients newly diagnosed with mild or moderate NPDR between July 1, 2007 and April 30, 2012 and followed for 5 years or until the earliest claim of DME progression or intervention. Baseline characteristics (BL) and over 400 variables based on HCUP clinical classification of ICD-9 diagnosis/procedure codes were analyzed by RF to predict time to DME progression, whereas 30 variables including BL and Elixhauser Comorbidity Index components were analyzed by CPH. Nonlinearity was accommodated in the CPH model using restricted cubic splines. Overall prediction accuracy for correctly ranking patients by time to DME was 65% for both the RF and CPH approaches. The RF approach identified NPDR severity at diagnosis as adding 7% to prediction accuracy, while other classification variables added at most 0.35%. The CPH model identified NPDR severity as having the largest effect size of any predictor variable (HR 3.41; 95% CI 3.07-3.78). CPH modeling also suggested a nonlinear trend with age (HR of 1.15 for age 50 vs. 40 years; HR of 0.84 for age 70 vs. 60 years). CPH allows direct interpretation of linear and nonlinear effects of predictors, while RF explores a larger pool of predictors and provides direct insight into relative contributions to prediction accuracy. In this study, the increased computational cost of using RF to incorporate more variables did not result in better performance compared to CPH.
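
The abstract reports accuracy as the share of patients correctly ranked by time to DME but does not name the metric. One common way to measure ranking accuracy for censored time-to-event data is a concordance index (Harrell's C), and the sketch below assumes that is what is meant. The function name, argument names, and toy values are hypothetical and are not taken from the study; tied event times are simply skipped to keep the example short.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Share of comparable patient pairs that a model ranks correctly.

    times       -- follow-up time for each patient
    events      -- 1 if the event (e.g. DME) was observed, 0 if censored
    risk_scores -- model output; a higher score means an earlier predicted event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient i has the shorter follow-up time.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter time is an observed event.
        # Pairs with tied times are skipped in this simplified sketch.
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risk_scores[i] > risk_scores[j]:
            concordant += 1.0   # earlier event received the higher risk score
        elif risk_scores[i] == risk_scores[j]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_scores = [0.9, 0.7, 0.8, 0.1]
print(concordance_index(toy_times, toy_events, toy_scores))  # 0.8
```

Under this definition, 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a value of 0.65 would indicate modest discrimination for either model.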
