The rate of 30-day all-cause hospital readmissions can affect the funding a hospital receives. An accurate and reliable readmission prediction model could save money and improve quality of care. Few projects have explored formulating this task as a survival prediction problem, where models can exploit a real-valued time-to-readmission target. This paper demonstrates the effectiveness of a survival-inspired readmission model, especially when paired with a longitudinal patient representation that is agnostic to disease cohort and predictive task. We forecast readmissions for a population-level cohort of 421,088 patients discharged in 2015 and 2016 from hospitals in Alberta, Canada. Clinical features and sequences of historical medical codes (calculated from at least four full years prior to discharge) from linked administrative sources serve as model inputs. We trained binary 30-day readmission models (XGBoost and a deep neural network) and time-to-event readmission models (CoxPH and N-MTLR), with and without machine-learned medical knowledge at initialization, then compared them against the popular LACE-based model using the AUROC score at 30 days (AUROC@30). Survival models were additionally evaluated using concordance, Integrated Brier, and L1-loss scores. All models that use sequence features markedly outperform even the best models trained on only clinical features. Further, a time-to-event target improves predictive performance at 30 days, given the same model inputs and architecture. N-MTLR, using solely sequence inputs and initialized with pre-learned medical knowledge, achieves an average AUROC@30 of 0.8460 over five folds, with a standard deviation of 0.003. All trained models match or outperform the LACE baseline of 0.6587±0.003.
Sequences of administrative medical codes contain rich predictive information for forecasting readmissions, and embedding medical knowledge a priori using machine learning gives readmission models an advantageous foundation for training. When combined with a model that can leverage a time-to-event target, excellent performance is possible on the 30-day all-cause readmission task using only administrative data.
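To make the AUROC@30 comparison concrete, the following is a minimal sketch of how a time-to-event target can be scored at a 30-day horizon. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names, the toy data, and in particular the choice to drop patients censored before day 30 are all hypothetical. The risk score is assumed to be something like one minus the predicted probability of remaining readmission-free at day 30.

```python
from typing import List, Optional, Sequence


def labels_at_horizon(
    times: Sequence[float], events: Sequence[int], horizon: float = 30.0
) -> List[Optional[int]]:
    """Turn (time, event) pairs into binary labels at a fixed horizon.

    Returns 1 if readmission was observed on or before the horizon,
    0 if the patient was followed past the horizon without readmission,
    and None if the patient was censored before the horizon.
    (Dropping the None cases later is an assumption of this sketch.)
    """
    labels: List[Optional[int]] = []
    for t, e in zip(times, events):
        if e == 1 and t <= horizon:
            labels.append(1)
        elif t > horizon:
            labels.append(0)
        else:
            labels.append(None)  # censored before the horizon: status unknown
    return labels


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties in score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def auroc_at_horizon(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
    horizon: float = 30.0,
) -> float:
    """Score predicted risk at the horizon against labels with known status."""
    labels = labels_at_horizon(times, events, horizon)
    kept = [(y, s) for y, s in zip(labels, risk_scores) if y is not None]
    return auroc([y for y, _ in kept], [s for _, s in kept])


# Toy example (fabricated values for illustration only):
# days to readmission or censoring, event flag (1 = readmitted), predicted risk.
toy_times = [5, 40, 12, 100, 20]
toy_events = [1, 1, 1, 0, 0]
toy_risk = [0.9, 0.5, 0.4, 0.1, 0.5]
toy_auroc = auroc_at_horizon(toy_times, toy_events, toy_risk)
```

Because the same routine accepts a risk score from any model, a binary classifier's predicted probability and a survival model's risk at day 30 can be compared on equal footing, which is the kind of comparison AUROC@30 enables.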