Abstract

Widespread adoption of electronic health records (EHR) and objectives for meaningful use have increased opportunities for data-driven predictive applications in healthcare. These decision support applications are often fueled by large-scale, heterogeneous, and multilevel (i.e., defined at hierarchical levels of specificity) patient data that challenge the development of predictive models. Our objective is to develop and evaluate an approach for optimally specifying multilevel patient data for prediction problems. We present a general evolutionary computational framework to optimally specify multilevel data to predict individual patient outcomes. We evaluate this method for both flattening (single level) and retaining the hierarchical predictor structure (multiple levels) using data collected to predict critical outcomes for emergency department patients across five populations. We find that both the flattened and hierarchical predictor structures outperform the baseline models, which use only a single level of predictor (either more general or more specific), in predicting critical outcomes for emergency department patients (p < 0.001). Our framework for optimizing the specificity of multilevel data improves upon more traditional single-level predictor structures and can readily be adapted to similar problems in healthcare and other domains.

Highlights

  • Rapid accumulation of electronic health record (EHR) data and emphasis on meaningful use of health information technology (HIT) [1] have given rise to many modeling applications that attempt to predict individual patient outcomes

  • We model the fitness of each candidate solution using 5-fold cross-validated area under the receiver operating characteristics curve (AUC, commonly referred to as the C statistic), which is a standard measure of predictive performance for classification models [43]

  • These results suggest that the genetic algorithm (GA) achieved improvements for multiple subgroups in the population without sacrificing the model’s performance on other subgroups
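
The fitness evaluation described in the highlights can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout (a general code paired with a specific code), the `candidate` dictionary, and the smoothed outcome-rate scorer that stands in for the predictive model are all assumptions made for the example. Only the idea of scoring a candidate specification by 5-fold cross-validated AUC comes from the text.

```python
import random
from collections import defaultdict


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def encode(record, candidate):
    """Describe a patient by the general or the specific code, as the candidate dictates."""
    general, specific = record
    return specific if candidate.get(general, 0) == 1 else general


def fitness(records, labels, candidate, k=5, seed=0):
    """Mean k-fold cross-validated AUC for one candidate specification (hypothetical helper)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    # Stratified folds, so every test fold contains both outcome classes.
    folds = [pos[f::k] + neg[f::k] for f in range(k)]
    aucs = []
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(records)) if i not in held_out]
        prior = sum(labels[i] for i in train) / len(train)
        count, events = defaultdict(int), defaultdict(int)
        for i in train:
            key = encode(records[i], candidate)
            count[key] += 1
            events[key] += labels[i]

        def score(i):
            # Stand-in model (assumption): outcome rate of the encoded category,
            # shrunk toward the overall training rate.
            key = encode(records[i], candidate)
            return (events[key] + prior) / (count[key] + 1)

        aucs.append(auc([score(i) for i in test], [labels[i] for i in test]))
    return sum(aucs) / k
```

In an evolutionary search, each candidate in the population would be scored with a function like `fitness` and the higher-scoring specifications carried forward. For example, `fitness(records, labels, {"resp": 1})` evaluates a specification that keeps the specific codes under a hypothetical `"resp"` category and the general code everywhere else.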


Summary

Introduction

Rapid accumulation of electronic health record (EHR) data and emphasis on meaningful use of health information technology (HIT) [1] have given rise to many modeling applications that attempt to predict individual patient outcomes. The majority of these prognostic models target clinical outcomes (e.g., mortality, acute myocardial infarction, and septic shock); others aim at predicting service-oriented outcomes that span operations (e.g., wait times and length of stay), cost, quality, and patient satisfaction [2,3,4,5,6,7,8,9,10]. Multilevel data describing patients’ clinical conditions and medical interventions are commonly hypothesized predictors available in EHRs, but present unique challenges for model specification.

