Abstract
Causal inference enables machine learning methods to estimate treatment effects of medical interventions from electronic health records (EHRs). The prevalence of such observational data, and the difficulty of covering all population/treatment relationships with randomized controlled trials (RCTs), make these methods increasingly attractive for studying causal effects. However, researchers should be wary of many pitfalls. We propose a framework for causal inference and illustrate it by estimating the effect of albumin on mortality in sepsis using an intensive care database (MIMIC-IV), comparing various sensitivity analyses against RCT results as the gold standard. The first step is study design, using the target trial concept and the PICOT framework: Population (patients with sepsis), Intervention (combination of crystalloids and albumin for fluid resuscitation), Control (crystalloids only), Outcome (28-day mortality), Time (intervention start within 24h of admission). We show that overly long treatment-initiation windows induce immortal time bias. The second step is the selection of confounding variables based on expert knowledge. Progressively adding confounders makes it possible to recover the RCT results from observational data. As the third step, we assess the influence of multiple models with varying assumptions and find that a doubly robust estimator (AIPW) with random forests is the most reliable. The results show that all of these steps are important for valid causal estimates. A valid causal model can then be used to individualize decision-making: subgroup analyses showed that albumin was more effective in patients older than 60, in males, and in patients with septic shock. Without causal thinking, machine learning is not enough for optimal clinical decision-making at the individual patient level.
Our step-by-step analytic framework helps avoid many pitfalls of applying machine learning to EHR data, building models that avoid shortcuts and extract the best decision-making evidence.
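To make the doubly robust step concrete, the following is a minimal sketch of the standard AIPW (augmented inverse probability weighting) formula for the average treatment effect. It is not the study's code. It assumes the nuisance predictions are already available: `mu1` and `mu0` (predicted outcome under treatment and under control) and `e` (estimated propensity score), which in the setting described above could come from random forests. The function name `aipw_ate` and the clipping threshold `eps` are illustrative assumptions.

```python
def aipw_ate(y, a, mu1, mu0, e, eps=0.01):
    """Average treatment effect via the standard AIPW formula.

    y   : observed outcomes (e.g. 1 = died within 28 days, 0 = survived)
    a   : treatment indicators (1 = treated, 0 = control)
    mu1 : predicted outcome for each patient if treated
    mu0 : predicted outcome for each patient if not treated
    e   : estimated propensity scores P(A = 1 | X)
    eps : illustrative clipping bound that keeps the weights finite
    """
    if not (len(y) == len(a) == len(mu1) == len(mu0) == len(e)):
        raise ValueError("all inputs must have the same length")
    if len(y) == 0:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for yi, ai, m1, m0, ei in zip(y, a, mu1, mu0, e):
        # Clip the propensity score away from 0 and 1.
        ei = min(max(ei, eps), 1.0 - eps)
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (
            m1 - m0
            + ai * (yi - m1) / ei
            - (1 - ai) * (yi - m0) / (1.0 - ei)
        )
    return total / len(y)
```

The estimate is consistent if either the outcome models or the propensity model is correctly specified, which is the sense in which the estimator is doubly robust. In practice the nuisance models are usually fitted with cross-fitting, which this sketch leaves out.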