Abstract

Estimating the effects of an intervention from high-dimensional observational data is a challenging problem due to the existence of confounding. The task is often further complicated in healthcare applications where a set of observations may be entirely missing for certain patients at test time, thereby prohibiting accurate inference. In this paper, we address this issue using an approach based on the information bottleneck to reason about the effects of interventions. To this end, we first train an information bottleneck to perform a low-dimensional compression of covariates by explicitly considering the relevance of information for treatment effects. As a second step, we subsequently use the compressed covariates to perform a transfer of relevant information to cases where data are missing during testing. In doing so, we can reliably and accurately estimate treatment effects even in the absence of a full set of covariate information at test time. Our results on two causal inference benchmarks and a real application for treating sepsis show that our method achieves state-of-the-art performance, without compromising interpretability.

Highlights

  • Reasoning about the effects of an intervention is a key question across many applications such as healthcare [1,2], finance [3] and public policy [4,5]

  • Our goal is to demonstrate the ability of CEIB to accurately infer treatment effects, while simultaneously learning a low-dimensional, interpretable representation of confounding in cases where covariate information is systematically missing at test time

  • If we extend this to the extreme case of removing 8 covariates at test time, the out-of-sample error in predicting the Average Causal Effect (ACE) increases to 0.29 +/− 0.02

Read more

Summary

Introduction

Reasoning about the effects of an intervention is a key question across many applications such as healthcare [1,2], finance [3] and public policy [4,5]. The actions observed in the data may be determined by variables that impact the outcome, resulting in confounding that otherwise biases predictions if unaccounted for (e.g., socioeconomic status may dictate what kinds of treatments a patient can afford and affect their overall outcomes) [6]. Correcting for such confounding is crucial when estimating the effects of an intervention. Estimating treatment effects for these patients requires integrating over all the missing variables—an infeasible task in high-dimensional settings

Objectives
Methods
Findings
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call