Bayesian Causal Inference with Bipartite Record Linkage

Sharmistha Guha, Jerome P. Reiter, Andrea Mercatanti

doi:10.1214/21-ba1297

Open Access


Abstract

In some scenarios, the observational data needed for causal inferences are spread over two data files. In particular, we consider scenarios where one file includes covariates and the treatment measured on a set of individuals, and a second file includes responses measured on another, partially overlapping set of individuals. In the absence of error-free direct identifiers like social security numbers, straightforward merging of separate files is not feasible, so that records must be linked using error-prone variables such as names, birth dates, and demographic characteristics. Typical practice in such situations generally follows a two-stage procedure: first link the two files using a probabilistic linkage technique, then make causal inferences with the linked dataset. This does not propagate uncertainty due to imperfect linkages to the causal inference, nor does it leverage relationships among the study variables to improve the quality of the linkages. We propose a joint model for simultaneous Bayesian inference on probabilistic linkage and causal effects that addresses these deficiencies. Using simulation studies and theoretical arguments, we show that the joint model can improve the accuracy of estimated treatment effects, as well as the record linkages, compared to the two-stage modeling option. We illustrate the joint model using a constructed causal study of the effects of debit card possession on household spending.

Highlights

In some scenarios, researchers seek to make causal inferences from variables spread over two datasets
We present the joint model for Bayesian causal inference and record linkage for the setting where the outcomes y are in File A, and the covariates x and the treatment status w are in File B
Results based on outcome models with propensity scores computed from all records in the 1995 data are presented in Supplement I; they are essentially identical to what we present here

Summary

Introduction

Researchers sometimes seek to make causal inferences from variables spread over two datasets. In the typical two-stage approach, the researcher first links records using a probabilistic record linkage model based on indirect identifiers, without taking into account available information on the outcomes, covariates, or treatment status. We follow the Bayesian paradigm for causal inference and posit models for the missing potential outcomes, conditional on the linking status and known covariates. Wortman and Reiter (2018) introduced the concept of allowing the causal model to inform the linkage model. Their (non-Bayesian) approach uses point estimates of average causal effects to determine the thresholds at which record pairs are declared links in a Fellegi and Sunter (1969) algorithm.
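
To make the threshold idea concrete, the following is a minimal sketch of a Fellegi and Sunter style decision rule. It is not the joint model proposed in the paper, nor the Wortman and Reiter procedure. The field names, the m- and u-probabilities, and the two thresholds are all hypothetical values chosen for illustration; in practice they would be estimated or tuned.

```python
import math

# Hypothetical m- and u-probabilities for three comparison fields (assumed values).
# m = P(field agrees | the two records refer to the same individual)
# u = P(field agrees | the two records refer to different individuals)
M_PROBS = {"last_name": 0.95, "birth_year": 0.90, "zip_code": 0.85}
U_PROBS = {"last_name": 0.01, "birth_year": 0.05, "zip_code": 0.10}


def match_weight(agreement):
    """Sum the log2 likelihood ratios over fields for one record pair.

    `agreement` maps each field name to True (agree) or False (disagree).
    """
    total = 0.0
    for field, agrees in agreement.items():
        m, u = M_PROBS[field], U_PROBS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1.0 - m) / (1.0 - u))
    return total


def classify(weight, lower, upper):
    """Declare a pair a link, a non-link, or a possible link (clerical review)."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"


# Example pairs: full agreement, partial agreement, full disagreement.
all_agree = {"last_name": True, "birth_year": True, "zip_code": True}
partial = {"last_name": True, "birth_year": False, "zip_code": False}
none_agree = {"last_name": False, "birth_year": False, "zip_code": False}

for pair in (all_agree, partial, none_agree):
    w = match_weight(pair)
    print(round(w, 2), classify(w, lower=0.0, upper=5.0))
```

In a two-stage analysis the thresholds `lower` and `upper` are fixed using the linking fields alone. The approach attributed above to Wortman and Reiter (2018) instead lets estimated causal effects inform where those thresholds are set.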

Background and Notation for Bayesian Causal Inference

Strong ignorability
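
For reference, strong ignorability is usually stated as the pair of conditions below (unconfoundedness and overlap). The symbols follow the y, x, w convention used in the Highlights, with Y_i(0) and Y_i(1) denoting the potential outcomes of unit i; the unit index and the exact notation are assumptions here, and the paper's own statement may differ in form.

```latex
\bigl(Y_i(0),\, Y_i(1)\bigr) \;\perp\!\!\!\perp\; W_i \mid X_i,
\qquad
0 < \Pr(W_i = 1 \mid X_i) < 1 .
```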

Background and Notation for Probabilistic Record Linkage

Joint Model for Bayesian Causal Inference and Record Linkage

Posterior Computation

Simulation Studies

Simulated Data Generation

Results

Causal Study of Debit Cards

Data Description and Background

Discussion and Future

Journal: Bayesian Analysis
Publication Date: Dec 1, 2022
Citations: 2
License type: cc-by
