A causal data fusion method for the general exposure and outcome.

Hongkai Li, Zhi Geng, Jinzhu Jia, Ran Yan, Fuzhong Xue

doi:10.1002/sim.9239

Abstract

With the advent of the big data era, the need to combine multiple individual data sets to infer causal effects arises naturally in many medical and biological applications, especially when no single data set measures enough confounders to identify the causal effect of an exposure on an outcome. In this article, we extend the method proposed by a previous study to causal data fusion of more than two data sets without external validation and to a more general (continuous or discrete) exposure and outcome. Theoretically, we obtain the condition for identifiability of exposure effects using multiple individual data sources for a continuous or discrete exposure and outcome. The simulation results show that our proposed causal data fusion method yields unbiased causal effect estimates with higher precision than traditional regression, meta-analysis, and statistical matching methods. We further apply our method to study the causal effect of BMI on glucose level in individuals with diabetes by combining two data sets. Our method contributes to causal data fusion and provides insights into the ongoing discourse on the empirical analysis of merging multiple individual data sources.
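The motivating problem in the abstract, that each data set on its own measures too few confounders, can be illustrated with a small simulation. The sketch below is not the authors' fusion method, which the abstract does not describe in detail; it only shows the bias that arises when each source adjusts for the confounder it happens to measure. The data-generating model (two independent standard normal confounders `u1` and `u2`, a linear outcome, a true effect of 1.0) and all function names are assumptions made for illustration. Under these assumptions, a regression that adjusts for only one of the two confounders converges to roughly 1.5 rather than 1.0.

```python
import numpy as np


def ols_coef(y, *columns):
    """Return OLS coefficients of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def simulate(n, beta, rng):
    """Hypothetical linear model: u1 and u2 both confound the x -> y effect."""
    u1 = rng.normal(size=n)
    u2 = rng.normal(size=n)
    x = u1 + u2 + rng.normal(size=n)
    y = beta * x + u1 + u2 + rng.normal(size=n)
    return x, y, u1, u2


rng = np.random.default_rng(0)
TRUE_BETA = 1.0  # assumed true causal effect of x on y
N = 20000

# Data set A records only u1; data set B records only u2.
xa, ya, u1a, _ = simulate(N, TRUE_BETA, rng)
xb, yb, _, u2b = simulate(N, TRUE_BETA, rng)
# An idealized source that records both confounders, for comparison.
xo, yo, u1o, u2o = simulate(N, TRUE_BETA, rng)

est_a = ols_coef(ya, xa, u1a)[1]            # adjusts for u1 only
est_b = ols_coef(yb, xb, u2b)[1]            # adjusts for u2 only
est_oracle = ols_coef(yo, xo, u1o, u2o)[1]  # adjusts for both

print(f"data set A (u1 only): {est_a:.3f}")
print(f"data set B (u2 only): {est_b:.3f}")
print(f"both confounders:     {est_oracle:.3f}")
```

Neither single-source estimate recovers the assumed effect, and averaging the two would not either, since both are biased in the same direction. This is the gap that a fusion method combining the sources is meant to close.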

