Abstract

With the advent of the big data era, the need to combine multiple individual data sets to estimate causal effects arises naturally in many medical and biological applications, especially when no single data set measures enough confounders to infer the causal effect of an exposure on an outcome. In this article, we extend the method proposed by a previous study to causal data fusion of more than two data sets without external validation and to a more general (continuous or discrete) exposure and outcome. Theoretically, we derive the condition under which exposure effects are identifiable from multiple individual data sources when the exposure and outcome are continuous or discrete. Simulation results show that our proposed causal data fusion method yields unbiased causal effect estimates with higher precision than traditional regression, meta-analysis, and statistical matching methods. We further apply our method to study the causal effect of BMI on glucose level in individuals with diabetes by combining two data sets. Our method is essential for causal data fusion and provides important insights into the ongoing discourse on empirical analyses that merge multiple individual data sources.

