Abstract

With the advent of the big data era, the need to combine multiple individual data sets to draw causal effects arises naturally in many medical and biological applications. Especially each data set cannot measure enough confounders to infer the causal effect of an exposure on an outcome. In this article, we extend the method proposed by a previous study to causal data fusion of more than two data sets without external validation and to a more general (continuous or discrete) exposure and outcome. Theoretically, we obtain the condition for identifiability of exposure effects using multiple individual data sources for the continuous or discrete exposure and outcome. The simulation results show that our proposed causal data fusion method has unbiased causal effect estimate and higher precision than traditional regression, meta-analysis and statistical matching methods. We further apply our method to study the causal effect of BMI on glucose level in individuals with diabetes by combining two data sets. Our method is essential for causal data fusion and provides important insights into the ongoing discourse on the empirical analysis of merging multiple individual data sources.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.