Abstract

Mendelian randomization analyses are often performed using summarized data. The causal estimate from a one-sample analysis (in which data are taken from a single data source) with weak instrumental variables is biased in the direction of the observational association between the risk factor and outcome, whereas the estimate from a two-sample analysis (in which data on the risk factor and outcome are taken from non-overlapping datasets) is less biased and any bias is in the direction of the null. When using genetic consortia that have partially overlapping sets of participants, the direction and extent of bias are uncertain. In this paper, we perform simulation studies to investigate the magnitude of bias and Type 1 error rate inflation arising from sample overlap. We consider both a continuous outcome and a case-control setting with a binary outcome. For a continuous outcome, bias due to sample overlap is a linear function of the proportion of overlap between the samples. So, in the case of a null causal effect, if the relative bias of the one-sample instrumental variable estimate is 10% (corresponding to an F parameter of 10), then the relative bias with 50% sample overlap is 5%, and with 30% sample overlap is 3%. In a case-control setting, if risk factor measurements are only included for the control participants, unbiased estimates are obtained even in a one-sample setting. However, if risk factor data on both control and case participants are used, then bias is similar with a binary outcome as with a continuous outcome. Consortia releasing publicly available data on the associations of genetic variants with continuous risk factors should provide estimates that exclude case participants from case-control samples.
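
The linear relationship stated in the abstract can be illustrated with a short calculation. This is only a sketch of the arithmetic for a continuous outcome with a null causal effect: it assumes the one-sample relative bias is approximately 1/F, as implied by the abstract's pairing of a 10% relative bias with an F parameter of 10. The function names are hypothetical and do not come from the paper.

```python
def relative_bias_one_sample(f_parameter: float) -> float:
    """Approximate relative bias of a one-sample IV estimate (assumed 1 / F)."""
    if f_parameter <= 0:
        raise ValueError("F parameter must be positive")
    return 1.0 / f_parameter


def relative_bias_with_overlap(f_parameter: float, overlap: float) -> float:
    """Scale the one-sample relative bias linearly by the overlap proportion.

    overlap is the proportion of shared participants, between 0 and 1.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be a proportion between 0 and 1")
    return overlap * relative_bias_one_sample(f_parameter)


if __name__ == "__main__":
    # Reproduce the abstract's example: F = 10 with 100%, 50% and 30% overlap.
    for overlap in (1.0, 0.5, 0.3):
        bias = relative_bias_with_overlap(10.0, overlap)
        print(f"overlap {overlap:.0%}: relative bias {bias:.0%}")
```

With an overlap of 0 (a true two-sample analysis) this sketch returns zero, so it does not capture the smaller bias towards the null that the abstract describes for that setting.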

Highlights

  • Mendelian randomization is the use of genetic variants as instrumental variables to assess and estimate the causal effect of a risk factor on an outcome from observational data (Davey Smith & Ebrahim, 2003; Burgess & Thompson, 2015)

  • With instrumental variable (IV)–risk factor associations estimated in the controls only, there was no detectable bias in the IV estimates even with extremely weak instruments, nor was there any inflation of Type 1 error rates

  • This suggests that a conventional Mendelian randomization analysis with a binary outcome in which the associations of the IV with the risk factor are only estimated in control participants provides a natural robustness against weak instrument bias, even in a one-sample setting

Summary

Introduction

Mendelian randomization is the use of genetic variants as instrumental variables to assess and estimate the causal effect of a risk factor on an outcome from observational data (Davey Smith & Ebrahim, 2003; Burgess & Thompson, 2015). A recent methodological development in Mendelian randomization is the use of summarized data on associations of genetic variants with the risk factor and with the outcome to obtain causal effect estimates (Johnson, 2013; Burgess, Butterworth, & Thompson, 2013). These summarized data comprise the associations of the individual genetic variants with the risk factor and with the outcome, taken from univariable regression analyses (beta-coefficients and standard errors from linear or logistic regression as appropriate). Mendelian randomization analyses using summarized data have suggested causal effects of adiponectin on type 2 diabetes risk (Dastani et al., 2012), insulin levels on endometrial cancer risk (Nead et al., 2015), and telomere length on risk of lung adenocarcinoma (Zhang et al., 2015).
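
To make the idea of a summarized-data analysis concrete, the sketch below combines per-variant beta-coefficients and standard errors using an inverse-variance weighted (IVW) estimate. This is one widely used way to combine such data, and the text above does not say which estimator any particular study used. The function name and all numbers are hypothetical, chosen only for illustration.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted causal estimate from summarized data.

    beta_x: variant associations with the risk factor
    beta_y: variant associations with the outcome
    se_y:   standard errors of the outcome associations
    Returns (estimate, standard_error) under a fixed-effect model that
    ignores uncertainty in beta_x.
    """
    if not (len(beta_x) == len(beta_y) == len(se_y)) or not beta_x:
        raise ValueError("inputs must be non-empty and of equal length")
    # Weighted regression of beta_y on beta_x through the origin.
    numerator = sum(bx * by / sy**2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    denominator = sum(bx**2 / sy**2 for bx, sy in zip(beta_x, se_y))
    if denominator == 0:
        raise ValueError("at least one variant must be associated with the risk factor")
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Hypothetical summarized data for three genetic variants.
    beta_x = [0.10, 0.20, 0.15]
    beta_y = [0.05, 0.10, 0.075]
    se_y = [0.02, 0.02, 0.02]
    estimate, se = ivw_estimate(beta_x, beta_y, se_y)
    print(f"IVW estimate: {estimate:.3f} (SE {se:.3f})")
```

In this toy input every variant has the same ratio of outcome association to risk factor association, so the combined estimate equals that common ratio.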

