Abstract
With the increasing availability of large-scale GWAS summary data on various traits, Mendelian randomization (MR) has become commonly used to infer causality between a pair of traits, an exposure and an outcome. It depends on using genetic variants, typically SNPs, as instrumental variables (IVs). The inverse-variance weighted (IVW) method (with a fixed-effect meta-analysis model) is most powerful when all IVs are valid; however, when horizontal pleiotropy is present, it may lead to biased inference. On the other hand, Egger regression is one of the most widely used methods robust to (uncorrelated) pleiotropy, but it suffers from loss of power. We propose a two-component mixture of regressions to combine and thus take advantage of both IVW and Egger regression; it is often both more efficient (i.e. higher powered) and more robust to pleiotropy (i.e. controlling type I error) than either IVW or Egger regression alone by accounting for both valid and invalid IVs respectively. We propose a model averaging approach and a novel data perturbation scheme to account for uncertainties in model/IV selection, leading to more robust statistical inference for finite samples. Through extensive simulations and applications to the GWAS summary data of 48 risk factor-disease pairs and 63 genetically uncorrelated trait pairs, we showcase that our proposed methods could often control type I error better while achieving much higher power than IVW and Egger regression (and sometimes than several other new/popular MR methods). We expect that our proposed methods will be a useful addition to the toolbox of Mendelian randomization for causal inference.
Highlights
Mendelian randomization (MR) has become a widely used technique to infer a causal relationship between an exposure and an outcome using GWAS summary data, in which usually independent genetic variants (SNPs) are used as instrumental variables (IVs) [1,2,3].
The inverse-variance weighted (IVW) method is the most powerful under the perhaps too restrictive assumption that all IVs are valid, while Egger regression is often unnecessarily flexible in assuming all IVs to be invalid with uncorrelated pleiotropic effects.
All other MR methods used for comparison are available in the public R packages TwoSampleMR and MendelianRandomization.
Summary
Mendelian randomization (MR) has become a widely used technique to infer a causal relationship between an exposure (e.g. a risk factor) and an outcome (e.g. a disease) using GWAS summary data, in which usually independent genetic variants (SNPs) are used as instrumental variables (IVs) [1,2,3]. When all IVs used in Mendelian randomization are valid (and uncorrelated), the inverse-variance weighted (IVW) method is consistent and most powerful: it combines the IV-specific ratio estimates most efficiently by inverse-variance weighting [4]. When some IVs have direct (pleiotropic) effects on the outcome, this validity assumption is violated and IVW may not be consistent unless the mean of the direct effects is zero, a scenario of so-called “balanced pleiotropy”. Egger regression is applied under a weaker assumption that the direct or pleiotropic effects of the genetic variants on the outcome are independent of the genetic associations with the exposure (the so-called InSIDE assumption) [5]. We note that Egger regression is the only method allowing all IVs to be invalid (with possibly directional pleiotropy), and is easy to apply, both of which perhaps explain its popularity.
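To make the contrast between the two baseline estimators concrete, the following is a minimal sketch of the standard summary-data forms of IVW and Egger regression; it does not implement the proposed mixture of regressions. The function names `ivw_estimate` and `egger_estimate` and the argument names are hypothetical, chosen only for illustration. It assumes independent SNPs, uses first-order weights 1/se_y^2, and follows the common convention of orienting each SNP so that its exposure association is non-negative before fitting Egger regression.

```python
import numpy as np


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW estimate of the causal effect (illustrative sketch).

    Equivalent to a weighted regression of beta_y on beta_x through the
    origin, with weights 1 / se_y**2.
    """
    beta_x, beta_y, se_y = map(np.asarray, (beta_x, beta_y, se_y))
    w = 1.0 / se_y**2
    denom = np.sum(w * beta_x**2)
    estimate = np.sum(w * beta_x * beta_y) / denom
    std_err = np.sqrt(1.0 / denom)  # fixed-effect standard error
    return estimate, std_err


def egger_estimate(beta_x, beta_y, se_y):
    """Egger regression: weighted regression of beta_y on beta_x
    with an intercept that absorbs directional pleiotropy.

    Returns (intercept, slope); the slope is the causal effect estimate.
    """
    beta_x, beta_y, se_y = map(np.asarray, (beta_x, beta_y, se_y))
    # Orient each SNP so that its exposure association is non-negative.
    sign = np.where(beta_x < 0, -1.0, 1.0)
    bx, by = sign * beta_x, sign * beta_y
    w = 1.0 / se_y**2
    X = np.column_stack([np.ones_like(bx), bx])
    # Solve the weighted normal equations (X' W X) b = X' W y.
    xtwx = X.T @ (w[:, None] * X)
    xtwy = X.T @ (w * by)
    intercept, slope = np.linalg.solve(xtwx, xtwy)
    return intercept, slope


if __name__ == "__main__":
    # Made-up, noise-free associations with a constant direct effect of 0.05
    # on the outcome (directional pleiotropy) and a true causal effect of 0.3.
    bx = np.array([0.10, 0.20, 0.30, 0.40])
    se = np.full(4, 0.01)
    by = 0.3 * bx + 0.05
    print("IVW estimate:", ivw_estimate(bx, by, se)[0])
    print("Egger intercept and slope:", egger_estimate(bx, by, se))
```

In this toy setting the Egger intercept absorbs the constant direct effect, whereas IVW, which forces the line through the origin, attributes it to the causal effect. The extra intercept parameter is also what costs Egger regression power when the IVs are in fact valid.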