Abstract

There is great interest in discovering causal relationships between pairs of exposures and outcomes using genetic variants as instrumental variables (IVs) to deal with hidden confounding in observational studies. The two most popular approaches are Mendelian randomization (MR), which usually uses independent genetic variants/SNPs across the genome as IVs, and transcriptome-wide association studies (TWAS) (or their generalizations), which use cis-SNPs local to a gene (or some genome-wide and likely dependent SNPs) as IVs. In spite of their many promising applications, both approaches face a major challenge: the validity of their causal conclusions depends on three critical assumptions on valid IVs, and more generally on other modeling assumptions, which may not hold in practice. The most likely and most challenging violation is due to widespread horizontal pleiotropy, which leads to two of the three IV assumptions being violated and thus to biased statistical inference. More generally, we would like to conduct a goodness-of-fit (GOF) test to check the model being used. Although some methods have been proposed as being robust, to various degrees, to the violation of some modeling assumptions, they often give different and even conflicting results due to their own modeling assumptions and possibly lower statistical efficiency, making it difficult for the practitioner to choose among methods and to interpret their varying results. Hence, it would help to directly test whether any assumption is violated. In particular, there is a lack of such tests for TWAS. We propose a new and general GOF test, called TEDE (TEsting Direct Effects), applicable to both correlated and independent SNPs/IVs (as commonly used in TWAS and MR, respectively).
Through simulation studies and real data examples, we demonstrate high statistical power and advantages of our new method, while confirming the frequent violation of modeling (including valid IV) assumptions in practice and thus the importance of model checking by applying such a test in MR/TWAS analysis.

Highlights

  • It is of great interest to estimate and test the causal effect of a risk factor/exposure X on an outcome Y

  • Although there are some methods to check the modeling assumptions for Mendelian randomization (MR) with independent genetic variants as instrumental variables (IVs), there is barely any powerful one for transcriptome-wide association studies (TWAS) with correlated single-nucleotide polymorphisms (SNPs) as IVs. We propose such a powerful method, applicable to both MR and TWAS with local or genome-wide, possibly correlated/dependent, SNPs as IVs. We demonstrate its higher statistical power compared with several commonly used methods, and confirm the frequent violation of modeling/IV assumptions in TWAS with our example GWAS data of schizophrenia, Alzheimer’s disease and blood lipids

  • An important conclusion is that in practice it is necessary to conduct model checking in MR and TWAS, and our proposed method is expected to be useful for such a task

Introduction

It is of great interest to estimate and test the causal effect of a risk factor/exposure X on an outcome Y. For observational data, due to the presence of unmeasured confounders, say U, it is difficult to tell whether an observed association between X and Y really indicates a causal relationship. MR has been widely applied and has yielded substantial findings; one example is [1], which found significant evidence for causal relationships between many traits of interest by utilizing the large-scale UK Biobank data [2,3]. As expected, the validity of any MR analysis critically depends on its modeling assumptions, in particular three key assumptions on valid IVs; an IV has to satisfy the following three conditions to be valid, as depicted in Fig 1A:

1. The IV is associated with the exposure X
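
To make the role of hidden confounding and of a direct (pleiotropic) effect concrete, the following is a minimal simulation sketch, not the TEDE method itself. It assumes a single SNP as the IV, a linear model, and illustrative values (allele frequency 0.3, SNP-exposure effect 0.4, causal effect 0.5) that are not taken from the paper; the function name `simulate_estimates` is hypothetical. It compares the naive regression slope of Y on X with the standard single-IV ratio (Wald) estimate, with and without a direct effect of the SNP on Y.

```python
import numpy as np


def simulate_estimates(direct_effect, n=200_000, beta=0.5, seed=0):
    """Return (naive OLS slope, IV ratio estimate) of the effect of X on Y.

    All numeric settings here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # SNP genotype coded 0/1/2
    u = rng.normal(size=n)                          # hidden confounder U
    x = 0.4 * g + u + rng.normal(size=n)            # exposure X
    # Outcome Y: causal effect of X, confounding through U, and an optional
    # direct effect of the SNP that bypasses X (horizontal pleiotropy).
    y = beta * x + direct_effect * g + u + rng.normal(size=n)

    ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)    # confounded by U
    iv = np.cov(g, y)[0, 1] / np.cov(g, x)[0, 1]    # ratio (Wald) estimate
    return ols, iv


if __name__ == "__main__":
    for d in (0.0, 0.2):
        ols, iv = simulate_estimates(direct_effect=d)
        print(f"direct effect = {d}: OLS = {ols:.3f}, IV ratio = {iv:.3f}")
```

Under these assumptions, the naive slope is pulled away from 0.5 by U in both settings, while the IV ratio estimate is close to 0.5 only when the direct effect is zero; with a direct effect of 0.2 it shifts by roughly 0.2/0.4 = 0.5, which is the kind of bias a test for direct effects is meant to flag.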
