Abstract

Integration of data from genome-wide single nucleotide polymorphism (SNP) association studies of different traits should allow researchers to disentangle the genetics of potentially related traits within individually associated regions. Formal statistical colocalisation testing of individual regions requires selection of a set of SNPs summarising the association in a region. We show that the SNP selection method greatly affects type 1 error rates, with published studies having used methods expected to result in substantially inflated type 1 error rates. We show that either avoiding variable selection and instead testing the most informative principal components or integrating over variable selection using Bayesian model averaging can help control type 1 error rates. Application to data from Graves' disease and Hashimoto's thyroiditis reveals a common genetic signature across seven regions shared between the diseases, and indicates that in five of six regions associated with Graves' disease and not Hashimoto's thyroiditis, this more likely reflects genuine absence of association with the latter rather than lack of power. Our examination, by simulation, of the performance of colocalisation tests and associated software will foster more widespread adoption of formal colocalisation testing. Given the increasing availability of large expression and genetic association datasets from disease-relevant tissue and purified cell populations, coupled with identification of regulatory sequences by projects such as ENCODE, colocalisation analysis has the potential to reveal both shared genetic signatures of related traits and causal disease genes and tissues.

Highlights

  • In recent years, genome-wide association studies (GWAS) have facilitated a dramatic increase in the number of genetic variants associated with human disease and other traits such as gene expression

  • Although regression coefficients are well known to be unbiased estimates of population effects, this property does not hold after variable selection [Miller, 1984], an effect which has been referred to as “Winner’s curse” in genetics [Göring et al, 2001; Lohmueller et al, 2003]

  • Conditioning on the common causal variant rather than the most associated single nucleotide polymorphism (SNP) reduces the bias by removing the SNP selection problem, but does not eliminate it due to the overestimation of effect size (Figure 1, track C2)
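
The selection bias described in the highlights above can be illustrated with a small simulation. This is a minimal sketch, not the authors' procedure: it assumes ten independent SNPs that all have the same true effect on one trait, estimates each effect by simple linear regression, and compares the estimate for a SNP chosen in advance with the estimate for the SNP that looks most associated. The function names (`ols_slope`, `simulate`) and all parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def simulate(n_reps=300, n_ind=100, n_snps=10, beta=0.2, maf=0.3, seed=1):
    """Return (mean estimate for a pre-chosen SNP, mean estimate for the top SNP).

    Assumption: SNPs are independent and every SNP has true effect `beta`.
    """
    rng = random.Random(seed)
    prechosen, selected = [], []
    for _ in range(n_reps):
        # Genotypes coded 0/1/2: two independent allele draws per individual.
        genos = [
            [int(rng.random() < maf) + int(rng.random() < maf) for _ in range(n_ind)]
            for _ in range(n_snps)
        ]
        # Trait: additive effect of every SNP plus standard normal noise.
        trait = [
            beta * sum(g[i] for g in genos) + rng.gauss(0.0, 1.0)
            for i in range(n_ind)
        ]
        slopes = [ols_slope(g, trait) for g in genos]
        prechosen.append(slopes[0])   # SNP fixed before looking at the data
        selected.append(max(slopes))  # SNP picked because it looks strongest
    return statistics.fmean(prechosen), statistics.fmean(selected)


if __name__ == "__main__":
    unselected_mean, selected_mean = simulate()
    print(f"true effect:                0.200")
    print(f"mean estimate, fixed SNP:   {unselected_mean:.3f}")
    print(f"mean estimate, top SNP:     {selected_mean:.3f}")
```

Under these assumptions the average estimate for the pre-chosen SNP should sit close to the true effect, while the average for the selected SNP should be noticeably larger, which is the Winner's curse pattern.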


Summary

Introduction

Genome-wide association studies (GWAS) have facilitated a dramatic increase in the number of genetic variants associated with human disease and other traits such as gene expression. Researchers are examining the genetic association signals from pairs of traits in parallel, with similar results interpreted as evidence that the two traits may colocalise, or share a common causal variant. These traits may be eQTL signals across two or more tissues [Dimas et al, 2009; Fairfax et al, 2012], eQTL and disease signals [Nica et al, 2010; Wallace et al, 2012] or two or more diseases [Cotsapas et al, 2011]. Dependence between genotypes at neighbouring SNPs, caused by linkage disequilibrium (LD), means that determination of colocalisation is not obvious, as there may exist distinct but neighbouring causal variants for each trait which are mutually associated.
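
To make the LD problem concrete, the following sketch simulates two neighbouring SNPs, A and B, whose alleles are correlated, with trait 1 caused only by A and trait 2 caused only by B. It is an illustration under stated assumptions rather than anything taken from the paper: the haplotype model, the function names (`ols_slope`, `simulate_ld`) and the parameter values are all hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def simulate_ld(n_ind=2000, maf=0.3, ld=0.8, beta=0.5, seed=1):
    """Marginal effect estimates for two traits with distinct causal SNPs in LD."""
    rng = random.Random(seed)

    def haplotype():
        a = int(rng.random() < maf)
        # With probability `ld` the B allele is copied from A; otherwise it is
        # drawn independently. This gives an allelic correlation of about `ld`.
        b = a if rng.random() < ld else int(rng.random() < maf)
        return a, b

    g_a, g_b = [], []
    for _ in range(n_ind):
        (a1, b1), (a2, b2) = haplotype(), haplotype()
        g_a.append(a1 + a2)  # genotype at SNP A, coded 0/1/2
        g_b.append(b1 + b2)  # genotype at SNP B, coded 0/1/2

    trait1 = [beta * g + rng.gauss(0.0, 1.0) for g in g_a]  # caused by A only
    trait2 = [beta * g + rng.gauss(0.0, 1.0) for g in g_b]  # caused by B only

    return {
        "trait1_on_A": ols_slope(g_a, trait1),
        "trait1_on_B": ols_slope(g_b, trait1),
        "trait2_on_A": ols_slope(g_a, trait2),
        "trait2_on_B": ols_slope(g_b, trait2),
    }


if __name__ == "__main__":
    for name, slope in simulate_ld().items():
        print(f"{name}: {slope:.3f}")
```

In this setup both SNPs should show a clear marginal association with both traits even though the causal variants differ, so similar-looking single-SNP signals are not by themselves evidence of a shared causal variant.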

