Abstract

Although genome-wide association studies (GWASs) have identified numerous loci associated with complex traits, imprecise modeling of the genetic relatedness within study samples may cause substantial inflation of test statistics and possibly spurious associations. Variance component approaches, such as efficient mixed-model association (EMMA), can correct for a wide range of sample structures by explicitly accounting for pairwise relatedness between individuals, using high-density markers to model the phenotype distribution; but such approaches are computationally impractical. We report here a variance component approach implemented in publicly available software, EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours. We apply this method to two human GWAS data sets, performing association analysis for ten quantitative traits from the Northern Finland Birth Cohort and seven common diseases from the Wellcome Trust Case Control Consortium. We find that EMMAX outperforms both principal component analysis and genomic control in correcting for sample structure.

Highlights

  • The relative impact of insertion and deletion variants and single‐nucleotide polymorphisms (SNPs) on human complex disease risk is unclear

  • Using association results for 120 traits from the SardiNIA cohort of up to 5,949 individuals, we found no evidence that common indels are more likely than SNPs to be potentially causal for associations with complex traits

  • Similarly, looking only at the coding sequence, Montgomery et al. (2013) did not find direct evidence that potentially causal classes of coding indels are enriched for associations compared with known disease‐associated SNPs present in the genome‐wide association studies (GWAS) Catalog

Summary

| INTRODUCTION

The relative impact of insertion and deletion variants (indels) and single‐nucleotide polymorphisms (SNPs) on human complex disease risk is unclear. SNPs in coding and noncoding regions have been implicated in both Mendelian and complex disease, and the same is true for indels. An insertion or deletion that is not in‐frame (that is, whose length is not a multiple of three base pairs) alters the reading frame, resulting in a new set of amino acids and a protein product that differs from the wild type. Even in‐frame indels (insertions or deletions of three base pairs or a multiple of three) in the coding sequence can result in altered proteins. The cumulative contribution of indels compared with SNPs to disease risk has not been thoroughly investigated. Although this work lacks insight from rare variation, it begins to move toward a better understanding of which type of polymorphism is more likely to impact human health, and toward quantifying the gain from routinely including indels in genome‐wide association studies (GWAS).
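
The distinction between in‐frame and frameshift indels described above comes down to whether the length change is divisible by three. The following is a minimal sketch of that rule, not code from the study: the function name `classify_indel` and the VCF‐style reference/alternate allele strings are assumptions made for illustration, and the sketch ignores real‐world complications such as indels that span splice sites or introduce stop codons.

```python
def classify_indel(ref: str, alt: str) -> str:
    """Classify a coding variant by its effect on the reading frame.

    `ref` and `alt` are hypothetical VCF-style allele strings, e.g.
    ref="A", alt="ATG" for a two-base insertion after the anchor base.
    """
    length_change = abs(len(alt) - len(ref))
    if length_change == 0:
        # Same length: a substitution (e.g. a SNP), not an indel.
        return "not an indel"
    # A length change that is a multiple of three keeps the codon frame.
    return "in-frame" if length_change % 3 == 0 else "frameshift"


if __name__ == "__main__":
    examples = [("A", "ATG"), ("ACGT", "A"), ("A", "AT"), ("A", "C")]
    for ref, alt in examples:
        print(f"{ref}>{alt}: {classify_indel(ref, alt)}")
```

Note that "in‐frame" here only means the downstream codons are preserved; as the text points out, an in‐frame indel can still add or remove amino acids and so alter the protein.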
