Abstract

There is great interest in understanding the impact of rare variants on human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ~10% of the variant sites are observed to be multi-allelic, and many multi-allelic variants have been shown to be functional and disease-relevant. Proper analysis of multi-allelic variants is therefore critical to the success of a sequencing study, but existing methods do not properly handle them and can produce highly misleading association results. We discuss practical issues and methods for encoding multi-allelic sites, conducting single-variant and gene-level association analyses, and performing meta-analysis for multi-allelic variants. We evaluated these methods through extensive simulations and a large meta-analysis of ~18,000 samples for the cigarettes-per-day phenotype. Our joint modeling approach provided unbiased estimates of genetic effects, greatly improved the power of single-variant association tests among methods that can properly estimate allele effects, and enhanced gene-level tests over existing approaches. Software packages implementing these methods are available online.

Highlights

  • Rare genetic variants are enriched with functional alleles that play an important role in a variety of complex human diseases, including hematological disorders [1], coronary artery disease [2,3,4], and others

  • Simulations indicated that jointly modeling the allelic effects of multiple alternative alleles leads to more powerful single-variant association tests (Table 2)

  • Multi-allelic variants have been largely ignored in the GWAS era due to the extensive use of common bi-allelic single-nucleotide polymorphisms (SNPs) as markers to tag regions that harbor causal variants

Introduction

Rare genetic variants are enriched with functional alleles that play an important role in a variety of complex human diseases, including hematological disorders [1], coronary artery disease [2,3,4], and others. The discovery of such rare-variant associations has contributed significantly to new mechanistic insights and the identification of novel therapeutic targets [4,5]. Despite the importance of multi-allelic variants, most methods developed so far for sequence-based association analysis consider only bi-allelic variants and do not properly handle multi-allelic sites [7,8].
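
The difference between a bi-allelic and a multi-allelic encoding can be illustrated with a minimal sketch. This is not the authors' software: the function names, the toy genotypes, and the use of plain ordinary least squares are assumptions made for illustration only. The sketch codes a site with two alternative alleles as one allele-count column per alternative allele and estimates both allele effects in a single joint regression.

```python
import numpy as np


def encode_multiallelic(genotypes, n_alt):
    """Return an (n_samples, n_alt) matrix of alternative-allele counts.

    genotypes: sequence of (a1, a2) allele-index pairs, where 0 is the
    reference allele and 1..n_alt index the alternative alleles.
    (Hypothetical helper, not taken from the paper's software.)
    """
    dosage = np.zeros((len(genotypes), n_alt))
    for i, pair in enumerate(genotypes):
        for allele in pair:
            if allele > 0:
                dosage[i, allele - 1] += 1
    return dosage


def joint_allele_effects(dosage, phenotype):
    """Fit intercept + all alternative alleles together by least squares.

    Returns one estimated effect per alternative allele.
    (Illustrative stand-in for a joint model; assumed, not the paper's.)
    """
    design = np.column_stack([np.ones(len(phenotype)), dosage])
    beta, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    return beta[1:]


# Toy data: eight samples at a site with two alternative alleles.
genotypes = [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (0, 0), (2, 2), (0, 1)]
dosage = encode_multiallelic(genotypes, n_alt=2)

# Noise-free toy phenotype with assumed effects 0.5 and -0.3.
phenotype = 1.0 + 0.5 * dosage[:, 0] - 0.3 * dosage[:, 1]
effects = joint_allele_effects(dosage, phenotype)
```

Because the toy phenotype is noise-free, `effects` recovers the two assumed allele effects up to floating-point error. Collapsing both alternative alleles into a single "non-reference" count would instead force one shared effect onto alleles that here act in opposite directions.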

