Abstract

Rare variants may help to explain some of the missing heritability of complex diseases. Technological advances in next-generation sequencing give us the opportunity to test this hypothesis. We propose two new methods (one for case-control studies and one for family-based studies) that combine aggregated rare variants with the common variants located within a region through principal components analysis and allow for covariate adjustment. We analyzed 200 replicates consisting of 209 case subjects and 488 control subjects and compared the results to weight-based and step-up aggregation methods. The principal components and collapsing method showed an association between the gene FLT1 and the quantitative trait Q1 (P < 10^-30) in a fraction of the computation time of the other methods. The proposed family-based test gave inconclusive results. The two methods provide a fast way to analyze rare and common variants simultaneously at the gene level while adjusting for covariates. However, further evaluation of the statistical efficiency of this approach is warranted.

Highlights

  • With recent technological developments in human genome sequencing, enormous numbers of rare single nucleotide variants can be detected

  • Genetic variants: Of the 24,487 variants detected through sequencing of the mini-exome, 21,355 (87%) had a minor allele frequency (MAF) less than 5% and 18,131 (74%) had a MAF less than 1%

  • Case-control design: Using the principal components and collapsing (PCC) case-control method, we found an association of the principal component of gene FLT1 with the quantitative trait Q1 (Figure 1a) and with the dichotomous disease phenotype (Figure 2a)

Introduction

With recent technological developments in human genome sequencing, enormous numbers of rare single nucleotide variants can be detected. This ability to measure rare variants allows researchers to investigate the multiple rare variant/common disease model, which may help to explain part of the heritability left missing by studies of more common variants. The naive approach of testing each variant independently has little power unless sample sizes are large [2]. Several approaches to combining rare variants have been proposed [2,3,4,5,6,7,8,9], but few address the issue of combining rare variants with common variants. When the number of common variants within a region is large, the logistic regression model may be unstable and may not fit well.
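
The general idea of collapsing rare variants, combining them with common variants through principal components, and adjusting for covariates can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 5% MAF cutoff for "rare", the carrier-indicator collapsing rule, the use of only the first principal component, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def pcc_first_component(genotypes, maf_threshold=0.05):
    """Collapse rare variants, then return first principal component scores.

    genotypes: (subjects x variants) matrix of minor-allele counts (0, 1, 2).
    maf_threshold: assumed cutoff separating rare from common variants.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    maf = genotypes.mean(axis=0) / 2.0
    maf = np.minimum(maf, 1.0 - maf)
    rare = maf < maf_threshold
    # Assumed collapsing rule: 1 if the subject carries any rare allele.
    collapsed = (genotypes[:, rare].sum(axis=1) > 0).astype(float)
    features = np.column_stack([genotypes[:, ~rare], collapsed])
    # Standardize columns; constant columns are left at zero.
    centered = features - features.mean(axis=0)
    sd = centered.std(axis=0)
    sd[sd == 0] = 1.0
    z = centered / sd
    # Scores on the first principal component, obtained from the SVD.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, 0] * s[0]

def adjusted_effect(trait, pc, covariates):
    """Least-squares fit of trait on intercept, PC, and covariates.

    Returns the PC coefficient and its t-statistic.
    """
    trait = np.asarray(trait, dtype=float)
    n = trait.shape[0]
    X = np.column_stack([np.ones(n), pc, covariates])
    beta, *_ = np.linalg.lstsq(X, trait, rcond=None)
    resid = trait - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], beta[1] / np.sqrt(cov_beta[1, 1])

# Simulated data only; allele frequencies and effect sizes are hypothetical.
rng = np.random.default_rng(0)
n_subjects = 300
mafs = np.array([0.30, 0.20, 0.10, 0.01, 0.005, 0.02])
genotypes = rng.binomial(2, mafs, size=(n_subjects, mafs.size))
covariates = rng.normal(size=(n_subjects, 2))  # stand-ins for, e.g., age and sex
pc1 = pcc_first_component(genotypes)
trait = 0.5 * pc1 + covariates @ np.array([0.3, -0.2]) + rng.normal(size=n_subjects)
beta, t_stat = adjusted_effect(trait, pc1, covariates)
print(f"PC1 coefficient = {beta:.3f}, t = {t_stat:.2f}")
```

Regressing on one or a few component scores instead of every common variant keeps the number of model parameters small, which is one way to sidestep the instability described above. A dichotomous phenotype would call for a logistic rather than a linear model in the second step.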
