Abstract

Family data and rare variants are two key features of whole genome sequencing analysis in the search for the missing heritability of common human diseases. Recently, Zhu and Xiong proposed the generalized T² tests, which combine rare variant analysis and family data analysis. In a similar fashion, we developed extended T² tests for longitudinal whole genome sequencing data in family-based association studies. The new methods simultaneously incorporate three sources of correlation: linkage disequilibrium, pedigree structure, and the repeated measures of covariates. We assess and compare these methods using the simulated data from Genetic Analysis Workshop 18. We show that, in general, the extended T² tests incorporating longitudinal repeated measures have higher power than the single-time-point T² tests in detecting hypertension-associated genome segments.

Highlights

  • Compared with traditional genome-wide association studies (GWAS), whole genome sequencing (WGS) is a more efficient way of finding the missing heritability of diseases [1]

  • While GWAS are mostly based on microarray genotyping, which can discover only common genetic variants, WGS is able to reveal rare and structural variants, which are crucial factors behind disease phenotypes [2]

  • We further extended the methodology of the T² tests to address these limits. By applying these methods to an analysis of the Genetic Analysis Workshop 18 (GAW18) simulation data, we showed that the asymptotic null distributions of Zhu and Xiong [8] are problematic in controlling the type I error rate, and that our extended methods are likely more powerful for longitudinal data


Summary

Introduction

Compared with traditional genome-wide association studies (GWAS), whole genome sequencing (WGS) is a more efficient way of finding the missing heritability of diseases [1]. While GWAS are mostly based on microarray genotyping, which can discover only common genetic variants, WGS is able to reveal rare and structural variants, which are crucial factors behind disease phenotypes [2]. Most of the recent discoveries from sequencing studies were based on the Mendelian trait model [3]. Genetic association studies based on the complex trait model are challenging because of limited sample sizes as well as the new properties of sequencing data. Because of its relatively high cost, WGS tends to be applied to families of patients, in which rare causal variants are likely to be enriched through cotransmission with the disease [5]. The pedigree structure also allows statistical imputation of genotypes at no experimental cost, which potentially increases statistical power [6,7].
