The impact of genetic structure on sequencing analysis

Sneha Jadhav, Qing Lu, Xiaoran Tong, Olga A Vsevolozhskaya

doi:10.1186/s12919-016-0025-x

Abstract

Background: Genome-wide association studies have made substantial progress in identifying common variants associated with human diseases. Despite such success, a large portion of heritability remains unexplained. Evolutionary theory and empirical studies suggest that rare mutations could play an important role in human diseases, which motivates comprehensive investigation of rare variants in sequencing studies. To explore the association of rare variants with human diseases, many statistical approaches have been developed with different ways of modeling genetic structure (ie, linkage disequilibrium). Nevertheless, the appropriate strategy to model genetic structure of sequencing data and its effect on association analysis have not been well studied.

Methods: We investigate 3 statistical approaches that use 3 different strategies to model the genetic structure of sequencing data. We proceed by comparing a burden test that assumes independence among sequencing variants, a burden test that considers pairwise linkage disequilibrium (LD), and a functional analysis of variance (FANOVA) test that models genetic data through fitting continuous curves on individuals’ genotypes.

Results: Through simulations, we find that FANOVA attains better or comparable performance to the 2 burden tests. Overall, the burden test that considers pairwise LD has comparable performance to the burden test that assumes independence between sequencing variants. However, for 1 gene, where the disease-associated variant is located in an LD block, we find that considering pairwise LD could improve the test’s performance.

Conclusions: The structure of sequencing variants is complex in nature and its patterns vary across the whole genome. In certain cases (eg, a disease-susceptibility variant is in an LD block), ignoring the genetic structure in the association analysis could result in suboptimal performance. Through this study, we show that a functional-based method is promising for modeling the underlying genetic structure of sequencing data, which could lead to better performance.

Highlights

Genome-wide association studies have made substantial progress in identifying common variants associated with human diseases
The emerging sequencing data facilitates the study of massive amounts of single nucleotide variants (SNVs), including both rare and common variants, for their potential role in complex human diseases
Various statistical methods have been proposed to group SNVs with or without considering the underlying genetic structure

Summary

Introduction

Genome-wide association studies have made substantial progress in identifying common variants associated with human diseases. Despite such success, a large portion of heritability remains unexplained. To explore the association of rare variants with human diseases, many statistical approaches have been developed with different ways of modeling genetic structure (ie, linkage disequilibrium [LD]). The emerging sequencing data facilitates the study of massive amounts of single nucleotide variants (SNVs), including both rare and common variants, for their potential role in complex human diseases. Although these studies hold great promise for identifying new disease-susceptibility variants, the extremely large number of SNVs brings analytic challenges. As an initial step to investigate how the modeling of genetic structure affects association analysis, we chose 3 tests with different ways of modeling LD between SNVs: (a) a weighted burden test assuming independence among SNVs (BT) [3]; (b) a weighted burden test considering pairwise LD (BTCOV) [4]; and (c) a functional analysis of variance (FANOVA) [5] test that considers LD among nearby loci and models the genotype profile of an individual as a continuous function.
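To make the contrast between the two burden tests concrete, the following is a minimal Python sketch. It is not the implementation used in the cited BT or BTCOV papers. It assumes a generic score-type burden statistic for a quantitative or 0/1 phenotype, and the only difference between the two variants is whether the null variance uses the full genotype cross-product matrix (pairwise LD) or only its diagonal (independence). The function name `burden_z`, its parameters, and the simulated data are hypothetical and chosen for illustration.

```python
import numpy as np


def burden_z(genotypes, phenotype, weights, use_ld=True):
    """Score-type weighted burden statistic (illustrative sketch only).

    genotypes : (n_individuals, n_variants) minor-allele counts
    phenotype : (n_individuals,) trait values or 0/1 disease status
    weights   : (n_variants,) per-variant weights (assumed given)
    use_ld    : True  -> variance uses the full cross-product matrix
                False -> variance uses only its diagonal (independence)
    """
    G = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    w = np.asarray(weights, dtype=float)

    Gc = G - G.mean(axis=0)      # center each variant
    resid = y - y.mean()         # phenotype residuals under the null

    U = w @ (Gc.T @ resid)       # weighted sum of per-variant scores
    V = Gc.T @ Gc                # off-diagonal entries reflect pairwise LD
    if not use_ld:
        V = np.diag(np.diag(V))  # drop LD terms: treat variants as independent

    var_u = resid.var(ddof=1) * (w @ V @ w)
    return U / np.sqrt(var_u)


if __name__ == "__main__":
    # Hypothetical data: 200 individuals, 5 variants, variants 0 and 1 in LD.
    rng = np.random.default_rng(0)
    G = rng.binomial(2, 0.2, size=(200, 5)).astype(float)
    G[:, 1] = G[:, 0]
    y = 0.5 * G[:, 0] + rng.normal(size=200)
    w = np.ones(5)
    print("independence:", burden_z(G, y, w, use_ld=False))
    print("pairwise LD: ", burden_z(G, y, w, use_ld=True))
```

In this sketch, positively correlated variants with same-sign weights inflate the true variance of the score, so the independence version understates the variance and yields a larger absolute statistic than the LD-aware version. When variants are uncorrelated in the sample, the two versions coincide.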

Journal: BMC Proceedings
Publication Date: Oct 1, 2016
Citations: 2
License type: cc-by
