Abstract

Genome-wide association (GWA) studies have become a standard approach for discovering and validating genomic polymorphisms putatively associated with phenotypes of interest. Accounting for population structure in GWA studies is critical to obtain unbiased parameter estimates and to control Type I error. One common approach is to include several principal components (PCs) derived from the entire autosomal dataset, which reflect population structure. However, choosing which components to include is subjective and generally not conclusive. We examined how well phylogenetic signal from mitochondrial DNA (mtDNA) and chromosome Y (chr:Y) markers agrees with principal components based on autosomal markers, to determine whether mtDNA and chr:Y phylogenetic data can help guide principal component selection. Using HapMap and other original data from individuals of multiple ancestries, we examined the relationships of mtDNA and chr:Y phylogenetic signal with the autosomal PCA using best subset logistic regression. We show that while the two approaches agree at times, this agreement is independent of component order and is not fully captured by the eigenvalues. Additionally, we use simulations to demonstrate that our approach leads to a slightly reduced Type I error rate compared to the standard approach. This work provides preliminary evidence to support the theoretical concept that mtDNA and chr:Y data can be informative in locating the PCs most associated with the evolutionary history of the populations being studied, although the utility of such information will depend on the specific situation.

Highlights

  • Genome-wide association (GWA) studies have become common practice in the effort to elucidate relationships of genetic markers to many human diseases and phenotypes

  • We examined whether a phylogeny based on mitochondrial DNA (mtDNA) and chr:Y markers could contribute toward correction of population stratification

  • We sought to determine (a) whether a phylogenetic analysis of mtDNA and chr:Y data could help further define the population structure based solely on principal component analysis (PCA) of autosomal SNPs, and (b) which principal components should be included to account for population structure
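The idea of using best subset logistic regression to relate a phylogenetic label to autosomal PCs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the outcome is a binary haplogroup indicator, the PC scores are simulated, candidate subsets are ranked by AIC, and the helper names (`simulate`, `fit_logistic`, `best_subset`) are hypothetical. The simulation deliberately ties the label to the third PC to show that the most informative component need not be the first one.

```python
import itertools
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n=150, n_pcs=5, informative=2, seed=0):
    """Simulated PC scores and a binary label driven by one PC (assumption)."""
    rng = random.Random(seed)
    pcs = [[rng.gauss(0.0, 1.0) for _ in range(n_pcs)] for _ in range(n)]
    labels = [1 if rng.random() < sigmoid(3.0 * row[informative]) else 0
              for row in pcs]
    return pcs, labels


def fit_logistic(rows, labels, steps=300, lr=0.5):
    """Fit logistic regression by gradient ascent; return the log-likelihood."""
    n, p = len(rows), len(rows[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for x, y in zip(rows, labels):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
            err = y - sigmoid(z)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    loglik = 0.0
    for x, y in zip(rows, labels):
        z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
        # y*z - log(1 + exp(z)), written in an overflow-safe form
        loglik += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return loglik


def best_subset(pcs, labels, max_size=2):
    """Try every PC subset up to max_size; return (subset, AIC) with lowest AIC."""
    n_pcs = len(pcs[0])
    best = None
    for k in range(1, max_size + 1):
        for subset in itertools.combinations(range(n_pcs), k):
            rows = [[row[j] for j in subset] for row in pcs]
            aic = 2 * (k + 1) - 2 * fit_logistic(rows, labels)
            if best is None or aic < best[0]:
                best = (aic, subset)
    return best[1], best[0]


if __name__ == "__main__":
    pcs, labels = simulate()
    subset, aic = best_subset(pcs, labels)
    print("selected PCs (0-based):", subset, "AIC:", round(aic, 1))
```

With real data, the simulated scores would be replaced by PC scores from the autosomal genotypes and the label by an observed mtDNA or chr:Y haplogroup. Exhaustive search grows combinatorially with the number of candidate PCs, so `max_size` is kept small here.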


Introduction

Genome-wide association (GWA) studies have become common practice in the effort to elucidate relationships of genetic markers to many human diseases and phenotypes. There has been discussion of whether population stratification information should be included as random or fixed effects in subsequent analyses (Price et al., 2010), but either approach relies on the components accounting for the population sub-structure. Some statistical approaches, such as the Bayesian LASSO (De Los Campos et al., 2009), which include all the markers simultaneously, implicitly incorporate information about population structure in a regularized manner similar to the PCA approach. The literature is quite rich in extensions

