Abstract

One of the main caveats of association studies is their susceptibility to bias due to population stratification. Existing methods rely either on model-based approaches such as STRUCTURE and ADMIXTURE or on principal component analysis as implemented in EIGENSTRAT. Here we provide a novel visualization technique and describe the problem of population substructure from a graph-theoretical point of view. On the basis of a predefined pairwise similarity measure, we group the sequenced individuals into triads, which depict the relational structure. We then merge the triads into a network and apply community detection algorithms to identify homogeneous subgroups, or communities, which can then be incorporated as covariates into logistic regression. We apply our method to populations from different continents in the 1000 Genomes Project and evaluate the type 1 error based on the empirical p-values. The application to 1000 Genomes data suggests that the network approach provides a very fine resolution of the underlying ancestral population structure. In addition, we show in simulations that, in the presence of discrete population structures, our approach maintains the type 1 error more precisely than existing approaches.
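The pipeline outlined in the abstract (pairwise similarity, triads, merged network, community detection) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the similarity measure is a simple identity-by-state score on genotypes coded 0/1/2, each individual's triad is formed with its two most similar individuals, and a plain label-propagation routine stands in for whichever community detection algorithms the authors actually used. The function names `ibs_similarity`, `build_triad_network`, and `label_propagation` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def ibs_similarity(g1, g2):
    """Identity-by-state similarity in [0, 1] for genotypes coded 0/1/2.

    Assumption: this stands in for the paper's "predefined pairwise
    similarity measure", which the abstract does not specify.
    """
    return 1.0 - sum(abs(a - b) for a, b in zip(g1, g2)) / (2.0 * len(g1))


def build_triad_network(genotypes):
    """Merge one triad per individual into an undirected network.

    Assumption: a triad is an individual plus its two most similar
    individuals. Requires at least three individuals.
    """
    n = len(genotypes)
    sim = {
        (i, j): ibs_similarity(genotypes[i], genotypes[j])
        for i, j in combinations(range(n), 2)
    }
    adjacency = defaultdict(set)
    for i in range(n):
        # Rank the other individuals by decreasing similarity; ties by index.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (-sim[(min(i, j), max(i, j))], j),
        )
        triad = (i, others[0], others[1])
        # Every pair inside the triad becomes an edge of the network.
        for a, b in combinations(triad, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def label_propagation(adjacency, max_rounds=100):
    """Deterministic label propagation (a stand-in community detector).

    Each node adopts the most frequent label among its neighbours; a node
    keeps its current label on ties, otherwise the smallest label wins.
    """
    labels = {node: node for node in adjacency}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adjacency):
            counts = Counter(labels[nb] for nb in adjacency[node])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            if labels[node] in candidates:
                continue
            labels[node] = min(candidates)
            changed = True
        if not changed:
            break
    return labels


# Toy data: two clearly separated groups of three individuals, six SNPs each.
genotypes = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [2, 2, 2, 2, 2, 2],
    [2, 2, 2, 2, 2, 1],
    [2, 2, 2, 2, 1, 2],
]
labels = label_propagation(build_triad_network(genotypes))
```

In this sketch, `labels` maps each individual to a community identifier. Those identifiers could then be encoded as indicator variables and added as covariates to a logistic regression, which is the use the abstract describes.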

Highlights

  • Genome-wide association studies (GWAS) have proven to be a powerful analytical tool in association mapping for identifying common variants that contribute to the heritability of complex diseases [1]

  • We have proposed a new approach for population structure inference, which is based on network methodology

Summary

Introduction

Genome-wide association studies (GWAS) have proven to be a powerful analytical tool in association mapping for identifying common variants (i.e. variants with minor allele frequencies of more than 1%) that contribute to the heritability of complex diseases [1]. Although a large number of GWAS for various complex diseases have been conducted to date, for most complex diseases the proportion of the estimated variance that can be explained by common variation is rather low [4,5,6]. Besides alternative factors that are likely to contribute to the estimated heritability of a trait like gene-gene interactions, variants with low frequencies,

PLOS ONE | DOI:10.1371/journal.pone.0130708 June 22, 2015
