Probabilistic models of genetic variation in structured populations applied to global human studies.

Wei Hao, John D Storey, Minsun Song

doi:10.1093/bioinformatics/btv641

Open Access

https://doi.org/10.1093/bioinformatics/btv641

Abstract

Motivation: Modern population genetics studies typically involve genome-wide genotyping of individuals from a diverse network of ancestries. An important problem is how to formulate and estimate probabilistic models of observed genotypes that account for complex population structure. The most prominent work on this problem has focused on estimating a model of admixture proportions of ancestral populations for each individual. Here, we instead focus on modeling variation of the genotypes without requiring a higher-level admixture interpretation.

Results: We formulate two general probabilistic models, and we propose computationally efficient algorithms to estimate them. First, we show how principal component analysis can be utilized to estimate a general model that includes the well-known Pritchard–Stephens–Donnelly admixture model as a special case. Noting some drawbacks of this approach, we introduce a new ‘logistic factor analysis’ framework that seeks to directly model the logit transformation of probabilities underlying observed genotypes in terms of latent variables that capture population structure. We demonstrate these advances on data from the Human Genome Diversity Panel and 1000 Genomes Project, where we are able to identify SNPs that are highly differentiated with respect to structure while making minimal modeling assumptions.

Availability and Implementation: A Bioconductor R package called lfa is available at http://www.bioconductor.org/packages/release/bioc/html/lfa.html.

Contact: jstorey@princeton.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
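To make the 'logistic factor analysis' idea concrete, the following is a minimal simulation sketch, not the authors' estimation algorithm and not the lfa package API. It assumes that each genotype is a count of 0, 1 or 2 alleles drawn as Binomial(2, p), and that the logit of p is a linear combination of a few latent factors per individual. The function name, the matrix names A and H, the intercept row, and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_lfa_genotypes(n_snps=200, n_individuals=50, d=3, seed=0):
    """Simulate genotypes whose allele-frequency logits are linear in latent factors.

    Illustrative assumption: genotype x_ij ~ Binomial(2, p_ij) with
    logit(p_ij) = sum_k A[i, k] * H[k, j].
    """
    rng = np.random.default_rng(seed)

    # Hypothetical latent factors, one column per individual.
    # The last row is fixed to 1 so that each SNP gets its own intercept.
    H = np.vstack([
        rng.normal(size=(d - 1, n_individuals)),
        np.ones((1, n_individuals)),
    ])

    # Hypothetical SNP-specific loadings, one row per SNP.
    A = rng.normal(scale=0.5, size=(n_snps, d))

    # The inverse logit maps the linear predictor to probabilities in (0, 1).
    logits = A @ H
    F = 1.0 / (1.0 + np.exp(-logits))

    # Draw genotypes as allele counts in {0, 1, 2}.
    X = rng.binomial(2, F)
    return X, F, H, A


if __name__ == "__main__":
    X, F, H, A = simulate_lfa_genotypes()
    print(X.shape, F.min(), F.max())
```

Working on the logit scale keeps every implied probability strictly between 0 and 1 for any real-valued loadings and factors, which a linear model fitted directly on the probability scale does not guarantee.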

Journal: Bioinformatics
Publication Date: Nov 6, 2015
Citations: 68
License type: CC BY 4.0

Similar Papers

Author response: Limitations of principal components in quantitative genetic association models for human studies
Yiqi Yao ... Alejandro Ochoa
25 Apr 2023

Decision letter: Limitations of principal components in quantitative genetic association models for human studies
Magnus Nordborg ... Detlef Weigel
04 Jul 2022

Editor's evaluation: Limitations of principal components in quantitative genetic association models for human studies
Magnus Nordborg
04 Jul 2022

Turkish Population Structure and Genetic Ancestry Reveal Relatedness among Eurasian Populations
Uğur Hodoğlugil ... Robert W Mahley
Annals of Human Genetics | VOL. 76
15 Feb 2012
