Abstract

In many species, spatial genetic variation displays patterns of “isolation-by-distance.” Characterized by locally correlated allele frequencies, these patterns are known to create periodic shapes in geographic maps of principal components which confound signatures of specific migration events and influence interpretations of principal component analyses (PCA). In this study, we introduce models combining probabilistic PCA and kriging models to infer population genetic structure from genetic data while correcting for effects generated by spatial autocorrelation. The corresponding algorithms are based on singular value decomposition and low rank approximation of the genotypic data. As their complexity is close to that of PCA, these algorithms scale with the dimensions of the data. To illustrate the utility of these new models, we simulated isolation-by-distance patterns and broad-scale geographic variation using spatial coalescent models. Our methods remove the horseshoe patterns usually observed in PC maps and simplify interpretations of spatial genetic variation. We demonstrate our approach by analyzing single nucleotide polymorphism data from the Human Genome Diversity Panel, and provide comparisons with other recently introduced methods.

Highlights

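  • The concept of “isolation-by-distance” (IBD) was introduced by S. Wright to describe the accumulation of local genetic differences under spatially restricted dispersal (Wright, 1943)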

  • We report that the new spatial factor analysis (spFA) method removed the horseshoe effect observed in spatially structured data, whereas principal component analysis (PCA), spatial PCA (sPCA), and Sparse Factor Analysis (SFA) did not

  • We evaluated the effects of IBD patterns on inference of population genetic structure using four statistical methods: Principal Component Analysis (PCA; Jolliffe, 1986; Patterson et al., 2006), spatial PCA (sPCA), Sparse Factor Analysis (SFA; Engelhardt and Stephens, 2010), and a new method called spatial Factor Analysis (spFA)

Introduction

The concept of “isolation-by-distance” (IBD) was introduced by S. Wright to describe the accumulation of local genetic differences under spatially restricted dispersal (Wright, 1943). In species that are continuously distributed in geographic space and disperse over short distances, the theory predicts that genetic differentiation will increase with geographic distance (Malécot, 1948; Kimura and Weiss, 1964). IBD can be described by spatial autocorrelation, a measure of the degree of dependency among observations in a geographic space. Although studying IBD patterns can lead to useful estimates of gene dispersal (Rousset, 1997), spatial autocorrelation derived from IBD often presents a problem for population genetic analyses. The presence of spatial autocorrelation patterns can increase the rate of false positive tests for hierarchical population structure or for the detection of loci under selection (Meirmans, 2012).
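One way to see how spatial autocorrelation alone can shape principal components is a small numerical sketch. It is an illustration only, not the method of this study. It assumes sampling sites evenly spaced along a one-dimensional transect and an allele-frequency covariance that decays exponentially with distance. The function name `ibd_principal_components` and the parameters `n_sites` and `corr_range` are hypothetical.

```python
import numpy as np

def ibd_principal_components(n_sites=50, corr_range=1.0, n_components=2):
    """Leading principal components of an idealized IBD covariance (assumed model)."""
    # Sampling sites evenly spaced along a one-dimensional transect [0, 1].
    x = np.linspace(0.0, 1.0, n_sites)
    # Assumption: covariance between sites decays exponentially with distance.
    dist = np.abs(x[:, None] - x[None, :])
    cov = np.exp(-dist / corr_range)
    # PCA operates on centered data, so double-center the covariance matrix.
    center = np.eye(n_sites) - np.ones((n_sites, n_sites)) / n_sites
    cov_centered = center @ cov @ center
    # eigh returns eigenvalues in ascending order; keep the largest ones first.
    eigvals, eigvecs = np.linalg.eigh(cov_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    return x, eigvals[order], eigvecs[:, order]

x, variances, pcs = ibd_principal_components()
pc1, pc2 = pcs[:, 0], pcs[:, 1]

# Correlation of each component with position along the transect.
corr_pc1_position = np.corrcoef(pc1, x)[0, 1]
corr_pc2_position = np.corrcoef(pc2, x)[0, 1]
# Correlation of PC2 with squared distance from the transect midpoint.
corr_pc2_arch = np.corrcoef(pc2, (x - x.mean()) ** 2)[0, 1]

print("PC1 vs position:", corr_pc1_position)
print("PC2 vs position:", corr_pc2_position)
print("PC2 vs squared distance from midpoint:", corr_pc2_arch)
```

Under these assumptions, the first component should follow position along the transect, while the second should be symmetric about the midpoint and so trace an arch when plotted against the first. No migration event or discrete population structure is built into the covariance, so the arch arises from the distance decay alone.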
