Estimating FST and kinship for arbitrary population structures.

Alejandro Ochoa,John D Storey

doi:10.1371/journal.pgen.1009241

Alejandro Ochoa, John D Storey

Open Access

PDF Available

https://doi.org/10.1371/journal.pgen.1009241

Copy DOI

Export

Save

Cite

Journal: PLOS Genetics	Publication Date: Jan 19, 2021
Citations: 67	License type: CC BY 4.0

Affiliation: Duke University, Princeton University

Abstract
Highlights/Summary
Full-Text PDF
Similar Papers

Abstract

Listen

FST and kinship are key parameters often estimated in modern population genetics studies in order to quantitatively characterize structure and relatedness. Kinship matrices have also become a fundamental quantity used in genome-wide association studies and heritability estimation. The most frequently-used estimators of FST and kinship are method-of-moments estimators whose accuracies depend strongly on the existence of simple underlying forms of structure, such as the independent subpopulations model of non-overlapping, independently evolving subpopulations. However, modern data sets have revealed that these simple models of structure likely do not hold in many populations, including humans. In this work, we analyze the behavior of these estimators in the presence of arbitrarily-complex population structures, which results in an improved estimation framework specifically designed for arbitrary population structures. After generalizing the definition of FST to arbitrary population structures and establishing a framework for assessing bias and consistency of genome-wide estimators, we calculate the accuracy of existing FST and kinship estimators under arbitrary population structures, characterizing biases and estimation challenges unobserved under their originally-assumed models of structure. We then present our new approach, which consistently estimates kinship and FST when the minimum kinship value in the dataset is estimated consistently. We illustrate our results using simulated genotypes from an admixture model, constructing a one-dimensional geographic scenario that departs nontrivially from the independent subpopulations model. Our simulations reveal the potential for severe biases in estimates of existing approaches that are overcome by our new framework. This work may significantly improve future analyses that rely on accurate kinship and FST estimates.

Highlights

In population genetics studies, one is often interested in characterizing structure, genetic differentiation, and relatedness among individuals
We find that existing FST estimators are downwardly biased, and that existing kinship matrix estimators have related biases that are on average downward and of similar magnitude but vary for every pair of individuals
These insights led us to a new estimation framework for kinship and FST that is practically unbiased for any population structure, as demonstrated by theory and simulations

Summary

Introduction

One is often interested in characterizing structure, genetic differentiation, and relatedness among individuals. FST is the probability that alleles drawn randomly from a subpopulation are “identical by descent” (IBD) relative to an ancestral population [1, 2]. The kinship coefficient is a measure of relatedness between individuals defined in terms of IBD probabilities, and it is closely related to FST, since the mean kinship of the parents in a subpopulation is the FST of the following generation [1]. There are many likelihood approaches that estimate FST and kinship, but these are limited by assuming independent subpopulations or Normal approximations for FST [3,4,5,6,7,8,9,10,11] or totally outbred individuals for kinship [12, 13]. Non-parametric approaches such as those based on the method of moments are considerably more flexible and computationally tractable [16], so they are the natural choice to study arbitrary population structures

Objectives

Methods

Results

Discussion

Conclusion

Full Text

Published Version (Free)

View/Download pdf

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

Estimating FST and kinship for arbitrary population structures.

Abstract

Highlights

Summary

Published Version (Free)

Talk to us

Similar Papers

More From: PLOS Genetics

Lead the way for us

Similar Papers

Author response: Limitations of principal components in quantitative genetic association models for human studies
Yiqi Yao ... Alejandro Ochoa
-
Yiqi Yao, et. al.Yiqi Yao ... Alejandro Ochoa
25 Apr 2023
25 Apr 2023

Decision letter: Limitations of principal components in quantitative genetic association models for human studies
Magnus Nordborg ... Detlef Weigel
-
Magnus Nordborg, et. al.Magnus Nordborg ... Detlef Weigel
04 Jul 2022
04 Jul 2022

Editor's evaluation: Limitations of principal components in quantitative genetic association models for human studies
Magnus Nordborg
-
Magnus NordborgMagnus Nordborg
04 Jul 2022
04 Jul 2022

Pervasive Downward Bias in Estimates of Liability-Scale Heritability in Genome-wide Association Study Meta-analysis: A Simple Solution
Andrew D Grotzinger ... Elliot M Tucker-Drob
Biological Psychiatry | VOL. 93
Andrew D Grotzinger, et. al.Andrew D Grotzinger ... Elliot M Tucker-Drob
08 Jun 2022
Biological Psychiatry | VOL. 93

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

Estimating FST and kinship for arbitrary population structures.

Abstract

Highlights

Summary

Published Version (Free)

Talk to us

Similar Papers

More From: PLOS Genetics