Abstract

The 1000 Genomes Project provides a unique source of whole genome sequencing data for studies of human population genetics and human diseases. The last release of this project includes more than 2,500 sequenced individuals from 26 populations. Although relationships among individuals have been investigated in some of the populations, inbreeding has never been studied. In this article, we estimated the genomic inbreeding coefficient of each individual and found an unexpectedly high level of inbreeding in 1000 Genomes data: nearly a quarter of the individuals were inbred, and around 4% of them had inbreeding coefficients similar to or greater than those expected for first-cousin offspring. Inbred individuals were found in each of the 26 populations, with some populations showing proportions of inbred individuals above 50%. We also detected 227 previously unreported pairs of close relatives (up to and including first cousins). We therefore propose subsets of unrelated and outbred individuals for use by the scientific community. In addition, because admixed populations are present in the 1000 Genomes Project, we performed simulations to study the robustness of inbreeding coefficient estimates in the presence of admixture. We found that our multi-point approach (FSuite) was quite robust to admixture, unlike single-point methods (PLINK).

Highlights

  • The goal of this article is to describe the inbreeding patterns in the 26 populations of the final phase of the 1000 Genomes Project panel, using the genotype data obtained from sequencing

  • Different approaches have been developed to estimate the inbreeding coefficient from the genotype data of an individual without known genealogy; they can be classified into two main categories, single-point and multi-point methods

  • We have shown that multi-point approaches provide reliable estimates of the genomic inbreeding coefficient f even when there are some admixed individuals in the studied population

Summary

Introduction

The goal of this article is to describe the inbreeding patterns in the 26 populations of the final phase of the 1000 Genomes Project (TGP) panel, using the genotype data obtained from sequencing. Some of the TGP populations (the AMR panel and the ASW population) are known to be admixed, i.e. to have ancestry from different populations. Single-point methods for estimating kinship and inbreeding coefficients have been shown to be biased in the presence of admixture[12,13]; whether multi-point methods are similarly affected is less clear[14]. For this reason, before estimating inbreeding in the TGP populations, we used simulations to investigate the accuracy of FSuite estimates of f in admixed individuals.
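To make the notion of a single-point estimate concrete, here is a minimal Python sketch of a method-of-moments inbreeding estimator, F = (O - E) / (N - E), where O is the observed number of homozygous markers, E the number expected under Hardy-Weinberg proportions, and N the number of informative markers. It follows the general form of the statistic reported by PLINK's `--het` option but omits PLINK's small-sample corrections; it is not the authors' pipeline and not the multi-point model used by FSuite. The function name `single_point_f`, its arguments, and the constant `FIRST_COUSIN_F` are assumptions introduced for illustration.

```python
from typing import Sequence

# Expected inbreeding coefficient for the offspring of first cousins (1/16).
FIRST_COUSIN_F = 1.0 / 16.0


def single_point_f(genotypes: Sequence[int], alt_freqs: Sequence[float]) -> float:
    """Simplified single-point estimate F = (O - E) / (N - E).

    genotypes: alternate-allele counts (0, 1 or 2), one per marker.
    alt_freqs: assumed population frequency of the alternate allele per marker.
    Markers are treated independently; linkage between them is ignored.
    """
    if len(genotypes) != len(alt_freqs):
        raise ValueError("genotypes and alt_freqs must have the same length")

    observed_hom = 0    # O: markers at which the individual is homozygous
    expected_hom = 0.0  # E: expected homozygous count under Hardy-Weinberg
    n_markers = 0       # N: informative (polymorphic) markers

    for g, p in zip(genotypes, alt_freqs):
        if not 0.0 < p < 1.0:
            continue  # monomorphic markers carry no information
        n_markers += 1
        if g != 1:
            observed_hom += 1
        expected_hom += 1.0 - 2.0 * p * (1.0 - p)

    if n_markers == 0:
        raise ValueError("no polymorphic markers to estimate F from")
    return (observed_hom - expected_hom) / (n_markers - expected_hom)


if __name__ == "__main__":
    # Toy data: four markers, all with allele frequency 0.5.
    f_hat = single_point_f([0, 2, 0, 2], [0.5, 0.5, 0.5, 0.5])
    print(f_hat, f_hat >= FIRST_COUSIN_F)
```

Because E depends entirely on the allele frequencies supplied, frequencies that do not match an individual's actual ancestry (for example, frequencies pooled across source populations for an admixed sample) distort the expected homozygosity and therefore the estimate. This is one way to see why estimators of this kind can be sensitive to admixture.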
