Identification of key contributors in complex population structures.

Markus Neuditschko, Elisabeth Jonas, Mirjam Frischknecht, Herman W Raadsma, Stefan Rieder, Mehar S Khatkar, Ruedi Von Niederhäusern, Heidi Signer-Hasler, Tosso Leeb, Eike J Steinig, Christine Flury

doi:10.1371/journal.pone.0177638

Abstract

Evaluating the genetic contribution of individuals to population structure is essential for selecting informative individuals for genome sequencing and genotype imputation, and for ascertaining complex population structures. Existing methods for selecting informative individuals for genomic imputation focus solely on the identification of key ancestors, which can lead to a loss of phasing accuracy in the reference population. Currently, many methods are applied independently to investigate complex population structures. Based on the Eigenvalue Decomposition (EVD) of a genomic relationship matrix, we describe a novel approach to evaluate the genetic contribution of individuals to population structure. We combined the identification of key contributors with model-based clustering and population network visualization into an integrated three-step approach, which allows identification of high-resolution population structures and substructures around such key contributors. The approach was applied and validated in four disparate datasets: a simulated population (5,100 individuals and 10,000 SNPs), a highly structured experimental sheep population (1,421 individuals and 44,693 SNPs), and two large complex pedigree populations, namely horse (1,077 individuals and 38,124 SNPs) and cattle (2,457 individuals and 45,765 SNPs). In the simulated and experimental sheep datasets, our method, which is unsupervised, successfully identified all known key contributors. Applying our three-step approach to the horse and cattle populations, we observed high-resolution population substructures and an absence of obvious important key contributors. Furthermore, we show that, compared to commonly applied strategies for selecting informative individuals for genotype imputation, including the computation of marginal gene contributions (Pedig) and the optimization of genetic relatedness (Rel), the selection of key contributors provided the highest phasing accuracies within the selected reference populations. The presented approach opens new perspectives in the characterization and informed management of populations in general, and in areas such as conservation genetics and selective animal breeding in particular, where assessing the genetic contribution of influential and admixed individuals is crucial for research and management applications. As such, this method provides a valuable complement to commonly applied tools for visualizing complex population structures and selecting individuals for re-sequencing.
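
The core idea of the abstract, scoring individuals from the eigenvalue decomposition of a genomic relationship matrix, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the relationship matrix is built with a common allele-frequency-centred construction, the score formula (eigenvalue-weighted squared loadings) is an illustrative choice, the number of retained components `n_components` is picked by hand because the selection criterion is not given in this section, and the function names are hypothetical. The genotypes are random placeholder data.

```python
import numpy as np


def genomic_relationship_matrix(genotypes):
    """Build a relationship matrix from an (individuals x SNPs) array coded 0/1/2.

    Assumption: allele-frequency-centred construction; the paper's exact
    matrix definition is not stated in this section.
    """
    p = genotypes.mean(axis=0) / 2.0            # allele frequency per SNP
    keep = (p > 0) & (p < 1)                    # drop monomorphic SNPs
    z = genotypes[:, keep] - 2.0 * p[keep]      # centre each SNP column
    scale = 2.0 * np.sum(p[keep] * (1.0 - p[keep]))
    return z @ z.T / scale


def contribution_scores(g, n_components):
    """Illustrative per-individual score from the top eigenpairs of g.

    Assumption: score_j = sqrt(sum_k lambda_k * v_jk^2) over the retained
    components; this is a stand-in, not the published formula.
    """
    eigvals, eigvecs = np.linalg.eigh(g)        # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[top], 0.0, None)      # guard against tiny negative values
    return np.sqrt((eigvecs[:, top] ** 2) @ lam)


# Placeholder data: 30 individuals, 200 SNPs, drawn at random.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=(30, 200)).astype(float)

g = genomic_relationship_matrix(genotypes)
scores = contribution_scores(g, n_components=5)
ranking = np.argsort(scores)[::-1]              # highest-scoring individuals first
```

Individuals at the top of `ranking` load most heavily on the leading axes of variation in `g`. With real data, the number of retained components would need a principled criterion rather than a fixed value.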

Highlights

Recent innovations in high throughput sequencing [1] and array technologies [2] have led to the development of draft/reference genomes for an extensive range of domestic animal species and the identification of large numbers of single nucleotide polymorphisms (SNPs) [3,4,5].
For comparison we identified sets of individuals selected based on their pedigree-based marginal gene contributions using the program package PEDIG (PED) [12], expected genetic relationships to the reference population as presented by Goddard and Hayes (REL) [13] and animals selected at random (RAN).
The distribution of gc_j illustrates that the 20 founder males were clearly identified within the F1 generation and suggests that the remaining individuals did not make a significant genetic contribution to account for the genetic variation of the population relationship structure (Fig 2A, red stars).

Summary

Introduction

Recent innovations in high throughput sequencing [1] and array technologies [2] have led to the development of draft/reference genomes for an extensive range of domestic animal species and the identification of large numbers of single nucleotide polymorphisms (SNPs) [3,4,5]. Global efforts are focusing on re-sequencing additional animals within species and breed groups to improve knowledge of the genetic architecture and to allow identification of high-resolution variation between individuals [6,7,8,9]. A typical approach in such scenarios is to re-sequence informative individuals within populations, and to impute whole genome sequence level genotypes of additional animals genotyped with high density SNP panels [10,11].

Journal: PLOS ONE
Publication Date: May 16, 2017
Citations: 15
License type: CC BY 4.0
