An alternative covariance estimator to investigate genetic heterogeneity in populations.

Nicolas Heslot, Jean-Luc Jannink

doi:10.1186/s12711-015-0171-z

Abstract

Background: For genomic prediction and genome-wide association studies (GWAS) using mixed models, covariance between individuals is estimated using molecular markers. Based on the properties of mixed models, using available molecular data for prediction is optimal if this covariance is known. Under this assumption, adding individuals to the analysis should never be detrimental. However, some empirical studies showed that increasing training population size decreased prediction accuracy. Recently, results from theoretical models indicated that even if marker density is high and the genetic architecture of traits is controlled by many loci with small additive effects, the covariance between individuals, which depends on relationships at causal loci, is not always well estimated by the whole-genome kinship.

Results: We propose an alternative covariance estimator, named K-kernel, to account for potential genetic heterogeneity between populations that is characterized by a lack of genetic correlation, and to limit the information flow between a priori unknown populations in a trait-specific manner. This is similar to a multi-trait model, with parameters estimated by restricted maximum likelihood (REML); in extreme cases, it can allow for an independent genetic architecture between populations. As such, K-kernel is useful to study the design of training populations. K-kernel was compared to other covariance estimators, or kernels, on several datasets to examine its fit to the data, its cross-validated accuracy and its suitability for GWAS. It provides a significantly better fit to the data than the genomic best linear unbiased prediction (GBLUP) model and, in some cases, it performs better than other kernels such as the Gaussian kernel, as shown by an empirical null distribution. In GWAS simulations, alternative kernels control type I errors as well as or better than the classical whole-genome kinship and increase statistical power. Little or no gain was observed in cross-validated prediction accuracy.

Conclusions: This alternative covariance estimator can be used to gain insight into trait-specific genetic heterogeneity by identifying relevant sub-populations that lack genetic correlation between them. The genetic correlation between identified sub-populations can be 0, which amounts to automatic selection of the relevant sets of individuals to include in the training population. It may also increase statistical power in GWAS.

Electronic supplementary material: The online version of this article (doi:10.1186/s12711-015-0171-z) contains supplementary material, which is available to authorized users.
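The idea of limiting information flow between sub-populations can be illustrated with a small sketch. This is not the authors' K-kernel implementation, and the abstract does not give its formula; the function name `heterogeneity_covariance`, the group labels and the fixed correlation values are all hypothetical. The sketch simply multiplies each kinship entry by an assumed genetic correlation between the groups of the two individuals, in the spirit of a multi-trait model. In the paper the parameters are estimated by REML, whereas here they are supplied by hand.

```python
def heterogeneity_covariance(kinship, groups, group_corr):
    """Scale kinship entries by an assumed genetic correlation between groups.

    kinship    : n x n list of lists (for example a whole-genome kinship matrix)
    groups     : length-n list of group labels 0..g-1 (hypothetical labels,
                 for example from a clustering step)
    group_corr : g x g correlation matrix between groups, ones on the diagonal
    """
    n = len(kinship)
    if len(groups) != n:
        raise ValueError("one group label is required per individual")
    # Element-wise product of the kinship with the expanded group correlations.
    return [
        [kinship[i][j] * group_corr[groups[i]][groups[j]] for j in range(n)]
        for i in range(n)
    ]


# Toy kinship for four individuals (made-up numbers, for illustration only).
K = [
    [1.0, 0.5, 0.2, 0.1],
    [0.5, 1.0, 0.1, 0.2],
    [0.2, 0.1, 1.0, 0.5],
    [0.1, 0.2, 0.5, 1.0],
]
groups = [0, 0, 1, 1]

# Correlation 0: the two groups share no information (independent architectures).
independent = heterogeneity_covariance(K, groups, [[1.0, 0.0], [0.0, 1.0]])
# Correlation 0.5: between-group covariances are halved.
partial = heterogeneity_covariance(K, groups, [[1.0, 0.5], [0.5, 1.0]])
# Correlation 1: the ordinary kinship is recovered.
pooled = heterogeneity_covariance(K, groups, [[1.0, 1.0], [1.0, 1.0]])
```

With a between-group correlation of 0 the covariance becomes block-diagonal, so individuals from one group contribute nothing to predictions in the other; with a correlation of 1 the sketch reduces to the usual whole-genome kinship. Because the result is an element-wise product of two positive semi-definite matrices, it remains a valid covariance matrix whenever the inputs are.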

Highlights

For genomic prediction and genome-wide association studies (GWAS) using mixed models, covariance between individuals is estimated using molecular markers.
Many genomic prediction studies showed that the prediction accuracy of the GBLUP model decreases as more individuals are added to the training population. This problem has received considerable attention in the context of prediction between breeds and, so far, empirical results obtained with the GBLUP model have been disappointing.
Hayes et al. [3] showed that the expected accuracies derived from the mixed model matched the observed within-breed accuracies but not the observed between-breed accuracies, and that predictive ability from one breed to another was poor.

Summary

Introduction

For genomic prediction and genome-wide association studies (GWAS) using mixed models, covariance between individuals is estimated using molecular markers. Based on the properties of mixed models, using available molecular data for prediction is optimal if this covariance is known. Under this assumption, adding individuals to the analysis should never be detrimental. Dawson et al. [8] used historical data from international nurseries collected between 1992 and 2009, and reported inconsistent accuracies when data from previous years were used to predict later years. These prediction accuracies were not explained by variation in the quality of the phenotype data of the training or validation sets. Rutkoski et al. [9] showed that accuracies were lower with a training population of 365 individuals than with optimized subsets of that population that were less than half its size.
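As a concrete reference point for "covariance estimated using molecular markers", the following minimal sketch computes a whole-genome kinship matrix from marker genotypes. The text does not say which kinship estimator the authors used; the scaling below is the widely used VanRaden-style genomic relationship matrix, chosen here as an assumption, and the function name and toy genotypes are hypothetical.

```python
def genomic_relationship(markers):
    """Whole-genome kinship from a 0/1/2 marker matrix (rows = individuals).

    Assumed VanRaden-style scaling, for illustration only:
    G = Z Z' / (2 * sum_j p_j * (1 - p_j)),
    where Z holds the genotypes centered by twice the allele frequency p_j.
    """
    n = len(markers)
    m = len(markers[0])
    # Frequency of the counted allele at each marker.
    freqs = [sum(row[j] for row in markers) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic")
    # Center each genotype by its expected value 2 * p_j.
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in markers]
    # Cross-products between individuals, scaled to the relationship scale.
    return [
        [sum(a * b for a, b in zip(zi, zk)) / scale for zk in centered]
        for zi in centered
    ]


# Toy genotypes: individuals 0 and 1 are similar, 2 and 3 differ from them.
toy_markers = [
    [0, 1, 2, 0],
    [0, 1, 2, 1],
    [2, 1, 0, 2],
    [2, 0, 0, 1],
]
G = genomic_relationship(toy_markers)
```

In this toy example, individuals 0 and 1 share most genotypes and receive a positive relationship, while individuals 0 and 2 carry opposite genotypes and receive a negative one. A matrix of this kind is the whole-genome kinship that the GBLUP model uses as the covariance between individuals.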



Journal: Genetics, selection, evolution : GSE	Publication Date: Nov 26, 2015
Citations: 47	License type: cc-by
