Abstract

Modeling covariance structure based on genetic similarity between pairs of relatives plays an important role in evolutionary, quantitative and statistical genetics. Historically, genetic similarity between individuals has been quantified from pedigrees via the probability that randomly chosen homologous alleles between individuals are identical by descent (IBD). At present, however, many genetic analyses rely on molecular markers, with realized measures of genomic similarity replacing IBD-based expected similarities. Animal and plant breeders, for example, now employ marker-based genomic relationship matrices between individuals in prediction models and in estimation of genome-based heritability coefficients. Phenotypes convey information about genetic similarity as well. For instance, if phenotypic values are at least partially the result of the action of quantitative trait loci, one would expect the former to inform about the latter, as in genome-wide association studies. Statistically, a non-trivial conditional distribution of unknown genetic similarities, given phenotypes, is to be expected. A Bayesian formalism is presented here that applies to whole-genome regression methods where some genetic similarity matrix, e.g., a genomic relationship matrix, can be defined. Our Bayesian approach, based on phenotypes and markers, converts prior (markers only) expected similarity into trait-specific posterior similarity. A simulation illustrates situations under which effective Bayesian learning from phenotypes occurs. Pinus and wheat data sets were used to demonstrate applicability of the concept in practice. The methodology applies to a wide class of Bayesian linear regression models, extends to the multiple-trait domain, and can also be used to develop phenotype-guided similarity kernels in prediction problems.

Highlights

  • Assessing genetic or genomic similarity among species or individuals is a central topic in evolutionary and quantitative genetics (Lynch and Walsh, 1998; Walsh and Lynch, 2018). Sethuraman (2018) reviewed areas where estimation of genetic relatedness is important, including paternity and maternity assignments, forensic, association and linkage studies, and inference and prediction in quantitative genetics. Until recently, many quantitative-trait analyses, such as estimation of genetic variances and covariances and prediction of unobservable genotypic values, have relied on modeling covariances based on pedigree-based genetic relatedness between relatives; such covariances enter into the dispersion structure of mixed effects and Bayesian models.

  • Frobenius distances based on posterior draws were closer to similarity at the quantitative trait loci (QTL) level than those based on samples from the prior; further, the posterior distribution of distances was sharper than the prior distribution.

  • G(β) is expected to converge to G_Q, the genetic similarity at the QTL level, since QTL genotypes are included in the marker panel.
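The Frobenius-distance comparison in the highlights can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weighted form X diag(w) X' / sum(w), the simulated genotypes, the QTL positions and all function names are assumptions made here for illustration only. It shows that an unweighted marker-based similarity generally differs from the QTL-level similarity, while weights concentrated on the QTL columns recover it.

```python
import numpy as np

def center_markers(M):
    """Center each marker column of an n x p genotype matrix (coded 0/1/2)."""
    return M - M.mean(axis=0)

def weighted_similarity(X, w):
    """Return X diag(w) X' / sum(w) for non-negative marker weights w.

    Assumed form for illustration; the source does not define G(beta) here.
    """
    w = np.asarray(w, dtype=float)
    return (X * w) @ X.T / w.sum()

def frobenius_distance(A, B):
    """Frobenius norm of the difference between two similarity matrices."""
    return np.linalg.norm(A - B, ord="fro")

# Simulated genotypes (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
n, p = 20, 50
M = rng.integers(0, 3, size=(n, p)).astype(float)
X = center_markers(M)
qtl = np.array([3, 17, 42])  # hypothetical QTL positions within the panel

# Markers-only similarity: every marker weighted equally.
G_prior = weighted_similarity(X, np.ones(p))

# Similarity computed from the QTL columns alone.
G_Q = weighted_similarity(X[:, qtl], np.ones(qtl.size))

# Weights concentrated entirely on the QTL columns.
w_qtl = np.zeros(p)
w_qtl[qtl] = 1.0
G_w = weighted_similarity(X, w_qtl)

d_prior = frobenius_distance(G_prior, G_Q)
d_weighted = frobenius_distance(G_w, G_Q)
```

In a Bayesian analysis the weights would instead vary across posterior draws of the marker effects, giving a distribution of distances rather than the single values computed here.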


Summary

Introduction

Assessing genetic or genomic similarity among species or individuals is a central topic in evolutionary and quantitative genetics (Lynch and Walsh, 1998; Walsh and Lynch, 2018). Sethuraman (2018) reviewed areas where estimation of genetic relatedness is important, including paternity and maternity assignments, forensic, association and linkage studies, and inference and prediction in quantitative genetics. Many quantitative-trait analyses, such as estimation of genetic variances and covariances and prediction of unobservable genotypic values (e.g., breeding values in animal and plant breeding), have relied on modeling covariances based on pedigree-based genetic relatedness between relatives. Such covariances enter into the dispersion structure of mixed effects and Bayesian models. One representation assumed Hardy–Weinberg and linkage equilibrium among markers; the latter assumption is manifestly violated in plant and animal breeding data. In another approach (Wang et al., 2012), the weights of Zhang et al. (2010) were applied iteratively. It is shown how the concept can be adapted to prediction in a training–testing setting, and multiple-trait extensions are suggested.

Linear regression connecting markers to phenotypes
Definition of genomic similarity matrices
Implementation-specific similarity matrices
Bayes A
Learning similarity from multiple-trait models
Setting
Results
Pinus taeda data
Wheat data
Discussion
