Abstract

Background: Genomic models that link phenotypes to dense genotype information are increasingly used to infer variance parameters in genetic studies. The variance parameters of these models can be inferred using restricted maximum likelihood, which produces consistent, asymptotically normal estimates of variance components under the true model. These properties are not guaranteed to hold when the covariance structure of the data specified by the genomic model differs substantially from that specified by the true model; in this case, the likelihood of the model is said to be misspecified. If the covariance structure specified by the genomic model describes that of the true model poorly, the likelihood misspecification may lead to incorrect inferences.

Results: This work provides a theoretical analysis of genomic models based on splitting the misspecified likelihood equations into components and isolating those that contribute to incorrect inferences. This yields an informative measure, defined as κ, for comparing the covariance structures of the data specified by the genomic and the true models. Comparing the covariance structures allows us to determine whether bias in the variance component estimates is expected to occur.

Conclusions: The theory presented can be used to explain the success of a number of recently reported approaches suggested to remove sources of bias from heritability estimates. Furthermore, however complex the quantification of this bias may be, we can determine that, in genomic models that use a single genomic component to estimate heritability (assuming SNP effects are all i.i.d.), the bias of the estimator, when it exists, tends to be downward.

Highlights

  • Genomic models that link phenotypes to dense genotype information are increasingly used to infer variance parameters in genetic studies

  • We define a genomic model as any linear mixed model (LMM) that links a phenotype to multiple genotypes without knowledge of those that are associated with the phenotype

  • Misspecification of the likelihood is due to the difference between the covariance structures of the data specified by the misspecified and true models (G and GQ). Our study shows that the bias of restricted maximum likelihood (REML) estimators of variance parameters is linked to the relationship between the eigenvalues and eigenvectors of both models, occurring when κi =


Summary

Introduction

Genomic models that link phenotypes to dense genotype information are increasingly used to infer variance parameters in genetic studies. The variance parameters of these models can be inferred using restricted maximum likelihood, which produces consistent, asymptotically normal estimates of variance components under the true model. The correct covariance structure (referred to in our work as GQ) requires knowledge of the quantitative trait loci (QTL). Since these are typically unknown, in practice the genomic model instead uses the available SNP genotypes to compute a covariance structure (referred to in our work as G), which misspecifies the likelihood. G may describe GQ poorly, and the resulting likelihood misspecification may lead to biased estimators of variance parameters.
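To make the distinction between G and GQ concrete, the following is a minimal sketch, not the authors' procedure. It assumes simulated, unlinked genotypes, a standardized cross-product form for both matrices (one common way to build a genomic relationship matrix), and a randomly chosen QTL subset. The function names and the Frobenius-norm summary are hypothetical illustrations; in particular, the summary is not the measure κ defined in the paper.

```python
import numpy as np


def standardize(genotypes):
    """Center and scale each marker column; drop markers with no variation."""
    centered = genotypes - genotypes.mean(axis=0)
    sd = genotypes.std(axis=0)
    keep = sd > 0  # monomorphic markers carry no information
    return centered[:, keep] / sd[keep]


def relationship_matrix(genotypes):
    """Assumed form: Z Z' / m, with Z the standardized genotype matrix."""
    z = standardize(genotypes)
    return z @ z.T / z.shape[1]


def simulate_structures(n=200, n_snp=1000, n_qtl=50, seed=0):
    """Simulate allele counts and build G (all SNPs) and GQ (QTL only)."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.05, 0.5, size=n_snp)
    snps = rng.binomial(2, freqs, size=(n, n_snp)).astype(float)
    # Hypothetical assumption: the QTL are a random subset of the SNPs.
    qtl_idx = rng.choice(n_snp, size=n_qtl, replace=False)
    return relationship_matrix(snps), relationship_matrix(snps[:, qtl_idx])


def relative_frobenius_gap(a, b):
    """Illustrative discrepancy between two covariance structures (not kappa)."""
    return np.linalg.norm(a - b) / np.linalg.norm(b)


if __name__ == "__main__":
    G, GQ = simulate_structures()
    # Eigenvalues in ascending order; the eigenvectors are the columns.
    eigvals_G, eigvecs_G = np.linalg.eigh(G)
    eigvals_GQ, eigvecs_GQ = np.linalg.eigh(GQ)
    print("largest eigenvalue of G: ", eigvals_G[-1])
    print("largest eigenvalue of GQ:", eigvals_GQ[-1])
    print("relative Frobenius gap:  ", relative_frobenius_gap(G, GQ))
```

Because GQ is built from far fewer loci than G, its rank is at most the number of QTL, so the two matrices generally have different eigenvalues and eigenvectors even though both are computed from the same individuals.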

