Abstract

Background: A random multiple-regression model that simultaneously fits all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction, using whole-genome data. Leave-one-out cross validation can be used to quantify the predictive ability of a statistical model.

Methods: Naive application of leave-one-out cross validation is computationally intensive because the training and validation analyses need to be repeated n times, once for each observation. Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis.

Results: The efficient leave-one-out cross validation strategies are 786 times faster than the naive application for a simulated dataset with 1,000 observations and 10,000 markers, and 99 times faster with 1,000 observations and 100 markers. These efficiency gains over the naive approach using the same model will grow as the number of observations increases.

Conclusions: Efficient leave-one-out cross validation strategies are presented here, requiring little more effort than a single analysis.

Highlights

  • A random multiple-regression model that simultaneously fits all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction, using whole-genome data

  • We refer to this model and to the equivalent model that fits breeding values as random effects as marker effect models (MEM) and breeding value models (BVM), respectively, the latter often known as Genomic Best Linear Unbiased Prediction (GBLUP)

  • We show below how Leave-one-out cross validation (LOOCV) can be performed using either the MEM or BVM with little more effort than is required for a single analysis with n observations


Introduction

A random multiple-regression model that simultaneously fits all allele substitution effects for additive markers or haplotypes as uncorrelated random effects was proposed for Best Linear Unbiased Prediction (BLUP) [1], using whole-genome data. An alternative, earlier-published approach to using marker or haplotype information fits breeding values as random effects, with covariances defined by a “genomic relationship matrix” computed from genotypes [2]. These two models have been shown to be equivalent in terms of predicting breeding values [3, 4]. Naive application of leave-one-out cross validation (LOOCV) is computationally intensive, requiring n analyses, one for each observation.
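
The equivalence of the two models and the cost of naive LOOCV can be illustrated with a small numerical sketch. The code below is only an illustration under simplifying assumptions that are not taken from the paper: there are no fixed effects or intercept, the ratio of residual to marker-effect variance (`lam`) is treated as known, and the genotypes are simulated. All function names are hypothetical. The shortcut uses the standard leave-one-out identity for ridge-type predictors, in which the prediction residual for observation i equals its ordinary residual divided by (1 - h_ii), where h_ii is the i-th diagonal element of the hat matrix. It may differ in detail from the strategies derived in the paper.

```python
import numpy as np


def hat_matrix_mem(Z, lam):
    """Hat matrix from the marker effect model: Z (Z'Z + lam*I)^-1 Z'."""
    p = Z.shape[1]
    return Z @ np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T)


def hat_matrix_bvm(Z, lam):
    """Hat matrix from the breeding value model: G (G + lam*I)^-1, G = ZZ'.

    Assumption: G is the unscaled cross-product ZZ', so that the same
    variance ratio lam applies to both models.
    """
    n = Z.shape[0]
    G = Z @ Z.T
    # solve() returns (G + lam*I)^-1 G; its transpose is G (G + lam*I)^-1.
    return np.linalg.solve(G + lam * np.eye(n), G).T


def loocv_naive(Z, y, lam):
    """Refit the marker effect model n times, leaving one record out each time."""
    n, p = Z.shape
    resid = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Z_train, y_train = Z[keep], y[keep]
        alpha = np.linalg.solve(
            Z_train.T @ Z_train + lam * np.eye(p), Z_train.T @ y_train
        )
        resid[i] = y[i] - Z[i] @ alpha
    return resid


def loocv_shortcut(H, y):
    """Leave-one-out residuals from a single fit: e_i / (1 - h_ii)."""
    return (y - H @ y) / (1.0 - np.diag(H))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, lam = 50, 200, 100.0  # illustrative sizes and variance ratio
    Z = rng.integers(0, 3, size=(n, p)).astype(float)  # simulated 0/1/2 genotypes
    y = Z @ rng.normal(scale=0.1, size=p) + rng.normal(size=n)

    H_mem = hat_matrix_mem(Z, lam)
    H_bvm = hat_matrix_bvm(Z, lam)
    print("max |H_mem - H_bvm|:", np.abs(H_mem - H_bvm).max())

    naive = loocv_naive(Z, y, lam)
    fast = loocv_shortcut(H_mem, y)
    print("max |naive - shortcut|:", np.abs(naive - fast).max())
```

In this sketch the naive routine solves n separate systems, whereas the shortcut reuses one hat matrix, so both differences printed above should be at the level of floating-point rounding. Which of the two hat-matrix forms is cheaper depends on whether the number of markers or the number of observations is smaller.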
