Abstract

In genomic selection (GS), all markers across the entire genome are used for marker-assisted selection, so that each quantitative trait locus of a complex trait is in linkage disequilibrium with at least one marker. Although GS improves estimated breeding values and genetic gain, most GS models estimate genetic variance from training samples that include many trait-irrelevant markers, which leads to severe overfitting in the calculation of trait heritability. In this study, we used a series of simulations to demonstrate that including trait-irrelevant markers inflates heritability estimates, and that this overfitting can be effectively controlled by cross validation. In the proposed method, the genetic variance is simply the variance of the genetic values predicted through cross validation, the residual variance is the variance of the differences between the observed phenotypic values and the predicted genetic values, and these two variance components are used to calculate an unbiased heritability. We also demonstrated that the heritability calculated through cross validation is equivalent to trait predictability, which objectively reflects the applicability of the GS models. The proposed method can be implemented with the MIXED procedure in SAS or with our R package “GSMX”, which is publicly available at https://cran.r-project.org/web/packages/GSMX/index.html.
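The calculation described in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GSMX implementation: ridge regression stands in for the GS model, the penalty `lam`, the fold count, and all function names are hypothetical, and heritability is taken as the usual ratio of genetic variance to the sum of genetic and residual variance.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, lam):
    """Fit ridge regression on centered training data and predict test values.

    Ridge regression is only a stand-in for a GS model (assumption).
    """
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Dual form: solve an n x n system, which is cheaper when markers
    # greatly outnumber individuals.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - y_mean)
    beta = Xc.T @ alpha
    return y_mean + (X_test - x_mean) @ beta


def variance_ratio(y, g_hat):
    """Heritability as var(g_hat) / (var(g_hat) + var(y - g_hat))."""
    var_g = np.var(g_hat, ddof=1)       # genetic variance
    var_e = np.var(y - g_hat, ddof=1)   # residual variance
    return var_g / (var_g + var_e)


def cv_heritability(X, y, lam=1.0, n_folds=10, seed=0):
    """Heritability from genetic values predicted through k-fold cross validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    g_hat = np.empty(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Each individual is predicted by a model that never saw its phenotype.
        g_hat[test_idx] = ridge_fit_predict(X[train_idx], y[train_idx], X[test_idx], lam)
    return variance_ratio(y, g_hat)


def in_sample_heritability(X, y, lam=1.0):
    """Same ratio, but with fitted values from the full training sample."""
    return variance_ratio(y, ridge_fit_predict(X, y, X, lam))


if __name__ == "__main__":
    # Hypothetical null scenario: 200 individuals, 2000 markers, none of them
    # related to the trait, so the true heritability is zero.
    rng = np.random.default_rng(1)
    X = rng.integers(0, 3, size=(200, 2000)).astype(float)
    y = rng.normal(size=200)
    print("in-sample:", in_sample_heritability(X, y))
    print("cross-validated:", cv_heritability(X, y))
```

In this null scenario the in-sample estimate tends to be close to 1 because the model nearly interpolates the phenotypes, while the cross-validated estimate stays small. That contrast is the overfitting the abstract refers to.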

Highlights

  • Plant breeding aims to produce desired characteristics by changing the traits of plants

  • To fit an analysis of variance (ANOVA) model, one can first reorganize the data by categorizing subjects in the sample into groups based on their genotypes (for example, recombinant inbred lines (RILs)), and then analyze the variances between these groups

  • Genomic selection (GS) is a form of marker-assisted selection (MAS) in which genetic markers covering the whole genome are used so that all quantitative trait loci (QTL) are in linkage disequilibrium with at least one marker [9,10,16,18,19]
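The ANOVA approach mentioned in the highlights can be illustrated with a short Python sketch. It assumes a balanced design (every line has the same number of replicates) and uses the textbook method-of-moments estimator, in which the genetic variance is (MS_between − MS_within) / r. The function name and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def anova_heritability(y, line_ids):
    """One-way ANOVA heritability with genotype (line) as the grouping factor.

    Assumes a balanced design: each line has the same number of replicates r.
    """
    y = np.asarray(y, dtype=float)
    line_ids = np.asarray(line_ids)
    groups = [y[line_ids == g] for g in np.unique(line_ids)]
    r = len(groups[0])
    if r < 2 or any(len(g) != r for g in groups):
        raise ValueError("expected a balanced design with at least 2 replicates")
    n_lines = len(groups)
    grand_mean = y.mean()
    line_means = np.array([g.mean() for g in groups])
    # Mean square between lines and mean square within lines.
    ms_between = r * np.sum((line_means - grand_mean) ** 2) / (n_lines - 1)
    ms_within = sum(np.sum((g - g.mean()) ** 2) for g in groups) / (n_lines * (r - 1))
    # Method-of-moments variance components; negative estimates are set to 0.
    var_g = max((ms_between - ms_within) / r, 0.0)
    var_e = ms_within
    return var_g / (var_g + var_e)


if __name__ == "__main__":
    # Toy data: two lines ("a", "b") with two replicates each.
    # MS_between = 16, MS_within = 2, so var_g = 7 and var_e = 2.
    print(anova_heritability([1.0, 3.0, 5.0, 7.0], ["a", "a", "b", "b"]))
```

For the toy data the two mean squares are 16 and 2, so the estimate is 7 / (7 + 2) = 7/9.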


Summary

Zhenyu Jia

In genomic selection (GS), all markers across the entire genome are used for marker-assisted selection, so that each quantitative trait locus of a complex trait is in linkage disequilibrium with at least one marker. In GS analyses, the whole-genome markers are used to fit the regression model and to estimate the covariance (kinship) between individuals in the training set; this information is then used to estimate model parameters, including the variance components from which trait heritability is calculated. A simple solution is proposed: estimate the genetic variance, and from it the trait heritability, using the variance of the genetic values predicted through cross validation. The aims of the study are (1) to prove that including a large number of trait-irrelevant loci causes overfitting in GS analyses, and that similar overfitting occurs in ANOVA approaches, (2) to demonstrate that the new method effectively controls this overfitting, and (3) to show that heritability is equivalent to predictability (the accuracy of prediction) once the overfitting is controlled. The proposed method is demonstrated with a series of Monte Carlo simulation experiments and a real data analysis in rice.

