A comparison of statistical methods for genomic selection in a mice population

Haroldo Hr Neves,Sandra A Queiroz,Roberto Carvalheiro

doi:10.1186/1471-2156-13-100

Abstract

Background: The availability of high-density panels of SNP markers has opened new perspectives for marker-assisted selection strategies, such that genotypes for these markers are used to predict the genetic merit of selection candidates. Because the number of markers is often much larger than the number of phenotypes, marker effect estimation is not a trivial task. The objective of this research was to compare the predictive performance of ten different statistical methods employed in genomic selection, by analyzing data from a heterogeneous stock mouse population.

Results: For the five traits analyzed (W6W: weight at six weeks, WGS: growth slope, BL: body length, %CD8+: percentage of CD8+ cells, CD4+/CD8+: ratio between CD4+ and CD8+ cells), within-family predictions were more accurate than across-family predictions, although this superiority in accuracy varied markedly across traits. For within-family prediction, two kernel methods, Reproducing Kernel Hilbert Spaces Regression (RKHS) and Support Vector Regression (SVR), were the most accurate for W6W, while a polygenic model had comparable performance. A form of ridge regression assuming that all markers contribute to the additive variance (RR_GBLUP) was among the most accurate for WGS and BL, while two variable selection methods (LASSO and Random Forest, RF) had the greatest predictive abilities for %CD8+ and CD4+/CD8+. RF, RKHS, SVR and RR_GBLUP outperformed the remaining methods in terms of bias and inflation of predictions.

Conclusions: Methods with large conceptual differences reached very similar predictive abilities, and a clear re-ranking of methods was observed depending on the trait analyzed. Variable selection methods were more accurate than the others for %CD8+ and CD4+/CD8+, and these traits are likely to be influenced by a smaller number of QTL than the remaining traits. Judged by their overall performance across traits and computational requirements, RR_GBLUP, RKHS and SVR are particularly appealing for application in genomic selection.

Highlights

The availability of high-density panels of single nucleotide polymorphism (SNP) markers has opened new perspectives for marker-assisted selection strategies, such that genotypes for these markers are used to predict the genetic merit of selection candidates.
We analyze a publicly available dataset that includes pedigree, genotypic and phenotypic information from a mouse population. Although this dataset has been analyzed previously [6,7,8], we focus on a broader comparison of statistical methods employed for genomic prediction, by studying five traits that probably differ considerably in genetic architecture.
The estimates for weight at 6 weeks (W6W), weight growth slope (WGS) and body length (BL) agreed with those obtained by [6]; the largest difference was observed for body length, whose heritability was 7% lower in the present study.

Summary

Introduction

The availability of high-density panels of single nucleotide polymorphisms (SNP) containing thousands of markers has opened new perspectives for the study of complex diseases and has enhanced marker-assisted selection strategies in animal and plant breeding. The possibility of accurately predicting the genetic merit of selection candidates based on their genotypes for SNP markers, a process known as genomic selection [1], is revolutionizing breeding schemes.

Because the number of predictor variables (markers) is generally much higher than the number of observations (phenotypes), there are not enough degrees of freedom to estimate all marker effects simultaneously. This problem is aggravated by multicollinearity, because markers in close positions are expected to be highly correlated.

It has been argued that shrinkage methods with assumptions close to the infinitesimal model (i.e. GBLUP and its variants) are robust with respect to the underlying genetic architecture of the traits, while methods based on some sort of variable selection are more sensitive to the genetic background of traits [3,4].
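The "more markers than phenotypes" problem described above can be illustrated with a small numerical sketch. The example below is not the authors' implementation: the genotypes and phenotypes are simulated, the names `simulate_genotypes` and `ridge_marker_effects` are hypothetical, and the shrinkage parameter `lam` is fixed arbitrarily (in ridge-regression BLUP it would typically be the ratio of residual variance to per-marker variance). It shows that the ordinary least-squares normal equations are rank deficient when markers outnumber animals, while a ridge penalty yields a unique solution that can be computed from an animals-by-animals system.

```python
import numpy as np


def simulate_genotypes(n_animals, n_markers, seed=0):
    """Simulate centered SNP genotypes coded 0/1/2 (illustrative data only)."""
    rng = np.random.default_rng(seed)
    allele_freqs = rng.uniform(0.1, 0.9, size=n_markers)
    genotypes = rng.binomial(2, allele_freqs, size=(n_animals, n_markers))
    genotypes = genotypes.astype(float)
    return genotypes - genotypes.mean(axis=0)  # center each marker column


def ridge_marker_effects(X, y, lam):
    """Ridge estimates of marker effects using the dual (n x n) system.

    Solves beta = X' (X X' + lam I)^-1 (y - mean(y)), which equals the
    usual (X'X + lam I)^-1 X'(y - mean(y)) but only needs an n x n solve.
    """
    n_animals = X.shape[0]
    kernel = X @ X.T + lam * np.eye(n_animals)
    return X.T @ np.linalg.solve(kernel, y - y.mean())


# Assumed toy dimensions: far more markers than phenotyped animals.
n_animals, n_markers = 50, 500
X = simulate_genotypes(n_animals, n_markers)

rng = np.random.default_rng(1)
true_effects = rng.normal(0.0, 0.1, size=n_markers)
y = X @ true_effects + rng.normal(0.0, 1.0, size=n_animals)

# X'X is n_markers x n_markers but its rank cannot exceed n_animals,
# so least squares has no unique solution for the marker effects.
rank_xtx = np.linalg.matrix_rank(X.T @ X)

lam = 100.0  # arbitrary shrinkage value, chosen only for illustration
beta_dual = ridge_marker_effects(X, y, lam)

# Same estimates from the primal (n_markers x n_markers) ridge system.
beta_primal = np.linalg.solve(
    X.T @ X + lam * np.eye(n_markers), X.T @ (y - y.mean())
)

print("rank of X'X:", rank_xtx, "out of", n_markers)
print("max |dual - primal|:", np.abs(beta_dual - beta_primal).max())
```

The dual form is used here because the system to solve has the size of the number of animals rather than the number of markers, which is the cheaper choice whenever markers greatly outnumber phenotypes.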

Journal: BMC Genetics
Publication Date: Jan 1, 2012
Citations: 101
License type: cc-by
