A Ranking Approach to Genomic Selection.

Mathieu Blondel, Hiroyoshi Iwata, Akio Onogi, Naonori Ueda, Alberto De La Fuente

doi:10.1371/journal.pone.0128570


Open Access

https://doi.org/10.1371/journal.pone.0128570


Journal: PLOS ONE
Publication Date: Jun 12, 2015
Citations: 78
License type: CC BY 4.0

Affiliation: NTT (Japan), The University of Tokyo

Abstract

Background: Genomic selection (GS) is a recent selective breeding method which uses predictive models based on whole-genome molecular markers. Until now, existing studies formulated GS as the problem of modeling an individual’s breeding value for a particular trait of interest, i.e., as a regression problem. To assess predictive accuracy of the model, the Pearson correlation between observed and predicted trait values was used.

Contributions: In this paper, we propose to formulate GS as the problem of ranking individuals according to their breeding value. Our proposed framework allows us to employ machine learning methods for ranking which had previously not been considered in the GS literature. To assess ranking accuracy of a model, we introduce a new measure originating from the information retrieval literature called normalized discounted cumulative gain (NDCG). NDCG rewards more strongly models which assign a high rank to individuals with high breeding value. Therefore, NDCG reflects a prerequisite objective in selective breeding: accurate selection of individuals with high breeding value.

Results: We conducted a comparison of 10 existing regression methods and 3 new ranking methods on 6 datasets, consisting of 4 plant species and 25 traits. Our experimental results suggest that tree-based ensemble methods including McRank, Random Forests and Gradient Boosting Regression Trees achieve excellent ranking accuracy. RKHS regression and RankSVM also achieve good accuracy when used with an RBF kernel. Traditional regression methods such as Bayesian lasso, wBSR and BayesC were found less suitable for ranking. Pearson correlation was found to correlate poorly with NDCG. Our study suggests two important messages. First, ranking methods are a promising research direction in GS. Second, NDCG can be a useful evaluation measure for GS.

Highlights

Traditional selective breeding, based on phenotypic or pedigree information, has led to much genetic improvement
To assess ranking accuracy of a model, we introduce a new measure originating from the information retrieval literature called normalized discounted cumulative gain (NDCG)
Our experimental results suggest that tree-based ensemble methods including McRank, Random Forests and Gradient Boosting Regression Trees achieve excellent ranking accuracy

Summary

Introduction

Traditional selective breeding, based on phenotypic or pedigree information, has led to much genetic improvement. Genomic selection (GS) is a recent selective breeding method which uses predictive models based on whole-genome molecular markers. Until now, existing studies formulated GS as the problem of modeling an individual’s breeding value for a particular trait of interest, i.e., as a regression problem. We propose to formulate GS as the problem of ranking individuals according to their breeding value. To assess ranking accuracy of a model, we introduce a new measure originating from the information retrieval literature called normalized discounted cumulative gain (NDCG). NDCG rewards more strongly models which assign a high rank to individuals with high breeding value. NDCG thus reflects a prerequisite objective in selective breeding: accurate selection of individuals with high breeding value.
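To make the idea of rewarding correct placement at the top of the ranking concrete, the following is a minimal sketch of NDCG@k. It assumes one common formulation: the gain is the observed trait value itself (taken to be non-negative) and the discount at position i is 1/log2(i + 1). The text above does not spell out the exact gain and discount used in the paper, so these choices, along with the names `dcg_at_k` and `ndcg_at_k` and the toy numbers, are illustrative assumptions rather than the authors' implementation.

```python
import math


def dcg_at_k(values_in_ranked_order, k):
    """Discounted cumulative gain over the top-k positions.

    Assumption: linear gain (the value itself) and a 1/log2(position + 1)
    discount, with positions starting at 1.
    """
    top_k = values_in_ranked_order[:k]
    return sum(v / math.log2(pos + 1) for pos, v in enumerate(top_k, start=1))


def ndcg_at_k(observed, predicted, k):
    """NDCG@k of the ranking induced by `predicted`, judged on `observed`.

    Individuals are sorted by predicted score (highest first); the DCG of
    their observed values is divided by the DCG of the ideal ordering.
    """
    order = sorted(range(len(observed)), key=lambda i: predicted[i], reverse=True)
    ranked_observed = [observed[i] for i in order]
    ideal_observed = sorted(observed, reverse=True)
    ideal_dcg = dcg_at_k(ideal_observed, k)
    if ideal_dcg == 0:
        return 0.0  # degenerate case: no positive gain available in the top k
    return dcg_at_k(ranked_observed, k) / ideal_dcg


# Toy data (hypothetical): observed trait values for four individuals.
observed = [3.0, 1.0, 2.0, 0.0]
good_scores = [0.9, 0.2, 0.5, 0.1]  # ranks the best individuals first
bad_scores = [0.1, 0.9, 0.5, 0.2]   # ranks the best individual last

print(ndcg_at_k(observed, good_scores, k=2))
print(ndcg_at_k(observed, bad_scores, k=2))
```

Under these assumptions, a model that orders the top individuals perfectly scores 1, while a model that pushes the best individual to the bottom is penalized heavily, even if its predictions for the remaining individuals are reasonable.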

