Rice diversity panel provides accurate genomic predictions for complex traits in the progenies of biparental crosses involving members of the panel

M Ben Hassen,J Bartholomé,L Jacquin,C Colombi,L Razafinimpiasa,N Ahmadi,C Bertone,G Valè,T V Cao,A Volante,C Biselli,J Rakotomalala,G Orasen,F Desiderio

doi:10.1007/s00122-017-3011-4

Abstract

Key messageRice breeding programs based on pedigree schemes can use a genomic model trained with data from their working collection to predict performances of progenies produced through rapid generation advancement.So far, most potential applications of genomic prediction in plant improvement have been explored using cross validation approaches. This is the first empirical study to evaluate the accuracy of genomic prediction of the performances of progenies in a typical rice breeding program. Using a cross validation approach, we first analyzed the effects of marker selection and statistical methods on the accuracy of prediction of three different heritability traits in a reference population (RP) of 284 inbred accessions. Next, we investigated the size and the degree of relatedness with the progeny population (PP) of sub-sets of the RP that maximize the accuracy of prediction of phenotype across generations, i.e., for 97 F5–F7 lines derived from biparental crosses between 31 accessions of the RP. The extent of linkage disequilibrium was high (r2 = 0.2 at 0.80 Mb in RP and at 1.1 Mb in PP). Consequently, average marker density above one per 22 kb did not improve the accuracy of predictions in the RP. The accuracy of progeny prediction varied greatly depending on the composition of the training set, the trait, LD and minor allele frequency. The highest accuracy achieved for each trait exceeded 0.50 and was only slightly below the accuracy achieved by cross validation in the RP. Our results thus show that relatively high accuracy (0.41–0.54) can be achieved using only a rather small share of the RP, most related to the PP, as the training set. The practical implications of these results for rice breeding programs are discussed.

Highlights

Genomic selection (GS) arose from the conjunction of new high-throughput marker technologies and new statistical methods (Meuwissen et al 2001)
The main objectives of this work were to assess the performance of genomic prediction among the progeny of biparental crosses, using a reference panel to train the model in rice, and to investigate the effect of the size of the reference panel and of the degree of relatedness with the progeny population on the accuracy of predictions, as well as the effect of linkage disequilibrium (LD) and the training model
To set a base line for prediction accuracy and to reduce the number of possible options to be tested regarding LD and other characteristics of the incidence matrix, we started our study by evaluating the accuracy of genomic prediction within the reference population using a cross validation approach

Summary

Introduction

Genomic selection (GS) arose from the conjunction of new high-throughput marker technologies and new statistical methods (Meuwissen et al 2001). GS allows analysis of the genetic architecture for complex traits in the framework of infinitesimal model effects. It consists in (1) using all markers (often large numbers) simultaneously to build a model. Theoretical and Applied Genetics (2018) 131:417–435 of genotype–phenotype relationships in a training population (TP), accounting for linkage disequilibrium (LD) among markers, and (2) using the model to predict the genomic estimate of breeding values (GEBV) of candidates in a breeding population (CP) (Meuwissen et al 2001; Jannink et al 2010). The effectiveness of GS depends, among other factors, on the degree of correlation between the predicted GEBV and the true genetic value, i.e., the accuracy of prediction. The accuracy of prediction is evaluated by the correlation between GEBV and the realized phenotype

Objectives

Methods

Results

Conclusion