Abstract

Key message

Historical data from breeding programs can be efficiently used to improve genomic selection accuracy, especially when the training set is optimized to subset individuals most informative of the target testing set.

The current strategy for large-scale implementation of genomic selection (GS) at the International Maize and Wheat Improvement Center (CIMMYT) global maize breeding program has been to train models using information from full-sibs in a “test-half-predict-half” approach. Although effective, this approach has limitations: it requires large full-sib populations and limits the ability to shorten variety testing and breeding cycle times. The primary objective of this study was to identify optimal experimental and training set designs to maximize the prediction accuracy of GS in CIMMYT’s maize breeding programs. Training set (TS) design strategies were evaluated to determine the most efficient use of phenotypic data collected on relatives for genomic prediction (GP), using datasets containing 849 (DS1) and 1389 (DS2) doubled haploid (DH) lines evaluated as testcrosses in 2017 and 2018, respectively.

Our results show there is merit in using multiple bi-parental populations as the TS when they are selected with algorithms that maximize relatedness between the training and prediction sets. In a breeding program where relevant past breeding information is not readily available, the phenotyping expenditure can be spread across connected bi-parental populations by phenotyping only a small number of lines from each population. This significantly improves prediction accuracy compared to within-population prediction, especially when the TS for within full-sib prediction is small. Finally, we demonstrate that prediction accuracy in either sparse testing or “test-half-predict-half” can be further improved by optimizing which lines are planted for phenotyping and which lines are only genotyped for advancement based on GP.

Highlights

  • Prediction accuracy for a target population was obtained from models calibrated with subsets of n individuals selected from DS2 using CDmean and Avg_GRM. It was compared with the average accuracy over 50 random selections of the same subset sizes, and with the accuracy obtained using half-sib related populations or all of DS2 as the training population (TP).

  • Prediction accuracies were calculated as the Pearson correlation between the predicted genomic estimated breeding values (GEBVs) and the best linear unbiased estimates (BLUEs) of the doubled haploid (DH) lines in DS2 obtained from Eq. 1.
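
The two ideas in the highlights above, choosing training individuals by their average genomic relationship to the target set and scoring predictions by a Pearson correlation, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the genomic relationship matrix uses a VanRaden-style formula as an assumption, and the ranking rule is only one plausible reading of Avg_GRM. CDmean is not shown.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix from an (n_lines x n_markers) matrix coded 0/1/2.

    Assumption: VanRaden-style scaling; the source does not state which
    GRM formula was used. Requires at least one polymorphic marker.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                 # allele frequency per marker
    Z = M - 2.0 * p                          # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def select_by_avg_relationship(G, candidates, targets, n):
    """Return the n candidates with the highest mean relationship to the targets.

    Hypothetical helper illustrating an Avg_GRM-like ranking.
    """
    avg = np.asarray(G)[np.ix_(candidates, targets)].mean(axis=1)
    order = np.argsort(-avg, kind="stable")  # descending by average relationship
    return [candidates[i] for i in order[:n]]

def prediction_accuracy(gebv, blue):
    """Pearson correlation between predicted GEBVs and observed BLUEs."""
    return float(np.corrcoef(gebv, blue)[0, 1])
```

In use, `candidates` would index the lines available for training (for example, lines in DS2 outside the target population) and `targets` the lines to be predicted; the selected indices would then calibrate a GP model whose predictions are scored with `prediction_accuracy`.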

Introduction

The Food and Agriculture Organization (FAO) estimates that by 2050 the world’s population will surpass 9 billion people (Nations and United Nations 2019). Much of this population growth will occur in regions of the world where food insecurity is prevalent, with large increases in food demand projected in Sub-Saharan Africa (SSA) and South Asia (SA). Advances in the use of genomic information in crop breeding programs have the potential to significantly increase genetic gains. Multiple studies have shown the potential of genomic selection (GS) to increase the rates of genetic gain in breeding programs by reducing the cost and time associated with extensive phenotyping of new offspring to identify the best performers for use as parents in the next generation (de los Campos et al. 2010; Crossa et al. 2010, 2011, 2017; Lin et al. 2014; Hickey et al. 2017; Beyene et al. 2019). GS can improve breeding program efficiency if it is properly designed and implemented to fully harness and maximize its advantages.
