Abstract

Accurate phenotype prediction of quantitative traits is paramount to enhanced plant research and breeding. Here, we report the accurate prediction of cotton fiber length, a typical quantitative trait, using 474 cotton (Gossypium spp.) fiber length (GFL) genes and nine prediction models. When the SNPs/InDels contained in 226 of the GFL genes or the expressions of all 474 GFL genes were used for fiber length prediction, a prediction accuracy of r = 0.83 was obtained, approaching the maximum possible prediction accuracy of a quantitative trait. This is a 116% improvement over the fiber length prediction accuracies achieved thus far by genomic selection using genome-wide random DNA markers. Moreover, analysis of the GFL genes identified 125 genes that are key to accurate prediction of fiber length; these alone gave a prediction accuracy similar to that of all 474 GFL genes. The fiber lengths predicted from the expressions of the 125 key GFL genes were significantly correlated with those predicted from the SNPs/InDels of the 226 SNP/InDel-containing GFL genes (r = 0.892, P = 0.000). The prediction accuracies of fiber length using both genic datasets were highly consistent across environments and generations. Finally, we found that a training population of 100–120 plants was sufficient to train a model for accurate prediction of a quantitative trait using the genes controlling the trait. Therefore, the genes controlling a quantitative trait are capable of accurately predicting its phenotype, thereby dramatically improving the ability, accuracy, and efficiency of phenotype prediction and promoting gene-based breeding in cotton and other species.
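To make the accuracy measure concrete, the following is a minimal sketch of how a prediction accuracy r can be computed as the Pearson correlation between predicted and observed phenotypes in held-out plants, and how it can be tracked against training population size. Everything in it is an assumption for illustration: the data are simulated, ridge regression stands in for a prediction model (the abstract does not name the nine models used), and the function names `simulate`, `ridge_fit`, `holdout_accuracy`, and `pearson_r` are hypothetical.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean()
    b_c = b - b.mean()
    return float((a_c @ b_c) / np.sqrt((a_c @ a_c) * (b_c @ b_c)))


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression; returns (intercept, coefficients)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    n_features = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_features), Xc.T @ yc)
    return y_mean - x_mean @ beta, beta


def holdout_accuracy(X, y, n_train, lam=1.0, seed=0):
    """Train on n_train random plants, predict the rest, and return r."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    train, test = idx[:n_train], idx[n_train:]
    intercept, beta = ridge_fit(X[train], y[train], lam)
    y_hat = intercept + X[test] @ beta
    return pearson_r(y_hat, y[test])


def simulate(n_plants=300, n_genes=125, noise_sd=1.0, seed=1):
    """Simulated genic features and phenotype (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_plants, n_genes))
    effects = rng.normal(scale=0.3, size=n_genes)
    y = X @ effects + rng.normal(scale=noise_sd, size=n_plants)
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    # Accuracy as a function of training population size.
    for n_train in (40, 80, 120, 160):
        r = holdout_accuracy(X, y, n_train, lam=10.0)
        print(f"n_train={n_train:4d}  r={r:.3f}")
```

The numbers this prints depend entirely on the simulated effect sizes and noise level, so they say nothing about the cotton results; the sketch only shows the shape of the calculation behind a statement such as "r = 0.83" or "100–120 plants was sufficient".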

Highlights

  • Many traits of agricultural and medical importance, such as crop yield, livestock productivity and human diseases, are known as quantitative traits that are each controlled by numerous genes

  • The prediction accuracy of cotton fiber length, the target trait of this study, has approached its plateau, reaching r = 0.83 (P = 0.000) using either the single nucleotide polymorphisms (SNPs)/InDels of 226 of the 474 GFL genes or the expressions of all 474 GFL genes

  • This prediction accuracy is comparable to that obtained for maize grain yield (r = 0.85, P = 0.000), one of the most complex quantitative traits, using the maize grain yield (ZmINGY) genes (Zhang et al, 2020a)

Introduction

Many traits of agricultural and medical importance, such as crop yield, livestock productivity and human diseases, are known as quantitative traits that are each controlled by numerous genes. […] data, thereby enhancing the ability, accuracy, and efficiency of breeding in crop plants (Crossa et al, 2010, 2013; De Los Campos et al, 2010b; Heffner et al, 2011a,b; González-Camacho et al, 2012; Gouy et al, 2013; Desta and Ortiz, 2014; Xu et al, 2014, 2016; Beyene et al, 2015; Dan et al, 2016) and livestock (Meuwissen et al, 2001; Daetwyler et al, 2012; Morota et al, 2014), and medicine in humans (Khan et al, 2001; Lee et al, 2008; De Los Campos et al, 2010a; Speed and Balding, 2014; Weissbrod et al, 2016). The predicted phenotypes of the trait for the individuals of the targeted population are used to make decisions for progeny selection in crop plant and livestock breeding, and for medical practice in humans (De Los Campos et al, 2010a).
