Abstract

Key message

The major soy protein QTL, cqProt-003, was analysed for haplotype diversity and global distribution, and results indicate that a 304 bp deletion and variable tandem repeats in protein-coding regions are likely causal candidates.

Here, we present association and linkage analysis of 985 wild, landrace and cultivar soybean accessions in a pangenomic dataset to characterize the major high-protein/low-oil associated locus cqProt-003 located on chromosome 20. A significant trait-associated region within a 173 kb linkage block was identified, and variants in the region were characterized, identifying 34 high-confidence SNPs, 4 insertions, 1 deletion and a larger 304 bp structural variant in the high-protein haplotype. Trinucleotide tandem repeats of variable length present in the second exon of gene Glyma.20G085100 are strongly correlated with the high-protein phenotype and likely represent causal variation. Structural variation has previously been found in the same gene; we report the global distribution of the 304 bp deletion and identify additional nested variation present in high-protein individuals. Mapping variation at the cqProt-003 locus across demographic groups suggests that the high-protein haplotype is common in wild accessions (94.7%), rare in landraces (10.6%) and near absent in cultivated breeding pools (4.1%), indicating that its decrease in frequency primarily correlates with domestication and continued during subsequent improvement. However, the variation that has persisted in under-utilized wild and landrace populations holds high potential for breeders willing to forgo seed oil to maximize protein content. The results of this study include the identification of distinct haplotype structures within the high-protein population, and a broad characterization of the genomic context and linkage patterns of cqProt-003 across global populations, supporting future functional characterization and modification.

Highlights

  • Shifting climatic and ecological conditions threaten global food security at a time when the growing human population requires crop yields to increase by an estimated 50% to 110% by 2050 (Alexandratos and Bruinsma 2012; Ray et al 2013; Tilman et al 2011; van Dijk et al 2021)

  • A genome-wide association study (GWAS) for protein content was conducted on chromosome 20 using 985 accessions, including 131 wild lines, 708 landraces, 44 old cultivars and 102 modern cultivars, for which phenotypic data was available from the UWA SoyPan dataset

  • Given the 9.3% missing variant information from lines that did not align at position 31,649,589 (Figure S5, Table S2), the SNP at 31,632,556 was taken as the most confident GWAS result using this method
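The accession counts above and the per-group haplotype frequencies quoted in the abstract can be illustrated with a minimal Python sketch. It only checks that the four reported group sizes add up to 985 and shows one simple way to tabulate a haplotype's frequency per demographic group. The function name `haplotype_frequency`, the haplotype labels and the toy calls are assumptions made for illustration; they are not the authors' pipeline or data.

```python
from collections import Counter

# Group sizes as reported in the text above.
GROUP_SIZES = {
    "wild": 131,
    "landrace": 708,
    "old cultivar": 44,
    "modern cultivar": 102,
}


def haplotype_frequency(calls, haplotype="high-protein"):
    """Return {group: fraction of accessions carrying `haplotype`}.

    `calls` is an iterable of (group, haplotype_label) pairs, one per accession.
    This is a hypothetical helper, not a function from the study.
    """
    totals = Counter()
    carriers = Counter()
    for group, label in calls:
        totals[group] += 1
        if label == haplotype:
            carriers[group] += 1
    return {group: carriers[group] / totals[group] for group in totals}


# Toy calls invented purely for illustration (NOT data from the study).
toy_calls = [
    ("wild", "high-protein"),
    ("wild", "high-protein"),
    ("wild", "low-protein"),
    ("landrace", "low-protein"),
    ("landrace", "high-protein"),
    ("landrace", "low-protein"),
    ("landrace", "low-protein"),
]

total_accessions = sum(GROUP_SIZES.values())
toy_frequencies = haplotype_frequency(toy_calls)

print(f"Total accessions: {total_accessions}")
for group, freq in toy_frequencies.items():
    print(f"{group}: {freq:.1%} carry the high-protein haplotype (toy data)")
```

With real genotype calls in place of `toy_calls`, the same tabulation would give the kind of per-group percentages reported in the abstract.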

Introduction

Communicated by Volker Hahn.

Shifting climatic and ecological conditions threaten global food security at a time when the growing human population requires crop yields to increase by an estimated 50% to 110% by 2050 (Alexandratos and Bruinsma 2012; Ray et al 2013; Tilman et al 2011; van Dijk et al 2021). Domestication and improvement of major crops have led to genetic bottlenecks and reduced diversity due to strong selection for agronomic traits, especially in self-pollinating plant species such as soybean (Glycine max (L.) Merr.) (Hyten et al 2006). Whilst intensive breeding efforts have increased crop productivity, they have left the regions of the soybean genome under selection with low genetic diversity in some modern breeding populations (Zhao et al 2015). The lack of variation in these regions is concerning: they have proven to play important roles in plant function or morphology, yet limited allelic variation remains in modern lines for trait expansion and adaptation. When dissecting the genomic regions underlying agronomic traits, it is important to look beyond traditional breeding populations and capture the full range of potential diversity. In soybean, ancestral diversity persists in the wild progenitor Glycine soja (Siebold & Zucc.) and exotic
