Soybean (Glycine max) is a leguminous plant widely used in agriculture and food production, where its seed composition, especially oil and protein content, is highly valued. Improving these traits is a primary focus of soybean breeding programs. In this study, we conducted a genome-wide association study (GWAS) to identify genetic loci linked to seed oil and protein content, using imputed genotype data for 180 Eurasian soybean varieties and the novel “genotypic twins” approach. The dataset comprised 87 Russian and European cultivars and 93 breeding lines from Western Siberia. We identified 11 novel loci significantly associated with seed oil and protein content (p-value < 1.5 × 10⁻⁶): one locus on chromosome 11 linked to protein content and 10 loci associated with oil content (chromosomes 1, 5, 11, 16, 17, and 18). The protein-associated locus is located near a gene encoding a CBL-interacting protein kinase, which is involved in key biological processes, including responses to drought and osmotic stress. The oil-associated loci were linked to genes with diverse functions, including lipid transport, nutrient reservoir activity, and stress responses; examples include genes encoding Sec14p-like phosphatidylinositol transfer proteins and Germin-like proteins. These findings suggest that the identified loci not only influence oil and protein content but may also contribute to plant resilience under environmental stress. The results provide genetic markers that can be used in breeding programs to optimize oil and protein content, particularly in varieties adapted to Russian climates, and can contribute to the development of high-yielding, nutritionally enhanced soybean cultivars.