Abstract

Rapeseed (Brassica napus) is the second most important oilseed crop in the world, but the genetic diversity underlying its massive phenotypic variation remains largely unexplored. Here, we report the sequencing, de novo assembly and annotation of eight B. napus accessions. Using pan-genome comparative analysis, millions of small variations and 77.2–149.6 megabases of presence and absence variations (PAVs) were identified. More than 9.4% of the genes contained large-effect mutations or structural variations. PAV-based genome-wide association study (PAV-GWAS) directly identified causal structural variations for silique length, seed weight and flowering time in a nested association mapping population with ZS11 (reference line) as the donor. These variations were not detected by single-nucleotide polymorphism-based GWAS (SNP-GWAS), demonstrating that PAV-GWAS is complementary to SNP-GWAS in identifying trait associations. Further analysis showed that PAVs in three FLOWERING LOCUS C genes were closely related to flowering time and ecotype differentiation. This study provides resources to support a better understanding of the genome architecture and to accelerate the genetic improvement of B. napus.
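The idea behind a PAV-based association scan can be illustrated with a deliberately simplified sketch. It treats each PAV as a 0/1 marker and compares trait values between lines that carry the segment and lines that lack it, using Welch's t statistic. This is not the authors' pipeline: the abstract does not describe the statistical model used, and a real GWAS would typically also correct for population structure and relatedness. The marker names, the toy trait values and the function names below are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(present, absent):
    """Welch's t statistic for trait values of lines with vs without a segment.

    Returns None when both groups have zero variance (statistic undefined).
    """
    se = sqrt(variance(present) / len(present) + variance(absent) / len(absent))
    if se == 0:
        return None
    return (mean(present) - mean(absent)) / se


def scan_pavs(pav_matrix, phenotype):
    """Return {pav_id: t statistic} for every testable PAV marker.

    pav_matrix maps a PAV id to a list of 0/1 calls (1 = segment present),
    one call per line; phenotype lists trait values in the same line order.
    """
    scores = {}
    for pav_id, calls in pav_matrix.items():
        present = [y for c, y in zip(calls, phenotype) if c == 1]
        absent = [y for c, y in zip(calls, phenotype) if c == 0]
        # Skip markers where either group is too small to estimate a variance.
        if len(present) < 2 or len(absent) < 2:
            continue
        t = welch_t(present, absent)
        if t is not None:
            scores[pav_id] = t
    return scores


# Hypothetical toy data: six lines with a made-up silique length trait (cm).
phenotype = [9.0, 9.4, 8.8, 6.1, 5.9, 6.3]
pav_matrix = {
    "pav_linked": [1, 1, 1, 0, 0, 0],    # presence tracks the trait
    "pav_unlinked": [1, 0, 1, 0, 1, 0],  # presence is unrelated to the trait
}
scores = scan_pavs(pav_matrix, phenotype)
top = max(scores, key=lambda k: abs(scores[k]))
print(top, scores)
```

In this toy example the marker whose presence tracks the trait receives the largest absolute statistic. The same scan over SNP genotypes would miss such a signal if no SNP tags the segment well, which is the sense in which the two approaches are described as complementary.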

Highlights

  • Our k-mer analysis suggested a genome size of 1,200–1,280 Mb for each genome (Supplementary Table 5), which is close to the estimated genome size of B. napus (~1,132 Mb) according to flow cytometry analysis[17]

  • Haplotypes of the three transposable elements (TEs) were more consistent with ecotype information and flowering time than the single-nucleotide polymorphism (SNP) haplotypes (Supplementary Fig. 36), a result supported by principal component analysis (PCA) (Supplementary Fig. 34). These TE insertions in BnaA10.FLOWERING LOCUS C (FLC) could therefore be used to roughly assign B. napus lines of unknown ecotype to specific ecotypes, which would be very useful for rapeseed breeding.
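The k-mer-based genome size estimate mentioned in the highlights is commonly obtained by dividing the total number of k-mer occurrences by the depth of the main coverage peak. The sketch below shows that arithmetic on a toy histogram. It is an illustration under stated assumptions, not the authors' procedure: the tool and parameters used in the paper are not given here, the function name and the `min_depth` error cutoff are hypothetical, and picking the single tallest peak is a simplification that may not suit an allopolyploid genome such as B. napus.

```python
def estimate_genome_size(histogram, min_depth=5):
    """Estimate genome size (bp) from a k-mer depth histogram.

    histogram maps k-mer depth -> number of distinct k-mers at that depth.
    Depths below min_depth are treated as sequencing errors and ignored
    (min_depth is an assumed cutoff, not a value from the paper).
    """
    kept = {d: n for d, n in histogram.items() if d >= min_depth}
    if not kept:
        raise ValueError("no k-mers at or above min_depth")
    # The depth with the most distinct k-mers approximates the main coverage peak.
    peak_depth = max(kept, key=kept.get)
    # Total k-mer occurrences = sum over depths of depth * distinct k-mers.
    total_kmers = sum(d * n for d, n in kept.items())
    return total_kmers / peak_depth


# Hypothetical toy histogram: error k-mers at depth 1-2, main peak at depth 30,
# and a small repeat peak at depth 60.
toy_histogram = {
    1: 900_000, 2: 100_000,
    28: 2_000, 29: 5_000, 30: 30_000, 31: 5_000, 32: 2_000,
    60: 1_000,
}
size = estimate_genome_size(toy_histogram)
print(round(size))
```

Because repeated sequence contributes k-mers at multiples of the peak depth, it is counted proportionally to its copy number, which is why such estimates can differ from flow cytometry values by a few percent.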

Summary

Introduction

Rapeseed (Brassica napus) is the second most important oilseed crop in the world, but the genetic diversity underlying its massive phenotypic variation remains largely unexplored. The current B. napus genomes were primarily assembled on the basis of 454 GS-FLX+ Titanium and Sanger sequencing, next-generation sequencing (NGS) data or medium-coverage PacBio single-molecule real-time (SMRT) sequencing data. Their accuracy and completeness are unsatisfactory for identifying structural variations (SVs), which are major contributors to genetic diversity and play key roles in determining agronomic traits in many crop species[9,10]. Pan-genomes have been constructed on the basis of NGS technologies for major crops, including soybean, maize, rapeseed and rice, using different numbers of individuals[9,12,13,14]. These pan-genomes play important roles in the identification of SVs, including copy number variants (CNVs) and presence and absence variations (PAVs), that are associated with crop agronomic traits[10]. As a proof of concept for the importance of the pan-genome, we identified the causal PAVs that control silique length, seed weight and flowering time of oilseed rape.

