Abstract

Decoding complete genome sequences is a prerequisite for comprehensive genomics studies. However, the currently available reference genome sequences of Brassica rapa (A genome), B. oleracea (C) and B. napus (AC) cover 391, 540, and 850 Mbp and represent 80.6, 85.7, and 75.2% of the estimated genome sizes, respectively, while the remainder is hidden or unassembled because of the highly repetitive nature of these genome components. Here, we performed the first comprehensive genome-wide analysis using low-coverage whole-genome sequences to explore the hidden genome components based on characterization of major repeat families in the B. rapa and B. oleracea genomes. Our analysis revealed 10 major repeats (MRs), including a new family, comprising about 18.8, 10.8, and 11.5% of the A, C and AC genomes, respectively. Nevertheless, these 10 MRs represented less than 0.7% of each assembled reference genome. Genomic survey and molecular cytogenetic analyses validated our in silico analysis and also pointed to diversity, differential distribution, and evolutionary dynamics in the three Brassica species. Overall, our work elucidates hidden portions of three Brassica genomes, thus providing a resource for understanding the complete genome structures. Furthermore, we observed that asymmetrical accumulation of the major repeats might be a cause of diversification between the A and C genomes.
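The coverage figures quoted in the abstract imply an estimated total size and an unassembled remainder for each genome. The following is a minimal sketch of that arithmetic, not a calculation taken from the paper; the function names and the dictionary layout are illustrative assumptions, and only the assembled sizes and percentages come from the text.

```python
# Assembled size (Mbp) and share of the estimated genome size (%),
# as quoted in the abstract. Names below are illustrative assumptions.
ASSEMBLIES = {
    "B. rapa (A)": (391, 80.6),
    "B. oleracea (C)": (540, 85.7),
    "B. napus (AC)": (850, 75.2),
}


def estimated_genome_size(assembled_mbp: float, percent_covered: float) -> float:
    """Estimated total genome size implied by the assembled fraction."""
    return assembled_mbp / (percent_covered / 100)


def unassembled_size(assembled_mbp: float, percent_covered: float) -> float:
    """Portion of the estimated genome missing from the reference assembly."""
    return estimated_genome_size(assembled_mbp, percent_covered) - assembled_mbp


for genome, (assembled, percent) in ASSEMBLIES.items():
    total = estimated_genome_size(assembled, percent)
    hidden = unassembled_size(assembled, percent)
    print(f"{genome}: estimated {total:.0f} Mbp, unassembled {hidden:.0f} Mbp")
```

Under these figures, the implied estimated genome sizes are roughly 485, 630, and 1130 Mbp, leaving roughly 94, 90, and 280 Mbp outside the respective reference assemblies.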

Highlights

  • Members of the Brassicaceae represent one of the largest eudicot families, including about 338 genera and 3740 species, which have been highly diversified by complex whole genome duplication (WGD) and subsequent evolution

  • We previously demonstrated that de novo assembly using low-coverage, whole-genome sequences can be used for complete and simultaneous assembly of high-copy genomes such as the chloroplast and nuclear ribosomal DNA[50]

  • Contigs were ordered based on read depth, and initially, the top 50 high-depth contigs were selected for further repeat analysis
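The contig selection step described in the last highlight can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Contig` record, its field names, and the `top_contigs_by_depth` helper are hypothetical, and the depth values are made-up toy data. Only the idea of ordering contigs by read depth and keeping the top 50 comes from the text.

```python
from typing import List, NamedTuple


class Contig(NamedTuple):
    """Hypothetical record for an assembled contig."""
    name: str
    length_bp: int
    mean_depth: float  # average read depth across the contig


def top_contigs_by_depth(contigs: List[Contig], n: int = 50) -> List[Contig]:
    """Order contigs by read depth (highest first) and keep the top n."""
    return sorted(contigs, key=lambda c: c.mean_depth, reverse=True)[:n]


# Toy data for illustration only; real depths would come from read mapping.
contigs = [
    Contig("contig_1", 9000, 35.2),
    Contig("contig_2", 150000, 812.4),
    Contig("contig_3", 1200, 4.1),
    Contig("contig_4", 500, 120.0),
]

selected = top_contigs_by_depth(contigs, n=2)
for contig in selected:
    print(contig.name, contig.mean_depth)
```

Because high-copy sequences attract proportionally more reads in low-coverage data, ranking by depth is a simple way to surface candidate repeats; the default `n=50` mirrors the cutoff mentioned above.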


Summary

Introduction

Members of the Brassicaceae represent one of the largest eudicot families, including about 338 genera and 3740 species, which have been highly diversified by complex whole genome duplication (WGD) and subsequent evolution. Repetitive elements (REs) are major players in genome reorganization and stabilization during and after WGD events that disrupt nuclear homeostasis[8]. This concept, and the high genome diversity in Brassica, provides a good … Housekeeping nuclear ribosomal DNA (nrDNA) sequences are among the largest tandem array repeats[16]. They are localized in the peri-centromeric regions (5S nrDNA) and nucleolar organizer regions (45S nrDNA) of most plant species, including Brassica[17,18,19]. Transposable elements (TEs) are abundant and important for genome expansion, adaptation and evolution[26,27,28]. Based on their transposition mechanisms, TEs are classified into two major classes: I, retrotransposons, and II, DNA transposons[29]. In Brassica, asymmetric TE amplification may be important in genetic diversity, speciation, morphological differentiation and polyploidy adaptation[6,33].

