Genome sequences contain the fundamental genetic information that largely determines the biology of a species. Over the past 20 years, high-throughput sequencing technologies and bioinformatics tools have matured, facilitating genome assembly and ushering in the telomere-to-telomere (T2T) era. Bombyx mori is renowned as a silk-producing insect and is an important model organism studied extensively across many fields of biology. In this study, we present the first T2T genome assembly of B. mori, generated by integrating HiFi, ultra-long ONT, NGS, and Hi-C data. The assembly comprises 450,267,439 base pairs across 28 chromosomes and includes annotations for 18,253 protein-coding genes. BUSCO analysis indicated that 99.1% of conserved single-copy genes are present, and the consensus quality value (QV) estimated by Merqury was 59.88. Repeat sequences account for 60.77% of the assembly, the highest proportion reported for B. mori to date. Compared with previously published genomes, our assembly is more complete and of higher quality, particularly in highly homologous tandem regions such as telomeres, rDNA clusters, and Gr family regions. Furthermore, our experience with this assembly, including sample preparation and strategies for reducing assembly complexity, will provide a valuable reference for T2T genome assemblies of other species.