Abstract

De novo microbial genome sequencing reached a turning point with third-generation sequencing (TGS) platforms, and several microbial genomes have been improved by TGS long reads. Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168 and is used in the production of the traditional Japanese fermented food “natto.” The B. subtilis natto BEST195 genome was previously sequenced with short reads, but the resulting sequence included some incomplete regions. We resequenced the BEST195 genome using a PacBio RS sequencer and obtained a complete genome sequence as a single scaffold without any gaps; we also applied Illumina MiSeq short reads to enhance its quality. By comparing the new sequence with the previous BEST195 draft genome and the Marburg 168 genome, we found that the incomplete regions in the previous genome sequence were attributable to GC bias and repetitive sequences, and we identified some novel genes that are found only in the new genome.
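As an illustration of how regions prone to GC bias might be located in a genome sequence, the following minimal Python sketch computes GC content in sliding windows and flags windows with extreme values. It is not the analysis used in this study: the function names, window size, step, and thresholds are all hypothetical choices made for the example.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in seq (case-insensitive); 0.0 for empty input."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """Return (start, gc_fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def flag_extreme_windows(seq, window=1000, step=500, low=0.30, high=0.60):
    """Start positions of windows whose GC fraction lies outside [low, high].

    The default window, step, and thresholds are illustrative assumptions,
    not values taken from the paper.
    """
    return [s for s, gc in windowed_gc(seq, window, step) if gc < low or gc > high]
```

Applied to an assembled sequence, the flagged start positions could then be compared with the coordinates of gaps in an earlier draft assembly.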

Highlights

  • New sequencing technologies, called second-generation sequencing, have changed the landscape of whole-genome sequencing by reducing the cost of sequencing and increasing throughput exponentially over first-generation Sanger [1] sequencing

  • Genomic DNA was extracted from B. subtilis natto BEST195 [22], and whole-genome shotgun (WGS) sequences were obtained using PacBio RS and Illumina MiSeq

  • Long reads produced by third-generation sequencing (TGS) platforms decreased the difficulty of genome assembly, making it possible to obtain almost single digit scaffolds in de novo assemblies of organisms with small genomes

Introduction

New sequencing technologies, called second-generation sequencing, have changed the landscape of whole-genome sequencing by reducing the cost of sequencing and increasing throughput exponentially over first-generation Sanger [1] sequencing. Thanks to this revolution in DNA sequencing, many scientists can now attempt whole-genome shotgun (WGS) sequencing of any organism. A key remaining difficulty is the de novo assembly of the genome from short reads [4]. Recognizing repetitive sequences is needed for accurate de novo assembly of a genome, but most second-generation technologies produce relatively short reads, with maximum lengths of 300 bp using Illumina and 1000 bp using
