Bacillus subtilis is a gram-positive bacterium widely used as a model organism and industrial workhorse. In this study, we report the analysis of the newly sequenced complete genome of B. subtilis strain SRCM117797, along with a comparative genomic analysis of a large collection of B. subtilis genomes. Strain SRCM117797 has a 4,255,638 bp chromosome with 43.4% GC content and a high proportion of coding sequences associated with macromolecules, metabolism, and phage genes. Genomic diversity analysis of 232 B. subtilis strains identified eight clusters and three singletons. Of the 147 B. subtilis strains included, 89.12% had strain-specific genes, of which 6.75% encoded strain-specific insertion sequence family transposases. Our analysis pointed to a potential role for strain-specific insertion sequence family transposases in the intracellular accumulation of strain-specific genes. Furthermore, the chromosomal layout of the core genes was biased: they were overrepresented on the upper half of the chromosome (closer to the origin of replication), which may explain the fast-growing characteristics of B. subtilis. Overall, the study provides a complete genome sequence of B. subtilis strain SRCM117797, shows extensive genomic diversity among B. subtilis strains, and offers insights into strain diversification mechanisms and the non-random chromosomal layout of core genes.