16S rRNA gene sequencing is routinely used for species identification and phylogenetic studies of bacteria. However, high sequence similarity between some species and heterogeneity among gene copies within a single genome can limit its discriminatory power. In this study, we compared 16S rRNA gene sequences with genome-based analyses (core SNPs and ANI) for the identification of non-pathogenic Yersinia. We used complete and draft genomes of 373 Yersinia strains from the NCBI Genome database. For 34 genomes, the taxonomic affiliation inferred from core SNPs and ANI did not match the one specified in the GenBank database (NCBI). The intragenomic homology of the 16S rRNA gene copies exceeded 99.5% in complete genomes, but more than 50% of these genomes carried four or more variants of the gene. Among 327 draft genomes of non-pathogenic Yersinia, 11% lacked a full-length 16S rRNA gene. Most draft genomes contained a single copy of the gene, so their intragenomic heterogeneity could not be assessed. The average homology of the 16S rRNA gene was 98.76%, and the maximum variability was 2.85%. A low degree of genetic heterogeneity of the gene (0.36%) was found in the group Y. pekkanenii/Y. proxima/Y. aldovae/Y. intermedia/Y. kristensenii/Y. rochesterensis. Identical gene sequences were found in the genomes of Y. intermedia and Y. rochesterensis strains that had been identified using ANI and core SNP analyses. The phylogenetic tree based on 16S rRNA genes differed from the tree based on core SNPs of the genomes and did not reflect the phylogenetic relationships among Yersinia species. These findings help to fill gaps in the genomic characterization of understudied non-pathogenic Yersinia.
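The intragenomic figures reported above (percent homology between gene copies, number of distinct variants per genome) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the 16S rRNA gene copies of one genome have already been extracted and aligned to equal length, and it treats "homology" as simple percent identity over aligned columns. The function names `percent_identity` and `summarize_copies` and the toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same symbol.

    Assumption: `a` and `b` are pre-aligned (equal length, '-' for gaps).
    Columns that are gaps in both sequences are skipped; a gap against a
    base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def summarize_copies(copies: list[str]) -> dict:
    """Summarize intragenomic heterogeneity for one genome's 16S gene copies."""
    # Distinct sequences among the copies (case-insensitive).
    n_variants = len({c.upper() for c in copies})
    # All pairwise identities; empty when the genome has a single copy.
    identities = [percent_identity(a, b) for a, b in combinations(copies, 2)]
    return {
        "n_copies": len(copies),
        "n_variants": n_variants,
        # With one copy there is nothing to compare, so heterogeneity is undefined.
        "min_identity": min(identities) if identities else None,
    }


if __name__ == "__main__":
    # Toy 10-bp "copies" for illustration only; real 16S genes are ~1.5 kb.
    toy_copies = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
    print(summarize_copies(toy_copies))
```

A genome with a single gene copy returns `min_identity` of `None`, which mirrors why heterogeneity could not be assessed for most draft genomes.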