Abstract

Bacterial genomics is making an increasing contribution to medicine and public health microbiology. Accurate species identification of bacterial genomes is therefore an important task, particularly as the number of genomes stored in online databases increases rapidly and new species are frequently discovered. Existing database entries require regular re-evaluation to ensure that species annotations are consistent with the latest species definitions. We have developed an automated method for bacterial species identification that is an extension of ribosomal multilocus sequence typing (rMLST). The method calculates an 'rMLST nucleotide identity' (rMLST-NI) based on the nucleotides present in the protein-encoding ribosomal genes derived from bacterial genomes. rMLST-NI was used to validate the species annotations of 11839 publicly available Klebsiella and Raoultella genomes by comparison with a library of type strain genomes. rMLST-NI was compared with two whole-genome average nucleotide identity methods (OrthoANIu and FastANI) and the k-mer-based Kleborate software. The results of the four methods agreed across this dataset and identified a small number of entries (n=89) with species annotations that required updating. The rMLST-NI method was 3.5 times faster than Kleborate, 4.5 times faster than FastANI and 1600 times faster than OrthoANIu. rMLST-NI represents a fast and generic method for species identification using type strains as a reference.
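
The idea of scoring a query genome against type strain references by nucleotide identity over ribosomal protein genes can be illustrated with a minimal sketch. This is not the published rMLST-NI implementation, whose details the abstract does not give. It assumes the sequences for each locus are already aligned to equal length, pools matches over the loci shared by both genomes, and skips gap columns. The function names, the species labels and the toy sequences are all hypothetical.

```python
from typing import Dict, Tuple


def count_matches(a: str, b: str) -> Tuple[int, int]:
    """Return (matching positions, compared positions) for two pre-aligned
    sequences, skipping any column that contains a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    return sum(x == y for x, y in pairs), len(pairs)


def identity_over_loci(query: Dict[str, str], reference: Dict[str, str]) -> float:
    """Percent nucleotide identity pooled over loci present in both genomes.
    Assumption: pooling by position; the real method may weight loci differently."""
    matches = compared = 0
    for locus in set(query) & set(reference):
        m, c = count_matches(query[locus], reference[locus])
        matches += m
        compared += c
    return 100.0 * matches / compared if compared else 0.0


def best_type_strain(query: Dict[str, str],
                     library: Dict[str, Dict[str, str]]) -> Tuple[str, float]:
    """Return the library entry with the highest identity to the query."""
    scores = {name: identity_over_loci(query, ref) for name, ref in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: two ribosomal protein loci, two hypothetical type strains.
query = {"rpsA": "ACGTACGT", "rplB": "GGCCGGCC"}
library = {
    "Species A": {"rpsA": "ACGTACGT", "rplB": "GGCCGGCA"},
    "Species B": {"rpsA": "ACGAACGA", "rplB": "GGTTGGCC"},
}
print(best_type_strain(query, library))
```

In practice a species call would also need an identity threshold below which no type strain is accepted as a match. The abstract does not state such a threshold, so none is assumed here.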
