Shigella is one of the commonest causes of diarrhoea worldwide and a major public health problem. Shigella serotyping is based on a standardized scheme that splits Shigella strains into four serogroups and 60 serotypes on the basis of biochemical tests and O-antigen structures. This conventional serotyping method is laborious, time-consuming, impossible to automate, and requires a high level of expertise. Whole-genome sequencing (WGS) is becoming more affordable and is now used for routine surveillance, opening up possibilities for the development of much-needed accurate rapid typing methods. Here, we describe ShigaPass, a new in silico tool for predicting Shigella serotypes from WGS assemblies on the basis of rfb gene cluster DNA sequences, phage and plasmid-encoded O-antigen modification genes, seven housekeeping genes (EnteroBase's MLST scheme), fliC alleles and clustered regularly interspaced short palindromic repeats (CRISPR) spacers. Using 4879 genomes, including 4716 reference strains and clinical isolates of Shigella characterized with a panel of biochemical tests and serotyped by slide agglutination, we show here that ShigaPass outperforms all existing in silico tools, particularly for the identification of Shigella boydii and Shigella dysenteriae serotypes, with a correct serotype assignment rate of 98.5 % and a sensitivity rate (i.e. ability to make any prediction) of 100 %.