Abstract

Shiga toxin-producing Escherichia coli (STEC) have more than 470 serotypes. The well-known STEC O157:H7 serotype is a leading cause of STEC infections in humans. However, the incidence of non-O157:H7 STEC serotypes associated with foodborne outbreaks and human infections has increased in recent years. Current detection and serotyping assays focus on O157 and the top six (“Big six”) non-O157 STEC serogroups. In this study, we performed phylogenetic analysis of nearly 41,000 publicly available STEC genomes representing 460 different STEC serotypes and identified 19 major and 229 minor STEC clusters. STEC cluster-specific gene markers were then identified through comparative genomic analysis. We further identified serotype-specific gene markers for the 10 most frequent non-O157:H7 STEC serotypes. The cluster- or serotype-specific gene markers had 99.54% accuracy and more than 97.25% specificity when tested using 38,534 STEC and 14,216 non-STEC E. coli genomes, respectively. In addition, we developed a freely available in silico serotyping pipeline named STECFinder that combines these robust gene markers with established E. coli serotype-specific O and H antigen genes and stx genes for accurate identification, cluster determination and serotyping of STEC. STECFinder can assign 99.85% and 99.83% of 38,534 STEC isolates to STEC clusters using assembled genomes and Illumina reads, respectively, and can simultaneously predict stx subtypes and STEC serotypes. Using shotgun metagenomic sequencing reads of STEC-spiked food samples from a published study, we demonstrated that STECFinder can accurately detect the spiked STEC serotypes. The cluster/serotype-specific gene markers could also be adapted for culture-independent typing, facilitating rapid STEC typing. STECFinder is available as an installable package (https://github.com/LanLab/STECFinder) and will be useful for in silico STEC cluster identification and serotyping using genome data.

Highlights

  • Shiga toxin-producing Escherichia coli (STEC) are an important cause of foodborne disease worldwide (Tuttle et al, 1999; Teunis et al, 2008; World Health Organization, 2019)

  • Multilocus sequence typing (MLST) assigned the 41,101 STEC isolates to 817 sequence types (STs); 202 isolates were not typed by MLST. Of these STs, 368 were represented by a single isolate, 424 were represented by two to 100 isolates each and together accounted for 12% of the STEC isolates, and 25 contained more than 100 isolates each and encompassed 86.61% of the STEC isolates. ST11 was the largest, accounting for 37.12% of the STEC isolates, followed by ST21 (14.71%), ST17 (11.91%), ST16 (6.72%), ST655 (2.71%) and ST32 (2.46%). Ribosomal MLST (rMLST) divided the 41,101 STEC isolates into 2,911 ribosomal STs (rSTs); 12,208 isolates were not typed by rMLST.

  • We showed that STECFinder detected the cluster/serotype-specific gene marker sets of interest in the spiked food samples, using shotgun metagenomic sequencing reads from the study of Buytaers et al (2020)
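
The idea of assigning an isolate to a cluster from the presence of a cluster-specific marker set can be illustrated with a minimal sketch. This is not STECFinder's implementation: the cluster names, gene identifiers and the rule "every marker in the set must be detected, and only one cluster may match" are hypothetical assumptions chosen for illustration only.

```python
from typing import Dict, Optional, Set

# Hypothetical marker sets (cluster name -> marker gene identifiers).
# These are placeholders, not real STECFinder clusters or markers.
MARKER_SETS: Dict[str, Set[str]] = {
    "cluster_A": {"geneA1", "geneA2"},
    "cluster_B": {"geneB1"},
}


def assign_cluster(detected: Set[str],
                   marker_sets: Dict[str, Set[str]]) -> Optional[str]:
    """Return the one cluster whose full marker set is detected, else None.

    Assumed rule: a cluster matches only if all of its markers are present;
    zero matches or more than one match is reported as unassigned (None).
    """
    hits = [name for name, markers in marker_sets.items()
            if markers <= detected]  # subset test: all markers detected
    return hits[0] if len(hits) == 1 else None
```

Under these assumptions, a genome in which both `geneA1` and `geneA2` are detected is assigned to `cluster_A`, while a genome carrying only one of the two markers, or the complete marker sets of two clusters, is left unassigned.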


Summary

Introduction

Shiga toxin-producing Escherichia coli (STEC) are an important cause of foodborne disease worldwide (Tuttle et al, 1999; Teunis et al, 2008; World Health Organization, 2019). STEC cause human infections ranging from mild non-bloody diarrhea to haemorrhagic colitis (HC), haemolytic uraemic syndrome (HUS), end-stage renal disease (ESRD) and death (Paton and Paton, 1998; Tarr et al, 2005; Gould et al, 2009). STEC O157:H7 is the most frequent STEC serotype associated with foodborne outbreaks and human infections (Bettelheim, 2000; Qin et al, 2015; Li et al, 2017). Among non-O157:H7 STEC serotypes, six serogroups, O26, O45, O103, O111, O121 and O145, known as the “Big six” (comprising nine serotypes: O26:H11/H-; O45:H2; O103:H2, H11, H25; O111:H8/H-; O121:H19 or H7; and O145:H28/H-), account for over 70% of non-O157:H7 STEC infections (Brooks et al, 2005; Hedican et al, 2009; Bosilevac and Koohmaraie, 2011).
