Abstract

The lipopolysaccharide (O) and flagellar (H) surface antigens of Escherichia coli are targets for serotyping that have traditionally been used to identify pathogenic lineages. These surface antigens are important for the survival of E. coli within mammalian hosts. However, traditional serotyping has several limitations, and public health reference laboratories are increasingly moving towards whole genome sequencing (WGS) to characterize bacterial isolates. Here we present a method to rapidly and accurately serotype E. coli isolates from raw short-read WGS data. Our approach bypasses the need for de novo genome assembly by directly screening WGS reads against a curated database of alleles linked to known and novel E. coli O-groups and H-types (the EcOH database) using the software package srst2. We validated the approach by comparing in silico results for 197 enteropathogenic E. coli isolates with those obtained by serological phenotyping in an independent laboratory. We then demonstrated the utility of our method to characterize isolates in public health and clinical settings, and to explore the genetic diversity of >1500 E. coli genomes from multiple sources. Importantly, we showed that transfer of O- and H-antigen loci between E. coli chromosomal backbones is common, with little evidence of constraints by host or pathotype, suggesting that E. coli ‘strain space’ may be virtually unlimited, even within specific pathotypes. Our findings show that serotyping is most useful when combined with strain genotyping to characterize microevolution events within an inferred population structure.

Highlights

  • Escherichia coli is a Gram-negative bacillus that is a gut commensal, as well as a leading cause of diarrhoea, foodborne outbreaks globally and various extra-intestinal infections

  • This study sought to explore the serotype diversity in a large collection of E. coli genomic read sets. To this end we used three datasets: 185 atypical EPEC (aEPEC) isolates, 362 enterotoxigenic E. coli (ETEC) isolates and 1000 isolates from GenomeTrakr, giving a total of 1547 E. coli genomes for analysis

  • This study has shown that E. coli O- and H-genotypes can be rapidly and accurately extracted directly from whole genome sequence (WGS) reads

Introduction

Escherichia coli is a Gram-negative bacillus that is a gut commensal, as well as a leading cause of diarrhoea, foodborne outbreaks globally and various extra-intestinal infections. Differentiation of E. coli isolates has traditionally been performed by serological typing (serotyping) of the highly polymorphic somatic (O) and flagellar (H) antigens to identify pathogenic lineages of E. coli (pathotypes) (DebRoy et al, 2011; Robins-Browne, 1987; Wang et al, 2003). There are 182 E. coli O-groups and 53 H-types recognized by traditional serotyping (Croxen et al, 2013; Iguchi et al, 2014; Joensen et al, 2015). Several serotypes are considered to be markers of pathogenic E. coli and are routinely screened for in public health and food industry settings.
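
The abstract notes that serotyping is most informative when combined with strain genotyping. As a minimal, illustrative sketch (not the authors' pipeline), the Python below assumes that O and H calls have already been obtained for each isolate, for example from read screening, together with some strain genotype label. It then tabulates which serotypes appear on each chromosomal backbone. The field names, the `ONT`/`HNT` placeholders for missing calls, and the toy records are all assumptions made for illustration and are not data from the study.

```python
from collections import defaultdict


def serotype_label(o_call, h_call):
    """Join O and H calls into an 'O:H' label.

    Missing calls (None) are written as 'ONT' / 'HNT'; this placeholder
    convention is an assumption of the sketch.
    """
    return f"{o_call or 'ONT'}:{h_call or 'HNT'}"


def serotypes_per_backbone(isolates):
    """Map each genotype label to the sorted list of distinct serotypes seen on it."""
    groups = defaultdict(set)
    for iso in isolates:
        groups[iso["genotype"]].add(serotype_label(iso["o"], iso["h"]))
    return {genotype: sorted(labels) for genotype, labels in groups.items()}


# Toy records with made-up identifiers (hypothetical, for illustration only).
isolates = [
    {"id": "iso1", "genotype": "ST_A", "o": "O1", "h": "H7"},
    {"id": "iso2", "genotype": "ST_A", "o": "O2", "h": "H7"},
    {"id": "iso3", "genotype": "ST_B", "o": "O1", "h": "H7"},
    {"id": "iso4", "genotype": "ST_B", "o": None, "h": "H7"},
]

table = serotypes_per_backbone(isolates)
for genotype, labels in sorted(table.items()):
    print(genotype, labels)
```

In a table like this, a backbone carrying several serotypes, or one serotype appearing on several backbones, would be the kind of pattern that points to possible transfer of O- or H-antigen loci and merits closer inspection.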

