Abstract

Corynebacterium diphtheriae is highly transmissible and can cause large diphtheria outbreaks where vaccination coverage is insufficient. Sporadic cases or small clusters are observed in high-vaccination settings. The phylogeography and short-timescale evolution of C. diphtheriae are not well understood, in part due to a lack of harmonized analytical approaches to genomic surveillance and strain tracking. We combined 1,305 genes with highly reproducible allele calls into a core genome multilocus sequence typing (cgMLST) scheme. We analyzed cgMLST gene diversity among 602 isolates from sporadic clinical cases, small clusters, or large outbreaks. We defined sublineages based on the phylogenetic structure within C. diphtheriae and strains based on the highest number of cgMLST mismatches within documented outbreaks. We performed time-scaled phylogenetic analyses of major sublineages. The cgMLST scheme showed a high allele call rate in C. diphtheriae and the closely related species C. belfantii and C. rouxii. We demonstrate its utility to delineate epidemiological case clusters and outbreaks using a 25-mismatch threshold and reveal a number of cryptic transmission chains, most of which are geographically restricted to one or a few adjacent countries. Subcultures of the vaccine strain PW8 differed by up to 20 cgMLST mismatches. Phylogenetic analyses revealed short-timescale evolutionary gain or loss of the diphtheria toxin and biovar-associated genes. We devised a genomic taxonomy of strains and deeper sublineages (defined using a 500-cgMLST-mismatch threshold), currently comprising 151 sublineages, only a few of which are geographically widespread based on current sampling. The cgMLST genotyping tool and nomenclature were made publicly accessible (https://bigsdb.pasteur.fr/diphtheria). Standardized genome-scale strain genotyping will help trace transmission and geographic spread of C. diphtheriae. The unified genomic taxonomy of C. diphtheriae strains provides a common language for studies of ecology, evolution, and virulence heterogeneity among C. diphtheriae sublineages.
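
To make the threshold-based grouping concrete, the following is a minimal sketch of how isolates might be grouped from cgMLST allelic profiles. It is not the authors' pipeline. Several points are assumptions made for illustration: profiles are lists of allele numbers with None marking an uncalled locus, loci with a missing call in either isolate are skipped when counting mismatches, the threshold is treated as inclusive, and grouping uses single linkage. The function names are hypothetical, and the toy profiles have 6 loci rather than 1,305, so the thresholds are scaled down accordingly.

```python
from itertools import combinations

MISSING = None  # assumption: an uncalled allele is stored as None


def mismatches(p, q):
    """Count loci where both profiles have a called allele and the alleles differ."""
    return sum(
        1
        for a, b in zip(p, q)
        if a is not MISSING and b is not MISSING and a != b
    )


def single_linkage(profiles, threshold):
    """Group isolates so that any two in a group are linked by a chain of
    pairs differing by at most `threshold` mismatches (inclusive)."""
    parent = {name: name for name in profiles}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if mismatches(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Toy data: 4 isolates typed at 6 loci (hypothetical values).
toy_profiles = {
    "A": [1, 1, 1, 1, 1, 1],
    "B": [1, 1, 1, 1, 1, 2],
    "C": [1, 1, 1, 2, 2, 2],
    "D": [3, 3, 3, 3, 3, None],
}

print(single_linkage(toy_profiles, threshold=1))
print(single_linkage(toy_profiles, threshold=2))
```

With real profiles, the same function could be called once with a threshold of 25 to approximate strain-level clusters and once with 500 for sublineages. Note that single linkage can chain distant isolates together through intermediates, which is one reason the linkage rule is flagged here as an assumption.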

Highlights

  • Corynebacterium diphtheriae is highly transmissible and can cause large diphtheria outbreaks where vaccination coverage is insufficient

  • Because loci with high numbers of missing alleles were filtered out during construction of the core genome multilocus sequence typing (cgMLST) scheme, the allele call rate of the resulting scheme was assessed using an independent set of genomes that had not been used to define the scheme

  • We defined a set of 1,305 protein-coding gene loci deemed appropriate for C. diphtheriae, C. belfantii and C. rouxii genotyping
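
As a rough illustration of the allele call rate mentioned above, the sketch below computes, for each genome, the fraction of scheme loci with a called allele, and flags loci that are frequently uncalled. This is a sketch under stated assumptions, not the published procedure: None marks a missing call, the function names are hypothetical, and `max_missing_fraction` is an illustrative parameter whose value is not taken from the source.

```python
def call_rate(profile):
    """Fraction of loci in one genome's profile that have a called allele."""
    if not profile:
        raise ValueError("profile must contain at least one locus")
    return sum(allele is not None for allele in profile) / len(profile)


def frequently_missing_loci(profiles, max_missing_fraction):
    """Return indices of loci uncalled in more than `max_missing_fraction`
    of genomes (hypothetical cutoff, for illustration only)."""
    n_genomes = len(profiles)
    n_loci = len(profiles[0])
    flagged = []
    for locus in range(n_loci):
        missing = sum(p[locus] is None for p in profiles)
        if missing / n_genomes > max_missing_fraction:
            flagged.append(locus)
    return flagged


# Toy validation set: 3 genomes typed at 4 loci (hypothetical values).
validation = [
    [5, 2, None, 7],
    [5, 3, None, 7],
    [6, 2, 1, None],
]

print([call_rate(p) for p in validation])
print(frequently_missing_loci(validation, max_missing_fraction=0.5))
```

Measuring the call rate on genomes that were not used to build the scheme avoids an optimistic estimate, since loci that performed poorly on the construction set have already been filtered out.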


Summary

Background

Corynebacterium diphtheriae is highly transmissible and can cause large diphtheria outbreaks where vaccination coverage is insufficient. The phylogeography and short-timescale evolution of C. diphtheriae are not well understood, in part due to a lack of harmonized analytical approaches to genomic surveillance and strain tracking.

