Abstract

Whole genome sequencing (WGS) is increasingly recognized as the most informative approach for characterization of bacterial isolates. Successful routine use of this technology in public health laboratories depends on the availability of well-characterized and verified data analysis methods. However, multiple subtyping workflows are now often used for a single organism, and the differences between them are not always well described. Moreover, methodologies for comparing subtyping workflows and assessing their performance are only beginning to emerge. The current work focuses on the detailed comparison of WGS-based subtyping workflows and the evaluation of their suitability for the organism and the research context in question. We evaluated the performance of pipelines used for subtyping of Neisseria meningitidis, including the currently widely applied core genome multilocus sequence typing (cgMLST) approach and different single nucleotide polymorphism (SNP)-based methods. In addition, the impact of using different tools for detection and filtering of recombinant regions, and of using different reference genomes, was tested. Our benchmarking analysis included both an assessment of the technical performance of the pipelines and a functional comparison of the generated genetic distance matrices and phylogenetic trees. It was carried out using replicate sequencing datasets of high and low coverage, consisting mainly of isolates belonging to clonal complex 269. We demonstrated that cgMLST and some of the SNP-based subtyping workflows showed very good performance characteristics and produced highly similar genetic distance matrices and phylogenetic trees for isolates belonging to the same clonal complex. However, only two of the tested workflows demonstrated reproducible results for a group of more closely related isolates. Additionally, the results of the SNP-based subtyping workflows were to some extent dependent on the reference genome used. Interestingly, the use of recombination-filtering software generally reduced the similarity between the gene-by-gene and SNP-based methodologies for subtyping of N. meningitidis. Our study, in which N. meningitidis was taken as an example, clearly highlights the need for more comparative benchmarking studies to eventually support the justified use of a specific WGS data analysis workflow within an international public health laboratory context.

Highlights

  • Whole genome sequencing (WGS) is becoming increasingly recognized in a public health context as a single-shot method to determine species (Wood and Salzberg, 2014; Petersen et al, 2017), serotype (Joensen et al, 2015; Yoshida et al, 2016), antibiotic resistance (McDermott et al, 2016; Eyre et al, 2017) and virulence characteristics of pathogens (Schreiber et al, 2017; Mason et al, 2018)

  • The most widely used approaches for extraction of high-resolution subtyping and relatedness information from WGS data can be grouped into methods based on core genome/whole genome multilocus sequence typing, termed gene-by-gene approaches, and methods based on single nucleotide polymorphism (SNP) detection (ECDC, 2016)

  • The WGS-based analysis of the N. meningitidis isolate selection showed that B:NT isolates belonging to cc269, together with three B:NT:P1.14 isolates of undetermined clonal complex, formed a separate phylogenetic clade, confirming the previous findings of Bertrand et al (2011)

Introduction

Whole genome sequencing (WGS) is becoming increasingly recognized in a public health context as a single-shot method to determine species (Wood and Salzberg, 2014; Petersen et al, 2017), serotype (Joensen et al, 2015; Yoshida et al, 2016), antibiotic resistance (McDermott et al, 2016; Eyre et al, 2017) and virulence characteristics of pathogens (Schreiber et al, 2017; Mason et al, 2018). This technology allows the genetic differences between isolates to be determined with the highest precision; these differences are used for subtyping and to create phylogenies for surveillance and epidemiologic investigations of disease outbreaks (Qiu et al, 2015; Jackson et al, 2016; Durand et al, 2018). In SNP-based analyses, recombinant regions are often filtered out using specific recombination detection tools such as ClonalFrameML (Didelot and Wilson, 2015) and Gubbins (Page et al, 2014). cgMLST methods, in contrast, are relatively robust to such evolutionary events, collapsing regions with high SNP densities into a small number of allelic changes.
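
As a rough illustration of the contrast between SNP-based and gene-by-gene distances, the following Python sketch compares two toy isolates locus by locus. The locus names, sequences and function names are hypothetical and are not taken from the study; the sketch assumes pre-aligned loci of equal length, whereas real workflows operate on whole-genome data and curated allele schemes.

```python
from typing import Dict

# Hypothetical toy data: each isolate maps a locus name to its aligned sequence.
Isolate = Dict[str, str]


def snp_distance(a: Isolate, b: Isolate) -> int:
    """Count differing nucleotide positions across all shared loci.

    Assumes the sequences of a shared locus are aligned and of equal length.
    """
    total = 0
    for locus in a.keys() & b.keys():
        total += sum(x != y for x, y in zip(a[locus], b[locus]))
    return total


def allele_distance(a: Isolate, b: Isolate) -> int:
    """Count shared loci whose sequences differ at all (gene-by-gene view).

    Any number of SNPs within one locus counts as a single allelic change.
    """
    return sum(a[locus] != b[locus] for locus in a.keys() & b.keys())


isolate_1 = {
    "locus_A": "ATGGCTAAGC",
    "locus_B": "ATGCCGTTAA",
    "locus_C": "ATGAAACTGA",
}
isolate_2 = {
    "locus_A": "ATGGCTAAGC",  # identical to isolate_1
    "locus_B": "ATGCCGTTAG",  # one point mutation
    "locus_C": "ATGTTTGTCA",  # dense cluster of SNPs, as a recombination event might leave
}

print("SNP distance:   ", snp_distance(isolate_1, isolate_2))
print("Allele distance:", allele_distance(isolate_1, isolate_2))
```

In this toy case the SNP-dense locus_C dominates the SNP distance but contributes only one difference to the allele distance, which is the sense in which a gene-by-gene scheme collapses such regions.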
