Abstract

Multilocus sequence typing (MLST) is an effective method to describe bacterial populations. Conventionally, MLST involves Polymerase Chain Reaction (PCR) amplification of housekeeping genes followed by Sanger DNA sequencing. Public Health England (PHE) is in the process of replacing the conventional MLST methodology with a method based on short-read sequence data derived from Whole Genome Sequencing (WGS). This paper compares the reliability of MLST results derived from WGS data by mapping- and assembly-based approaches with those of the conventional method, using 323 bacterial genomes of diverse species. The sensitivity of the two WGS-based methods was further investigated with 26 mixed and 29 low-coverage genomic data sets from Salmonella enteritidis and Streptococcus pneumoniae. Of the 323 samples, full MLST profiles were derived for 92.9% (n = 300), 97.5% (n = 315) and 99.7% (n = 322) by the conventional method, the assembly-based approach and the mapping-based approach, respectively. For the samples typed by the conventional method (92.9%), concordance with both WGS methods was 100%. From the 55 mixed and low-coverage genomes, full MLST profiles were derived for 89.1% (n = 49) and 67.3% (n = 37) by the mapping- and assembly-based approaches, respectively. In conclusion, deriving MLST from WGS data is more sensitive than the conventional method. Of the two WGS-based methods, the mapping-based approach was the more sensitive. In addition, the mapping-based approach described here derives quality metrics, which are difficult to determine quantitatively using the conventional and WGS assembly-based approaches.

Highlights

  • The process of whole genome sequencing (WGS) has benefited from recent advances collectively known as next-generation sequencing, allowing high-throughput sequencing of bacterial genomes at low financial cost

  • For 21 Campylobacter sp. and 2 Streptococcus pneumoniae samples, a full multilocus sequence typing (MLST) profile was not returned via the conventional method due to poor sequence quality

  • Having established the superiority of WGS-based methods, we went on to compare the performance of two WGS data analysis approaches to determine their accuracy on samples that contained more than one organism and on low-coverage data

Summary

Introduction

The process of whole genome sequencing (WGS) has benefited from recent advances collectively known as next-generation sequencing, allowing high-throughput sequencing of bacterial genomes at low financial cost. As a result, WGS has become a viable alternative to some traditional typing methods for public health infectious disease surveillance. De novo assembly/BLAST-based approaches work by assembling short reads into longer contiguous sequences (contigs) and comparing these contigs to a reference allele database using BLAST to assign an MLST type. Mapping-based approaches allow the calculation of metrics for each designated allele to assess the quality of the match (Inouye et al., 2012; Inouye et al., 2014).
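The allele-to-type lookup behind the assembly-based approach can be illustrated with a minimal sketch. It is not the pipeline used in the paper: exact substring matching stands in for BLAST, reverse-complement matches are ignored, and the locus names, allele sequences and sequence types (`ALLELES`, `PROFILES`, `call_alleles`, `assign_st`) are hypothetical toy values invented for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy scheme: two loci, each with two known alleles.
# Real MLST schemes typically use around seven housekeeping genes.
ALLELES: Dict[str, Dict[str, int]] = {
    "locusA": {"ACGTACGT": 1, "ACGTACGA": 2},
    "locusB": {"TTGACCAA": 1, "TTGACCAT": 2},
}

# Hypothetical profile table: allele numbers (ordered by locus name) -> sequence type.
PROFILES: Dict[tuple, str] = {(1, 1): "ST1", (2, 1): "ST2"}


def call_alleles(contigs: List[str]) -> Dict[str, Optional[int]]:
    """Assign an allele number per locus by exact substring match.

    Exact matching is a simplified stand-in for a BLAST search of the
    contigs against the allele database.
    """
    calls: Dict[str, Optional[int]] = {}
    for locus, alleles in ALLELES.items():
        calls[locus] = None  # stays None if no known allele is found
        for sequence, number in alleles.items():
            if any(sequence in contig for contig in contigs):
                calls[locus] = number
                break
    return calls


def assign_st(calls: Dict[str, Optional[int]]) -> Optional[str]:
    """Return the sequence type for a full profile, else None."""
    if any(number is None for number in calls.values()):
        return None  # incomplete profile: at least one locus was not typed
    key = tuple(calls[locus] for locus in sorted(ALLELES))
    return PROFILES.get(key)  # None if the allele combination is not in the table


if __name__ == "__main__":
    contigs = ["GGGACGTACGTGGG", "CCTTGACCAACC"]
    print(assign_st(call_alleles(contigs)))
```

In this sketch a sample with an untyped locus yields no sequence type, which mirrors the notion of a "full MLST profile" used above. A mapping-based approach would additionally record per-allele metrics, such as read depth, which this toy lookup does not model.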

