Abstract

Multiple-Locus Variable Number of Tandem Repeats (VNTR) Analysis (MLVA) is widely used by laboratory-based surveillance networks for subtyping pathogens that cause foodborne and waterborne disease outbreaks. However, Whole Genome Sequencing (WGS) has recently emerged as a new, more powerful reference for pathogen subtyping, which makes a data conversion method necessary so that users can compare MLVA profiles obtained by either method. The MLVAType Shiny application was designed to extract MLVA profiles of Vibrio cholerae isolates from WGS data while ensuring backward compatibility with traditional MLVA typing methods. To test and validate the MLVAType algorithm, WGS-derived MLVA profiles of nineteen Vibrio cholerae isolates from the Democratic Republic of the Congo (n = 9) and Uganda (n = 10) were compared to MLVA profiles generated by an in silico PCR approach and by Sanger sequencing, the latter being used as the reference method. Results obtained by Sanger sequencing and MLVAType were fully concordant. However, the MLVAType results were affected by censored estimations, whose percentage was inversely proportional to the k-mer size used during genome assembly. With a k-mer size of 127, fewer than 15% of the V. cholerae VNTR estimates were censored. Censored estimation could only be prevented by using a longer k-mer size (i.e. 175), which is not available in SPAdes v.3.13.0. As NGS read lengths and qualities tend to increase over time, larger k-mer sizes can be expected to become available in the near future. Using the MLVAType application with a longer k-mer size will then efficiently retrieve MLVA profiles from WGS data while avoiding censored estimation.




Introduction

Rapid molecular typing of pathogens associated with human and animal diseases has proven instrumental in the surveillance and control of infectious diseases [1, 2]. Pulsed-field gel electrophoresis (PFGE), long considered the gold standard for molecular typing of outbreak-associated pathogens, has been superseded by Multi-Locus Sequence Typing (MLST) or Multi-Locus Variable Number of Tandem Repeats (VNTR) Analysis (MLVA), and more recently by Whole Genome Sequencing (WGS) [3]. In silico methods that extract low-throughput typing results (e.g., MLST or MLVA) from WGS data should be developed so that users can compare subtyping results irrespective of the methodology and time of data acquisition. Both parameters are important when WGS data need to be compared with data generated before the WGS era.
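
To illustrate what extracting an MLVA result from WGS data can involve, the following Python sketch counts the copies of a repeat motif in an assembled contig and flags the estimate as censored when the repeat array reaches the edge of the contig, in which case the count is only a lower bound. This is not the MLVAType algorithm, which is not described in this section. The function name `count_tandem_repeats`, the `min_flank` threshold, the motif and the flanking sequences are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RepeatEstimate:
    copies: int      # number of consecutive motif copies found
    censored: bool   # True if the array touches a contig edge (lower bound only)


def count_tandem_repeats(contig: str, motif: str, min_flank: int = 10) -> RepeatEstimate:
    """Find the longest run of exact, consecutive motif copies in a contig.

    Hypothetical helper: the estimate is flagged as censored when fewer than
    `min_flank` bases of non-repeat sequence remain on either side of the run,
    i.e. when the contig may have been broken inside the repeat array.
    """
    if not motif:
        raise ValueError("motif must not be empty")
    contig, motif = contig.upper(), motif.upper()

    best_copies, best_start = 0, -1
    start = contig.find(motif)
    while start != -1:
        # Count how many motif copies follow each other from this position.
        copies, pos = 0, start
        while contig.startswith(motif, pos):
            copies += 1
            pos += len(motif)
        if copies > best_copies:
            best_copies, best_start = copies, start
        start = contig.find(motif, start + 1)

    if best_copies == 0:
        return RepeatEstimate(0, False)  # motif absent from this contig

    best_end = best_start + best_copies * len(motif)
    censored = best_start < min_flank or len(contig) - best_end < min_flank
    return RepeatEstimate(best_copies, censored)


# Illustrative sequences only; they are not taken from the study.
MOTIF = "AACAGA"
LEFT_FLANK = "GATTACAGGCTT"
RIGHT_FLANK = "CCGTTAGCATGG"

complete = LEFT_FLANK + MOTIF * 4 + RIGHT_FLANK   # array fully spanned by the contig
truncated = MOTIF * 3 + RIGHT_FLANK               # contig starts inside the array

print(count_tandem_repeats(complete, MOTIF))
print(count_tandem_repeats(truncated, MOTIF))
```

In this toy example the complete contig yields four copies with no censoring, while the truncated contig yields three copies flagged as censored. Censoring of this kind is expected when an assembler cannot span a repeat array that is longer than the k-mer size, which is consistent with the dependence on k-mer size reported in the abstract.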

