Abstract

Background
Multiple-Locus Variable Number of Tandem Repeats (VNTR) Analysis (MLVA) is widely used by laboratory-based surveillance networks for subtyping pathogens that cause foodborne and water-borne disease outbreaks. However, Whole Genome Sequencing (WGS) has recently emerged as the new, more powerful reference for pathogen subtyping, which makes it necessary to have a data conversion method that lets users compare MLVA profiles obtained by either approach. The MLVAType shiny application was designed to extract MLVA profiles of Vibrio cholerae isolates from WGS data while ensuring backward compatibility with traditional MLVA typing methods.

Methods
To test and validate the MLVAType algorithm, WGS-derived MLVA profiles of nineteen Vibrio cholerae isolates from the Democratic Republic of the Congo (n = 9) and Uganda (n = 10) were compared to MLVA profiles generated by an in silico PCR approach and by Sanger sequencing, the latter being used as the reference method.

Results
Results obtained by Sanger sequencing and MLVAType were fully concordant. However, the MLVAType results were affected by censored estimates, whose percentage was inversely related to the k-mer size used during genome assembly. With a k-mer of 127, fewer than 15% of the V. cholerae VNTR estimates were censored. Preventing censored estimates was only achievable with a longer k-mer size (i.e. 175), which is not available in the SPAdes v.3.13.0 software.

Conclusion
As NGS read lengths and qualities tend to increase with time, an increase in the available k-mer size can be expected in the near future. Using the MLVAType application with a longer k-mer size will then efficiently retrieve MLVA profiles from WGS data while avoiding censored estimates.
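
To illustrate what a censored estimate means here, the following is a minimal sketch and not the MLVAType implementation. It assumes that a VNTR locus is located in an assembled contig by a left flanking sequence, that copies of the repeat motif are counted from there, and that the count is flagged as censored (a lower bound only) when the expected right flank does not follow the repeat run, for example because the contig breaks inside the repeat array. The function name `estimate_repeats`, the `RepeatEstimate` type, and the motif and flank sequences are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class RepeatEstimate(NamedTuple):
    copies: int      # number of complete motif copies observed
    censored: bool   # True if the count is only a lower bound


def estimate_repeats(contig: str, motif: str,
                     left_flank: str, right_flank: str) -> RepeatEstimate:
    """Count tandem copies of `motif` directly after `left_flank`.

    Hypothetical helper: the estimate is marked censored when the run of
    repeats is not immediately followed by `right_flank`.
    """
    start = contig.find(left_flank)
    if start == -1:
        raise ValueError("left flank not found in contig")

    pos = start + len(left_flank)
    copies = 0
    # Walk along the contig one motif at a time.
    while contig.startswith(motif, pos):
        copies += 1
        pos += len(motif)

    # Without the right flank, the true repeat count may be larger.
    censored = not contig.startswith(right_flank, pos)
    return RepeatEstimate(copies, censored)


# Illustrative sequences (not real V. cholerae flanks).
LEFT, MOTIF, RIGHT = "ACGTTT", "AACAGA", "GGCCTA"

complete = LEFT + MOTIF * 3 + RIGHT   # repeat array fully spanned
truncated = LEFT + MOTIF * 2 + "AAC"  # contig ends inside the array

print(estimate_repeats(complete, MOTIF, LEFT, RIGHT))
# RepeatEstimate(copies=3, censored=False)
print(estimate_repeats(truncated, MOTIF, LEFT, RIGHT))
# RepeatEstimate(copies=2, censored=True)
```

Under these assumptions, a longer assembly k-mer makes it more likely that a contig spans the whole repeat array, which is consistent with the reported drop in censored estimates as the k-mer size increases.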

