Abstract

Background: Tuberculosis (TB) resulted in an estimated 1.7 million deaths in 2016. The disease is caused by members of the Mycobacterium tuberculosis complex, which includes Mycobacterium tuberculosis, Mycobacterium bovis and other closely related TB-causing organisms. To understand the epidemiological dynamics of TB, national TB control programs often conduct standardized genotyping at 24 Mycobacterial-Interspersed-Repetitive-Units (MIRU)-Variable-Number-of-Tandem-Repeats (VNTR) loci. With the advent of next-generation sequencing technology, whole-genome sequencing (WGS) has been widely used for studying TB transmission. However, open-source software that connects WGS and MIRU-VNTR typing is currently unavailable, which hinders interlaboratory communication. In this manuscript, we introduce the MIRU-profiler program, which can be used to predict the MIRU-VNTR profile from WGS data of M. tuberculosis.

Implementation: The MIRU-profiler is implemented in a shell scripting language and depends on the EMBOSS software. The in-silico workflow of the MIRU-profiler is similar to the workflows described in laboratory manuals for genotyping M. tuberculosis. Given an input genome sequence, the MIRU-profiler computes alleles at the standard 24 loci based on in-silico PCR amplicon lengths. The final output is a tab-delimited text file detailing the 24-loci MIRU-VNTR pattern of the input sequence.

Validation: The MIRU-profiler was validated on four datasets: complete genomes from NCBI-GenBank (n = 11), complete genomes of locally isolated strains sequenced using PacBio (n = 4), complete genomes of BCG vaccine strains (n = 2) and draft genomes based on 250 bp paired-end Illumina reads (n = 106).

Results: The digital MIRU-VNTR results were identical to the experimental genotyping results for the complete genomes of locally isolated strains, the BCG vaccine strains and five of the 11 genomes from NCBI-GenBank. For draft genomes based on short Illumina reads, 21 of the 24 loci were inferred with high accuracy, while a number of inaccuracies were recorded for three specific loci (ETRA, QUB11b and QUB26). One of the unique features of the MIRU-profiler is its ability to process multiple genomes in a batch. This feature was tested on all complete M. tuberculosis genomes (n = 157), for which results were successfully obtained in approximately 14 min.

Conclusion: The MIRU-profiler is a rapid tool for inferring a digital MIRU-VNTR profile from assembled genome sequences. The tool can accurately infer repeat numbers at the standard 24 loci from complete genomes, or at 21 of the 24 loci from draft genomes. Thus, the tool is expected to bridge the communication gap between laboratories using WGS and those using conventional MIRU-VNTR typing.

Highlights

  • Tuberculosis (TB) is an infectious disease responsible for an estimated 1.7 million deaths worldwide in the year 2016 alone (World Health Organization, 2017)

  • Complete agreement was observed for 20 MIRU-VNTR loci, and between one and four mismatches for the remaining four loci

  • On all sets of M. tuberculosis strains, good agreement between the MIRU-profiler and the experimental results was observed, which supports confidence in the in-silico inference of 24-loci MIRU-VNTR profiles from whole-genome sequencing (WGS) data using this tool
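
The agreement reported in these highlights comes down to comparing predicted and experimental alleles locus by locus. The following is a minimal Python sketch of such a comparison, not the authors' validation code. The function name `per_locus_concordance`, the strain names and the allele values are all hypothetical and serve only to show the bookkeeping.

```python
def per_locus_concordance(predicted, experimental):
    """Return {locus: fraction of strains whose predicted allele matches}.

    Both arguments map strain name -> {locus name -> allele (repeat number)}.
    Only loci that have an experimental result are counted.
    """
    matches, totals = {}, {}
    for strain, exp_profile in experimental.items():
        pred_profile = predicted.get(strain, {})
        for locus, exp_allele in exp_profile.items():
            totals[locus] = totals.get(locus, 0) + 1
            if pred_profile.get(locus) == exp_allele:
                matches[locus] = matches.get(locus, 0) + 1
    return {locus: matches.get(locus, 0) / totals[locus] for locus in totals}


# Toy data (hypothetical strains and alleles, not taken from the study).
experimental = {
    "strainA": {"ETRA": 4, "QUB26": 7},
    "strainB": {"ETRA": 3, "QUB26": 5},
}
predicted = {
    "strainA": {"ETRA": 4, "QUB26": 7},
    "strainB": {"ETRA": 2, "QUB26": 5},  # ETRA differs for strainB
}

concordance = per_locus_concordance(predicted, experimental)
for locus, fraction in sorted(concordance.items()):
    print(f"{locus}\t{fraction:.2f}")
```

A locus missing from the predicted profile is counted as a mismatch here, which is one possible convention; a real comparison might report missing calls separately.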

Summary

Introduction

Tuberculosis (TB) is an infectious disease responsible for an estimated 1.7 million deaths worldwide in 2016 alone (World Health Organization, 2017). To understand the epidemiological dynamics of TB, national TB control programs often conduct standardized genotyping at 24 Mycobacterial-Interspersed-Repetitive-Units (MIRU)-Variable-Number-of-Tandem-Repeats (VNTR) loci. Given an input genome sequence, the MIRU-profiler computes alleles at the standard 24 loci based on in-silico PCR amplicon lengths. The MIRU-profiler was validated on four datasets: complete genomes from NCBI-GenBank (n = 11), complete genomes of locally isolated strains sequenced using PacBio (n = 4), complete genomes of BCG vaccine strains (n = 2) and draft genomes based on 250 bp paired-end Illumina reads (n = 106). One of the unique features of the MIRU-profiler is its ability to process multiple genomes in a batch. This feature was tested on all complete M. tuberculosis genomes (n = 157), for which results were successfully obtained in approximately 14 min. The tool can accurately infer repeat numbers at the standard 24 loci from complete genomes, or at 21 of the 24 loci from draft genomes.
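
To illustrate the idea of calling an allele from an in-silico PCR amplicon length, here is a minimal Python sketch. It is not the MIRU-profiler's actual code, which the text describes only as a shell script built on EMBOSS. The sketch assumes a simple model in which the amplicon consists of fixed flanking sequence plus a whole number of repeat units; the function name `repeat_count` and the numeric values are hypothetical, and real loci may need per-locus lookup tables to handle partial repeats.

```python
def repeat_count(amplicon_len, flank_len, unit_len):
    """Estimate the tandem-repeat number at one locus from an amplicon length.

    Assumed model: amplicon_len = flank_len + n * unit_len, all in base pairs.
    Returns n rounded to the nearest whole repeat (halves round up).
    """
    if unit_len <= 0:
        raise ValueError("unit_len must be positive")
    repeat_region = amplicon_len - flank_len
    if repeat_region < 0:
        raise ValueError("amplicon is shorter than its flanking sequence")
    # Integer arithmetic avoids floating-point and banker's-rounding surprises.
    return (repeat_region + unit_len // 2) // unit_len


# Hypothetical locus: 100 bp of flanking sequence and a 50 bp repeat unit.
allele = repeat_count(amplicon_len=250, flank_len=100, unit_len=50)
print(allele)  # 3
```

Under this model a 250 bp amplicon leaves 150 bp of repeat region, that is, three 50 bp units. Repeating the calculation for each of the 24 loci would give one row of a tab-delimited profile.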

