Abstract

Background: Tuberculosis, caused by bacteria in the Mycobacterium tuberculosis complex (MTBC), is a major global public health burden. Strain-specific genomic diversity in the known lineages of MTBC is an important factor in pathogenesis that may affect virulence, transmissibility, host response and emergence of drug resistance. Fast and accurate tracking of MTBC strains is therefore crucial for infection control, and our previous work developed a 62-single nucleotide polymorphism (SNP) barcode to inform on the phylogenetic identity of 7 human lineages and 64 sub-lineages.

Methods: To update this barcode, we analysed whole genome sequencing data from 35,298 MTBC isolates (~1 million SNPs) covering 9 main lineages and 3 similar animal-related species (M. tuberculosis var. bovis, M. tuberculosis var. caprae and M. tuberculosis var. orygis). The data were partitioned into training (N = 17,903, 50.7%) and test (N = 17,395, 49.3%) sets and analysed using an integrated phylogenetic tree and population differentiation (FST) statistical approach.

Results: By constructing a phylogenetic tree on the training MTBC isolates, we characterised 90 lineages, sub-lineages or species, of which 30 are new, and identified 421 robust barcoding mutations, from which a minimal set of 90 was selected that included 20 markers from the 62-SNP barcode. The barcoding SNPs (90 and 421) perfectly discriminated the 86 MTBC (sub-)lineages in the test set and accurately reconstructed the clades across the combined 35k samples.

Conclusions: The validated 90 SNPs can be used for the rapid diagnosis and tracking of MTBC strains to assist public health surveillance and control. To facilitate this, the SNP markers have now been incorporated into the TB-Profiler informatics platform (https://github.com/jodyphelan/TBProfiler).
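To make the FST step concrete, the following is a minimal sketch of how a population differentiation statistic can flag candidate barcoding SNPs for one clade. It is an illustration under stated assumptions, not the authors' pipeline: the estimator (a simple heterozygosity-based FST for a biallelic site, comparing a clade with all other isolates), the function names `fst_biallelic` and `clade_specific_snps`, and the toy genotypes are all hypothetical. A SNP is kept only when the alternate allele is fixed inside the clade and absent outside it, which corresponds to FST = 1.

```python
from typing import Dict, List, Sequence


def fst_biallelic(p_in: float, n_in: int, p_out: float, n_out: int) -> float:
    """Heterozygosity-based FST, (H_T - H_S) / H_T, for one biallelic SNP.

    p_in / p_out are alternate-allele frequencies inside and outside the clade;
    n_in / n_out are the numbers of isolates in each group.
    """
    n = n_in + n_out
    p_bar = (p_in * n_in + p_out * n_out) / n
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # monomorphic site: no differentiation to measure
    h_within = (n_in * 2 * p_in * (1 - p_in) + n_out * 2 * p_out * (1 - p_out)) / n
    return (h_total - h_within) / h_total


def clade_specific_snps(
    genotypes: Dict[str, Sequence[int]], labels: Sequence[str], clade: str
) -> List[str]:
    """Return SNP ids whose alternate allele (1) is fixed in `clade` and absent elsewhere."""
    inside = [i for i, lab in enumerate(labels) if lab == clade]
    outside = [i for i, lab in enumerate(labels) if lab != clade]
    if not inside or not outside:
        raise ValueError("clade must contain some, but not all, isolates")
    hits = []
    for snp_id, calls in genotypes.items():
        p_in = sum(calls[i] for i in inside) / len(inside)
        p_out = sum(calls[i] for i in outside) / len(outside)
        if p_in == 1.0 and fst_biallelic(p_in, len(inside), p_out, len(outside)) == 1.0:
            hits.append(snp_id)
    return hits


# Hypothetical toy data: 5 isolates, 3 SNPs (0 = reference allele, 1 = alternate allele).
labels = ["L1", "L1", "L2", "L2", "L3"]
genotypes = {
    "snpA": [1, 1, 0, 0, 0],  # fixed in L1, absent elsewhere
    "snpB": [1, 0, 0, 0, 0],  # polymorphic within L1
    "snpC": [1, 1, 1, 0, 0],  # shared with part of L2
}
print(clade_specific_snps(genotypes, labels, "L1"))  # ['snpA']
```

In a real analysis the clade labels would come from the phylogenetic tree built on the training set, and candidate markers would then be checked against the held-out test set, as described in the Methods.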

Highlights

  • Tuberculosis, caused by bacteria in the Mycobacterium tuberculosis complex (MTBC), is a major global public health burden

  • Across a total of 35,298 MTBC isolates with sequencing data, we identified 1,014,762 high-quality SNPs

  • The isolates represented all MTBC lineages (1–9), M. bovis, M. orygis and M. caprae, but the majority were from lineages 4 (51.6%), 2 (25.2%), 3 (11.1%) and 1 (9.5%), with the frequency of others being at most 1% (Additional file 1: Table S1)

Summary

Introduction

Tuberculosis, caused by bacteria in the Mycobacterium tuberculosis complex (MTBC), is a major global burden, causing approximately ten million active cases and 1.5 million deaths in 2018 (www.who.int/tb). The MTBC consists of Mycobacterium tuberculosis sensu stricto (Mtb) (lineages 1, 2, 3, 4 and 7) and M. tuberculosis var. The MTBC lineages vary in their geographic distribution and spread, being endemic in different locations around the globe, leading to the hypothesis that the strain types are adapted to different human populations [4]. Lineage 4 is common in Europe and southern Africa, with regions of high TB incidence and high levels of HIV co-infection, whilst lineages 5, 6 and 7 appear isolated within West Africa and Ethiopia, respectively [1].

