Abstract

Mycobacterium tuberculosis complex (MTBC) species evolve slowly, so isolates from individuals linked in transmission often have identical or nearly identical genomes, making it difficult to reconstruct transmission chains. Finding additional sources of shared MTBC variation could help overcome this problem. Previous studies have reported MTBC diversity within infected individuals; however, whether within-host variation improves transmission inferences remains unclear. Here, we aimed to quantify within-host MTBC variation and assess whether such information improves transmission inferences. We conducted a retrospective genomic epidemiology study in which we reanalysed publicly available sequence data from household transmission studies indexed in PubMed from database inception until Jan 31, 2024, for which both genomic and epidemiological contact data were available, using household membership as a proxy for transmission linkage. We quantified minority variants (ie, positions with two or more alleles each supported by at least five-fold coverage and with a minor allele frequency of 1% or more) outside of PE and PPE genes, within individual samples and shared across samples. We used receiver operating characteristic (ROC) curves to compare the performance of a general linear model for household membership that included shared minority variants and one that included only fixed genetic differences. We identified three MTBC household transmission studies with publicly available whole-genome sequencing data and epidemiological linkages: a household transmission study in Vitória, Brazil (Colangeli et al), a retrospective population-based study of paediatric tuberculosis in British Columbia, Canada (Guthrie et al), and a retrospective population-based study in Oxfordshire, England (Walker et al).
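The minority variant criterion stated above can be illustrated with a minimal sketch. This is not the study's pipeline: the per-position allele depth dictionaries, the function names, and the choice to count a shared variant by genomic position alone are assumptions made for illustration, and the exclusion of PE and PPE genes is not modelled here.

```python
from typing import Dict, Set

MIN_ALLELE_DEPTH = 5    # each allele needs at least five-fold coverage
MIN_MINOR_FREQ = 0.01   # minor allele frequency of 1% or more

# Hypothetical input shape: {position: {allele: read depth}}
Sample = Dict[int, Dict[str, int]]


def is_minority_variant(allele_depths: Dict[str, int]) -> bool:
    """Return True if a position meets the stated minority variant criterion."""
    total = sum(allele_depths.values())
    if total == 0:
        return False
    # Keep only alleles with sufficient read support, most common first.
    supported = sorted(
        (d for d in allele_depths.values() if d >= MIN_ALLELE_DEPTH),
        reverse=True,
    )
    if len(supported) < 2:
        return False
    # Assumption: the minor allele is the second most common supported allele.
    return supported[1] / total >= MIN_MINOR_FREQ


def minority_sites(sample: Sample) -> Set[int]:
    """Positions in one sample that qualify as minority variants."""
    return {pos for pos, depths in sample.items() if is_minority_variant(depths)}


def shared_minority_sites(a: Sample, b: Sample) -> Set[int]:
    """Positions that qualify as minority variants in both samples."""
    return minority_sites(a) & minority_sites(b)
```

Under these assumptions, a position with 95 reads of one allele and 5 of another qualifies, whereas a position with 996 and 4 reads does not, because the second allele lacks five-fold support.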
We found moderate levels of minority variation present in MTBC sequence data from cultured isolates that varied significantly across studies: mean 168·6 minority variants (95% CI 151·4-185·9) for the Colangeli et al dataset, 5·8 (1·5-10·2) for Guthrie et al (p<0·0001, Wilcoxon rank sum test, vs Colangeli et al), and 7·1 (2·4-11·9) for Walker et al (p<0·0001, Wilcoxon rank sum test, vs Colangeli et al). Isolates from household pairs shared more minority variants than did randomly selected pairs of isolates: mean 97·7 shared minority variants (79·1-116·3) versus 9·8 (8·6-11·0) in Colangeli et al, 0·8 (0·1-1·5) versus 0·2 (0·1-0·2) in Guthrie et al, and 0·7 (0·1-1·3) versus 0·2 (0·2-0·2) in Walker et al (all p<0·0001, Wilcoxon rank sum test). Shared within-host variation was significantly associated with household membership (odds ratio 1·51 [95% CI 1·30-1·71] per one standard deviation increase in shared minority variants; p<0·0001). Compared with models without within-host variation, models that included shared within-host variation improved the accuracy of predicting household membership in all three studies: area under the ROC curve 0·95 versus 0·92 for the Colangeli et al study, 0·99 versus 0·95 for the Guthrie et al study, and 0·93 versus 0·91 for the Walker et al study. Within-host MTBC variation persists through culture of sputum and could enhance the resolution of transmission inferences. The substantial differences in minority variation recovered across studies highlight the need to optimise approaches to recover and incorporate within-host variation into automated phylogenetic and transmission inference. Funding: National Institutes of Health.
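For readers less familiar with the area under the ROC curve reported above, the following sketch shows one standard way to compute it: as the fraction of (household pair, non-household pair) comparisons in which the household pair receives the higher predicted score, with ties counted as one half. The function name, labels, and scores are hypothetical toy values, not data or code from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for a household pair, 0 for a non-household pair.
    scores: model-predicted probability of household membership.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # household pair ranked above non-household pair
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: two household pairs, two non-household pairs.
labels = [1, 1, 0, 0]
fixed_only_scores = [0.9, 0.4, 0.5, 0.1]    # hypothetical model without minority variants
with_shared_scores = [0.9, 0.6, 0.5, 0.1]   # hypothetical model with shared minority variants

auc_fixed = roc_auc(labels, fixed_only_scores)     # 0.75
auc_shared = roc_auc(labels, with_shared_scores)   # 1.0
```

In this toy case the second set of scores ranks every household pair above every non-household pair, so its area under the curve is higher, which is the kind of comparison the reported values summarise.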
