Background: Whole genome sequencing (WGS) is beginning to reveal the genetic diversity of Mycobacterium tuberculosis (MTB) and its consequences for the diagnosis of multidrug-resistant tuberculosis (MDR-TB) on a genomic scale. The Global Consortium for Drug-resistant TB Diagnostics (GCDD) conducted a genome-scale variant analysis of 366 clinical MTB genomes (mostly MDR or extensively drug-resistant [XDR]) from four countries to inform the development of rapid molecular diagnostics. This project has been extended with an evolutionary analysis of isoniazid (INH)-resistant isolates for prognostic purposes.

Methods: A total of 151 clinical MTB isolates (130 INH-resistant [INHR], 21 INH-susceptible [INHS]) were included in this study: India (19: 17 INHR, 2 INHS), Moldova (48: 42 INHR, 6 INHS), the Philippines (26: 20 INHR, 6 INHS), and South Africa (58: 51 INHR, 7 INHS). INH drug susceptibility was determined using MGIT 960 at the World Health Organization (WHO)-recommended critical concentration of 0.1 mg/L. Isolates were sequenced on the PacBio RS WGS platform. A genome-wide variant analysis was conducted using a proprietary pipeline (PacDAP) developed at San Diego State University. To infer the amino acid changes in katG that confer resistance, PAML was used to detect, in silico, sites under positive selection. The dN/dS method was combined with Bayes empirical Bayes to identify sites under positive selection, and chi-squared analysis was used to assess the significance of the selected sites.

Results: PacDAP variant analysis revealed 22 novel mutations in catalase-peroxidase (the katG product). Of these, 14 were single nucleotide polymorphisms (SNPs), while 8 novel mutations appeared in combination with katG S315T and/or the inhA promoter mutation C-15T. These SNPs have not been previously reported. Additionally, 11 previously reported but uncommon katG mutations were observed in these clinical isolates.
These results suggest that 17 amino acids in the enzyme are under positive selective pressure, most significantly in South Africa and the Philippines. No selective pressure on codons other than 315 was observed in isolates from Moldova. Because of the low number of isolates from India, the significance of the sites under positive selection was low, and no prediction for India could be made from this study.

Conclusions: Eleven of the 14 SNPs are resistance conferring, and the remaining 8 combinatorial mutations are believed to be either compensatory in nature or, in combination with known SNPs, able to increase resistance levels. The positive selection results indicate a diversifying evolutionary path to resistance that is more in line with long-tail statistics, and therefore a departure from the traditional point mutation (or “hotspot”) model on which current molecular diagnostics are based. These selection pressures point to a future in which the “long tail” (i.e., alternative mechanisms of resistance) gains diagnostic and prognostic significance while the canonical mutations potentially lose significance, especially in South Africa and the Philippines. This could have significant implications for narrowly targeted molecular diagnostics.
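
Note on the significance test: the Methods describe a chi-squared analysis of sites under positive selection but do not name the models that were compared. A common way to run such a test with PAML is a likelihood-ratio test between two nested site models that differ by two free parameters. The sketch below assumes that setup (2 degrees of freedom). The function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math
from typing import Tuple


def lrt_chi2_df2(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Likelihood-ratio test assuming 2 degrees of freedom.

    lnl_null: log-likelihood of the model without positive selection.
    lnl_alt:  log-likelihood of the nested alternative that allows it.
    Returns (test statistic, p-value).
    """
    # Test statistic: twice the log-likelihood difference, floored at zero.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For a chi-squared distribution with 2 degrees of freedom the
    # survival function has the closed form exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only (not study data).
stat, p = lrt_chi2_df2(lnl_null=-2500.0, lnl_alt=-2490.0)
print(f"2*dlnL = {stat:.2f}, p = {p:.3g}")
```

A small p-value only indicates that the model allowing positive selection fits better. Identifying which codons are affected is a separate step, which the Methods attribute to Bayes empirical Bayes.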