We analyzed the pan-genome and gene content modulation of the most diverse genome data set of the Mycobacterium tuberculosis complex (MTBC) gathered to date. The closed pan-genome of the MTBC was characterized by reduced accessory and strain-specific genomes, compatible with its clonal nature. However, significantly fewer gene families were shared between MTBC genomes as their phylogenetic distance increased. This effect was only observed in inter-species comparisons, not within-species, which suggests that species-specific ecological characteristics are associated with changes in gene content. Gene loss, resulting from genomic deletions and pseudogenization, was found to drive the variation in gene content. This gene erosion differed among MTBC species and lineages, even within M. tuberculosis, where L2 showed more gene loss than L4. We also show that phylogenetic proximity is not always a good proxy for gene content relatedness in the MTBC, as the gene repertoire of Mycobacterium africanum L6 deviated from its expected phylogenetic niche conservatism. Gene disruptions of virulence factors, represented by pseudogene annotations, are mostly not conserved and are therefore poor predictors of MTBC ecotypes. Each MTBC ecotype carries its own accessory genome, likely influenced by distinct selective pressures such as host and geography. It is important to investigate how gene loss confers new adaptive traits to MTBC strains; the detected heterogeneous gene loss poses a significant challenge in elucidating the genetic factors responsible for the diverse phenotypes observed in the MTBC. By detailing specific gene losses, our study serves as a resource for researchers studying MTBC phenotypes and their immune evasion strategies.

IMPORTANCE
In this study, we analyzed the gene content of different ecotypes of the Mycobacterium tuberculosis complex (MTBC), the causative agents of tuberculosis.
We found that changes in their gene content are associated with their ecological features, such as host preference. Gene loss was identified as the primary driver of these changes, and its extent can vary even among different strains of the same ecotype. Our study also revealed that the gene content relatedness of these bacteria does not always mirror their evolutionary relationships. In addition, some virulence genes can be variably lost among strains of the same MTBC ecotype, likely helping them to evade the immune system. Overall, our study highlights the importance of understanding how gene loss can lead to new adaptations in these bacteria and how different selective pressures may influence their genetic makeup.
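
The pan-genome terms used above (core, accessory, and strain-specific gene families, and the number of families shared between two genomes) can be illustrated with a minimal Python sketch. This is not the study's pipeline: the genome names, gene-family IDs, and the helper functions `partition_pangenome` and `shared_families` are hypothetical. The sketch also assumes a common convention in which "accessory" means families present in more than one genome but not in all of them.

```python
from itertools import combinations

# Hypothetical toy data: genome name -> set of gene-family IDs present.
# These names and families are invented for illustration only.
genomes = {
    "strain_A": {"fam1", "fam2", "fam3", "fam4"},
    "strain_B": {"fam1", "fam2", "fam3"},
    "strain_C": {"fam1", "fam2", "fam5"},
}


def partition_pangenome(genomes):
    """Split gene families into core, accessory, and strain-specific sets."""
    # Count in how many genomes each gene family occurs.
    counts = {}
    for families in genomes.values():
        for fam in families:
            counts[fam] = counts.get(fam, 0) + 1

    n = len(genomes)
    core = {f for f, c in counts.items() if c == n}             # in every genome
    strain_specific = {f for f, c in counts.items() if c == 1}  # in exactly one
    accessory = set(counts) - core - strain_specific            # assumed: the rest
    return core, accessory, strain_specific


def shared_families(genomes):
    """Count the gene families shared by each pair of genomes."""
    return {
        (a, b): len(genomes[a] & genomes[b])
        for a, b in combinations(sorted(genomes), 2)
    }


core, accessory, strain_specific = partition_pangenome(genomes)
pairwise = shared_families(genomes)
```

In an analysis like the one summarized above, pairwise counts of this kind would be compared against the phylogenetic distance between the same genome pairs; that comparison needs a distance matrix and is not sketched here.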