Background
Genetic discontinuity represents abrupt breaks in genomic identity among species. Advances in genome sequencing have enhanced our ability to track and characterize genetic discontinuity in bacterial populations. However, determining the degree to which bacterial diversity exists as a continuum or is sorted into discrete and readily defined species remains a challenge in microbial ecology. Here, we aim to quantify genetic discontinuity (δ) and investigate how this metric is related to ecology.

Results
We harness a dataset comprising 210,129 genomes to systematically explore genetic discontinuity patterns across several distantly related species, finding clear breakpoints that vary depending on the taxa in question. By delving into pangenome characteristics, we uncover a significant association between pangenome saturation and genetic discontinuity. Closed pangenomes are associated with more pronounced breaks, as exemplified by Mycobacterium tuberculosis. Additionally, through a machine learning approach, we detect key features, such as gene conservation patterns and functional annotations, that significantly impact genetic discontinuity prediction.

Conclusions
Our study clarifies bacterial genetic patterns and their ecological impacts, enhancing the delineation of species boundaries and deepening our understanding of microbial diversity.