Distantly related organisms may evolve similar traits when exposed to similar environments or engaging in certain lifestyles. Several members of the family Lactobacillaceae (lactic acid bacteria, LAB) are frequently isolated from the floral niche, mostly from bees and flowers. In some floral LAB species (henceforth referred to as bee-associated LAB), distinctive genomic features (e.g., genome reduction) and phenotypic features (e.g., fructophily, a preference for fructose over glucose) were recently documented. These features are found across distantly related species, raising the hypothesis that specific genomic and phenotypic traits evolved convergently during adaptation to the floral environment. To test this hypothesis, we examined representative genomes of 369 species of bee-associated and non-bee-associated LAB. Phylogenomic analysis unveiled seven independent ecological shifts toward the bee environment in LAB. In these species, we observed significant reductions in genome size, gene repertoire, and GC content. Using machine learning, we could distinguish bee-associated from non-bee-associated species with 94% accuracy, based on the absence of genes involved in metabolism, osmotic stress, or DNA repair. Moreover, we found that the most important genes for the machine learning classifier were seemingly lost, independently, in multiple bee-associated lineages. One of these genes, adhE, encodes a bifunctional acetaldehyde-alcohol dehydrogenase that has been associated with the evolution of fructophily, a rare phenotypic trait that is pervasive across bee-associated LAB species. These results suggest that the independent evolution of distinctive phenotypes in bee-associated LAB has been largely driven by independent losses of the same sets of genes.

IMPORTANCE

Several LAB species are intimately associated with bees and exhibit unique biochemical properties with potential for food applications and honeybee health. Using a machine learning-based approach, our study shows that adaptation of LAB to the bee environment was accompanied by a distinctive genomic trajectory deeply shaped by gene loss. Several of these gene losses occurred independently in distantly related species and are linked to some of their unique biotechnologically relevant traits, such as the preference for fructose over glucose (fructophily). This study underscores the potential of machine learning in identifying fingerprints of adaptation and detecting instances of convergent evolution. Furthermore, it sheds light on the genomic and phenotypic particularities of bee-associated bacteria, thereby deepening the understanding of their positive impact on honeybee health.