Abstract

Bacterial secondary metabolites, synthesized by enzymes encoded in biosynthetic gene clusters (BGCs), can underlie microbiome homeostasis and serve as commercialized products, which have historically been mined from a select group of taxa. While evolutionary approaches have proven beneficial for prioritizing BGCs for experimental characterization efforts to uncover new natural products, dedicated bioinformatics tools designed for comparative and evolutionary analysis of BGCs within focal taxa are limited. We thus developed lineage-specific analysis of BGCs (lsaBGC; https://github.com/Kalan-Lab/lsaBGC) to aid exploration of microdiversity and evolutionary trends across homologous groupings of BGCs, gene cluster families (GCFs), in any bacterial taxa of interest. lsaBGC enables rapid and direct identification of GCFs in genomes, calculates evolutionary statistics and conservation for BGC genes, and builds a framework to allow for base resolution mining of novel variants through metagenomic exploration. Through application of the suite to four genera commonly found in skin microbiomes, we uncover new insights into the evolution and diversity of their BGCs. We show that the BGC of the virulence-associated carotenoid staphyloxanthin in Staphylococcus aureus is ubiquitous across the genus Staphylococcus. While one GCF encoding the biosynthesis of staphyloxanthin showcases evidence for plasmid-mediated horizontal gene transfer (HGT) between species, another GCF appears to be transmitted vertically amongst a sub-clade of skin-associated Staphylococcus. Further, the latter GCF, which is well conserved in S. aureus, has been lost in most Staphylococcus epidermidis, which is the most common Staphylococcus species on human skin and is also regarded as a commensal. We also identify thousands of novel single-nucleotide variants (SNVs) within BGCs from the Corynebacterium tuberculostearicum sp. complex, a narrow, multi-species clade that features the most prevalent Corynebacterium in healthy skin microbiomes. Although novel SNVs were approximately 10 times as likely to correspond to synonymous changes when located in the top five percentile of conserved sites, lsaBGC identified SNVs that defied this trend and are predicted to underlie amino acid changes within functionally key enzymatic domains. Ultimately, beyond supporting evolutionary investigations of BGCs, lsaBGC also provides important functionalities to aid efforts for the discovery or directed modification of natural products.
