Background

Bacteria and archaea produce an enormous diversity of modified peptides that are involved in various forms of inter-microbial conflict or communication. A vast class of such peptides comprises the Ribosomally synthesized and Post-translationally modified Peptides (RiPPs), and a major group of RiPPs is the graspetides, so named after the ATP-grasp ligases that catalyze the formation of lactam and lactone linkages in these peptides. The diversity of graspetides, the multiple proteins encoded in the respective Biosynthetic Gene Clusters (BGCs), and their evolution have not been studied in full detail. In this work, we attempt a comprehensive analysis of the graspetide-encoding BGCs and report a variety of novel graspetide groups as well as ancillary proteins implicated in graspetide biosynthesis and expression.

Results

We compiled a comprehensive, manually curated set of graspetides comprising 174 families, 115 of them new, with distinct patterns of amino acids implicated in macrocyclization and further modification, roughly tripling the known graspetide diversity. We derived signature motifs for the leader regions of graspetide precursors that could be used to facilitate graspetide prediction. Graspetide BGCs and specific precursors were identified in bacterial divisions not previously known to encode RiPPs, in particular, the parasitic and symbiotic bacteria of the Candidate Phyla Radiation. We identified Bacteroides-specific BGCs that include a remarkable diversity of graspetides encoded in the same loci, which are predicted to be modified by the same ATP-grasp ligase. We studied in detail the evolution of the recently characterized chryseoviridin BGCs and showed that duplication and horizontal gene exchange both contribute to the diversification of graspetides during evolution.

Conclusions

We demonstrate previously unsuspected diversity of graspetide sequences, even those associated with closely related ATP-grasp enzymes.
We also identify several previously unnoticed families of proteins associated with graspetide biosynthetic gene clusters. The results of this work substantially expand the known diversity of RiPPs and can be harnessed to further advance approaches for their identification.