Abstract

ABSTRACT Prokaryotic genomes harbour a plethora of uncharacterized reverse transcriptases (RTs). RTs phylogenetically related to those encoded by group-II introns have been found associated with type III CRISPR-Cas systems, adjacent or fused at the C-terminus to Cas1. It is thought that these RTs may have a relevant function in the CRISPR immune response mediating spacer acquisition from RNA molecules. The origin and relationships of these RTs and the ways in which the various protein domains evolved remain matters of debate. We carried out a large survey of annotated RTs in databases (198,760 sequences) and constructed a large dataset of unique representative sequences (9,141). The combined phylogenetic reconstruction and identification of the RTs and their various protein domains in the vicinity of CRISPR adaptation and effector modules revealed three different origins for these RTs, consistent with their emergence on multiple occasions: a larger group that have evolved from group-II intron RTs, and two minor lineages that may have arisen more recently from Retron/retron-like sequences and Abi-P2 RTs, the latter associated with type I-C systems. We also identified a particular group of RTs associated with CRISPR-cas loci in clade 12, fused C-terminally to an archaeo-eukaryotic primase (AEP), a protein domain (AE-Prim_S_like) forming a particular family within the AEP proper clade. Together, these data provide new insight into the evolution of CRISPR-Cas/RT systems.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call