Abstract

Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21–37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer “immunity” against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.

Highlights

  • Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognizable features [1,2,3]

  • The guild is presumed to be involved in processes that may include the maintenance of repeat clusters [3], capture of new spacer elements [12,13] and expansion or contraction of clusters, propagation of the leader sequence and repeat clusters within a genome [3,7], transfer of clustered regularly interspaced short palindromic repeat (CRISPR) and cas genes together to new genomes [8,14,15], and interaction of CRISPR/cas loci with the host cell

  • Although many of these families contain members that belong to clusters of orthologous groups (COGs) [14,16], the relationship between the hidden Markov models (HMMs) described here and these COGs is imperfect

Introduction

Clusters of short DNA repeats with nonhomologous spacers, which are found at regular intervals in the genomes of phylogenetically distinct prokaryotic species, comprise a family with recognizable features [1,2,3]. These repeats were first observed by Ishino and colleagues [4] upstream of the iap gene in Escherichia coli and later in other bacterial and archaeal species such as Haloferax mediterranei, Streptococcus pyogenes, and Mycobacterium tuberculosis. Similar repeats were identified in the genome of the hyperthermophilic bacterium Thermotoga maritima [8]. The association of these repeats with nearby gene clusters that showed closest similarity to those of archaeal species and their atypical DNA composition (as measured by χ² analysis) were cited, along with other observations, as evidence of lateral gene transfer (LGT) between archaeal and bacterial species [8]. These findings suggested transfer of repeat-associated DNA within and between prokaryotic genomes.
