Abstract

Background: CRISPR (clustered regularly interspaced short palindromic repeats) RNAs provide the specificity for noncoding RNA-guided adaptive immune defence systems in prokaryotes. CRISPR arrays consist of repeat sequences separated by specific spacer sequences. CRISPR arrays have previously been identified in a large proportion of prokaryotic genomes. However, currently available detection algorithms do not utilise recently discovered features of CRISPR loci.

Results: We have developed a new approach to automatically detect, predict and interactively refine CRISPR arrays. It is available as a web program and as a command-line tool from bioanalysis.otago.ac.nz/CRISPRDetect. CRISPRDetect discovers putative arrays, extends the array by detecting additional variant repeats, corrects the direction of arrays, refines the repeat/spacer boundaries, and annotates different types of sequence variations (e.g. insertion/deletion) in near-identical repeats. Due to these features, CRISPRDetect has significant advantages when compared to existing identification tools. As well as further support for small, medium and large repeats, CRISPRDetect identified a class of arrays with ‘extra-large’ repeats in bacteria (repeats 44–50 nt). The CRISPRDetect output is integrated with other analysis tools. Notably, the predicted spacers can be directly utilised by CRISPRTarget to predict targets.

Conclusion: CRISPRDetect enables more accurate detection of arrays and spacers, and its gff output is suitable for inclusion in genome annotation pipelines and visualisation. It has been used to analyse all complete bacterial and archaeal reference genomes.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2627-0) contains supplementary material, which is available to authorized users.

Highlights

  • CRISPR (clustered regularly interspaced short palindromic repeats) RNAs provide the specificity for noncoding RNA-guided adaptive immune defence systems in prokaryotes.

  • These noncoding RNAs are derived from CRISPR arrays that possess near-identical direct repeats, typically 21–48 bases long, punctuated by short non-identical ‘spacers’ that provide the immune ‘memory’ of these systems [1,2,3,4,5,6].

  • Analysis of CRISPR-Cas (CRISPR associated) systems requires the detection of CRISPR arrays and their entire complement of spacer sequences.

Introduction

CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) systems are adaptive immune systems in prokaryotes that provide protection from foreign genetic material, such as bacteriophages and plasmids. Specificity is provided by short noncoding RNAs (termed crRNAs; CRISPR RNAs) that recognise the invading DNA or RNA. These noncoding RNAs are derived from CRISPR arrays that possess near-identical direct repeats, typically 21–48 bases long, punctuated by short non-identical ‘spacers’ that provide the immune ‘memory’ of these systems. CRISPR prediction has also been extended to metagenomic data [18,19,20].
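To make the repeat/spacer layout of an array concrete, the following is a minimal Python sketch that locates near-identical copies of a known repeat and returns the spacers between them. It is an illustration only and is not CRISPRDetect's algorithm: the function names, the `max_mismatches` parameter, and the toy sequences are all assumptions made for this example. It also assumes the repeat is already known and tolerates substitutions only, not insertions or deletions.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_repeats(sequence, repeat, max_mismatches=1):
    """Return start positions of windows within max_mismatches of `repeat`.

    Greedy left-to-right scan (a simplification): after a hit, the scan
    jumps past the matched repeat so that hits never overlap.
    """
    hits = []
    n = len(repeat)
    i = 0
    while i + n <= len(sequence):
        if hamming(sequence[i:i + n], repeat) <= max_mismatches:
            hits.append(i)
            i += n
        else:
            i += 1
    return hits


def extract_spacers(sequence, repeat, max_mismatches=1):
    """Return the sequences lying between consecutive repeat hits."""
    hits = find_repeats(sequence, repeat, max_mismatches)
    n = len(repeat)
    return [sequence[a + n:b] for a, b in zip(hits, hits[1:])]


# Toy data (hypothetical): a 22 nt repeat, one variant copy, two 30 nt spacers.
repeat = "GTTTTAGAGCTATGCTGTTTTG"
variant = repeat[:20] + "C" + repeat[21:]   # single substitution
spacer1 = "ACCGGTC" * 4 + "AC"
spacer2 = "CAGGCCAA" * 3 + "CAGGCC"
array = repeat + spacer1 + variant + spacer2 + repeat

spacers = extract_spacers(array, repeat)
```

With the toy input, `spacers` holds the two 30 nt spacers in order, and the variant repeat is still recognised because it differs from the consensus at a single position. A greedy scan like this can misplace boundaries on real data, which is one reason boundary refinement of the kind the abstract describes matters.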

