Abstract

Motivation. A clustered regularly interspaced short palindromic repeat (CRISPR) is a genetic element that actively regulates foreign invasive genes in prokaryotic genomes, and it has been engineered to work with the CRISPR-associated sequence (Cas) gene Cas9 as one of the modern genome editing technologies. Due to inconsistent definitions, the existing CRISPR detection programs seem to have missed some weak CRISPR signals.

Results. This study manually curates all the currently annotated CRISPR elements in the prokaryotic genomes and proposes 95 updates to the annotations. A new definition is proposed to cover all the CRISPRs. A comprehensive comparison of CRISPR numbers at the taxonomic levels of domain and genus shows high variation among closely related species, even within the same genus. A detailed investigation of how CRISPRs have been evolutionarily manipulated in the 8 completely sequenced species of the genus Thermoanaerobacter demonstrates that transposons frequently split long CRISPRs into shorter ones over a long evolutionary history.

Highlights

  • A clustered regularly interspaced short palindromic repeat (CRISPR) is an array of repeat copies (DRs, direct repeats) connected by fixed-length linker sequences [1]

  • The complete annotation of CRISPRs in microbial genomes was downloaded from the latest version of the database DbCRISPR [9], which was updated on April 14, 2014; it annotates 4,065 CRISPRs in 2,762 genomes of bacteria and archaea

  • This study conducts a comprehensive curation of the current CRISPR annotation and proposes three types of revisions, based on the observations that some annotated CRISPRs (1) have undetected DRs in the flanking regions, (2) are broken into two CRISPRs by nonstandard DRs or transposons in between, or (3) are annotated as two CRISPRs at the beginning of circular chromosomes
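Revision type (3) can be illustrated with a small coordinate check. The sketch below is only an illustration and not the curation procedure used in this study: the function name `spans_origin`, the 1-based inclusive coordinates, and the `max_gap` threshold are all assumptions introduced here.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive; an assumed convention


def spans_origin(first: Interval, last: Interval, genome_len: int, max_gap: int) -> bool:
    """Return True if two annotated arrays could be one array crossing the
    coordinate origin of a circular chromosome.

    `first` is the annotation nearest position 1, `last` the one nearest
    position `genome_len`. `max_gap` is a hypothetical threshold for the
    longest plausible spacer between the two halves.
    """
    gap_before_origin = genome_len - last[1]  # bases after `last` up to the end
    gap_after_origin = first[0] - 1           # bases from position 1 up to `first`
    return gap_before_origin + gap_after_origin <= max_gap


# Toy coordinates, not taken from any real genome.
print(spans_origin((1, 120), (900, 1000), genome_len=1000, max_gap=60))  # True
print(spans_origin((50, 120), (800, 900), genome_len=1000, max_gap=60))  # False
```

A coordinate test like this would only flag candidates. Deciding whether two annotations really form one CRISPR would also require comparing their DR sequences, which this sketch does not attempt.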


Summary

Introduction

A CRISPR is an array of repeat copies (DRs, direct repeats) connected by fixed-length linker sequences [1]. Based on the database DbCRISPR, we made novel discoveries by manually analyzing, modifying, and correcting its CRISPR annotations, and we investigated the lengths of CRISPR DRs and spacers.
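To make the array structure concrete, the following minimal sketch splits a sequence into DR copies and the linker (spacer) sequences between them. It assumes the DR is already known and occurs as exact copies, which is a simplification; the function name `split_array` and the toy sequence are hypothetical and are not taken from DbCRISPR or from this study.

```python
from typing import List, Tuple


def split_array(seq: str, dr: str) -> Tuple[List[int], List[str]]:
    """Locate exact, non-overlapping copies of `dr` in `seq`.

    Returns the 0-based start position of each copy and the spacer
    sequences found between consecutive copies.
    """
    starts: List[int] = []
    pos = seq.find(dr)
    while pos != -1:
        starts.append(pos)
        # Resume the search right after the copy just found.
        pos = seq.find(dr, pos + len(dr))
    spacers = [seq[a + len(dr):b] for a, b in zip(starts, starts[1:])]
    return starts, spacers


# Toy sequence: three DR copies joined by two 4-base spacers.
dr = "GTTCCA"
seq = "TT" + dr + "AAAT" + dr + "CGCG" + dr + "GG"
starts, spacers = split_array(seq, dr)
print(starts)                     # [2, 12, 22]
print(spacers)                    # ['AAAT', 'CGCG']
print({len(s) for s in spacers})  # {4}
```

Exact matching would miss the nonstandard DRs mentioned in the highlights, so a realistic detector would need to tolerate mismatches. That is outside the scope of this sketch.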
