Abstract

Background: Sequencing of microbiomes has accelerated the characterization of the diversity of CRISPR-Cas immune systems. However, the use of next-generation short-read sequences for characterizing CRISPR-Cas dynamics remains limited because of the repetitive nature of CRISPR arrays. CRISPR arrays are composed of short spacer segments (derived from invaders’ genomes) interspaced between flanking repeat sequences. This repetitive structure poses a computational challenge for the accurate assembly of CRISPR arrays from short reads. In this paper we evaluate the use of long-read sequences for the analysis of CRISPR-Cas system dynamics in microbiomes.

Results: We analyzed a dataset of Illumina’s TruSeq Synthetic Long-Reads (SLR) derived from a gut microbiome. We showed that long reads captured CRISPR spacers at a high degree of redundancy, which highlights the conservation of spacers among spacer-sharing CRISPR variants and enables the study of CRISPR array dynamics in ways difficult to achieve through short-read sequences. We introduce compressed spacer graphs, a visual abstraction of spacer-sharing CRISPR arrays, to provide a simplified view of the complex organizational structures present within CRISPR array dynamics. Using compressed spacer graphs, we observed several key defining characteristics of CRISPR-Cas system dynamics, including spacer acquisition and loss events, conservation of the trailer-end spacers, and the directionality (transcription orientation) of CRISPR arrays. Other notable results include the observation of intense array contraction and expansion events, and the reconstruction of a full-length genome for a potential invader (Faecalibacterium phage) based on identified spacers.

Conclusion: We demonstrate in an in silico system that long reads provide the necessary context for characterizing the organization of CRISPR arrays in a microbiome, and reveal dynamic and evolutionary features of CRISPR-Cas systems in a microbial population.

Highlights

  • Sequencing of microbiomes has accelerated the characterization of the diversity of Clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated gene (Cas) immune systems

  • Using the computational tool that we have previously developed for the characterization of CRISPR-Cas systems [44], combined with new tools we developed for comparing and visualizing the CRISPR arrays, we study the dynamics of CRISPR arrays using long reads

  • Comparison of the results showed that long reads contain the genomic context necessary for analyzing CRISPR organization, because CRISPR repeats and spacers are typically short and a CRISPR array typically contains from a few to a few dozen spacer-repeat units
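The idea of a compressed spacer graph can be illustrated with a minimal sketch. It assumes, since the text here does not spell out the construction, that each distinct spacer is a node, that an edge links two spacers observed next to each other in some array, and that compression merges unbranched runs of spacers into a single node. The function names `build_spacer_graph` and `compress`, and the single-letter spacer identifiers, are hypothetical and are not taken from the authors' tools.

```python
from collections import defaultdict


def build_spacer_graph(arrays):
    """Build a directed graph: one node per distinct spacer, one edge per
    observed adjacency between consecutive spacers in any array."""
    succ = defaultdict(set)  # spacer -> spacers seen immediately after it
    pred = defaultdict(set)  # spacer -> spacers seen immediately before it
    nodes = set()
    for array in arrays:
        nodes.update(array)
        for a, b in zip(array, array[1:]):
            succ[a].add(b)
            pred[b].add(a)
    return nodes, succ, pred


def compress(nodes, succ, pred):
    """Merge each unbranched run of spacers into one node (a tuple of ids).

    Returns (chains, edges): the compressed nodes and the set of edges
    between them."""

    def mergeable(a, b):
        # a and b collapse together only if a -> b is a's sole outgoing
        # edge and b's sole incoming edge.
        return succ[a] == {b} and pred[b] == {a}

    chains = []
    for n in sorted(nodes):
        ps = pred[n]
        if len(ps) == 1 and mergeable(next(iter(ps)), n):
            continue  # n belongs to a chain that starts further upstream
        chain = [n]
        while len(succ[chain[-1]]) == 1:
            nxt = next(iter(succ[chain[-1]]))
            if not mergeable(chain[-1], nxt):
                break
            chain.append(nxt)
        chains.append(tuple(chain))

    # Spacers on a branch-free cycle have no chain start; keep them as
    # singleton nodes so that nothing is dropped.
    covered = {s for chain in chains for s in chain}
    chains.extend((n,) for n in sorted(nodes - covered))

    owner = {s: chain for chain in chains for s in chain}
    edges = {(chain, owner[s]) for chain in chains for s in succ[chain[-1]]}
    return chains, edges


# Two hypothetical array variants that differ only in their first spacer
# (listed leader end first, an assumption of this example).
arrays = [["a", "b", "c", "d", "e"], ["x", "b", "c", "d", "e"]]
chains, edges = compress(*build_spacer_graph(arrays))
for chain in chains:
    print("-".join(chain))
```

In this toy input the four shared spacers collapse into one node, so the graph shrinks from six nodes to three while the point where the two variants diverge stays visible.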

Summary

Introduction

Sequencing of microbiomes has accelerated the characterization of the diversity of CRISPR-Cas immune systems. Because invading mobile genetic elements constantly find ways to infiltrate their hosts, it is unsurprising that prokaryotes have evolved a multitude of means to defend against such invaders [1,2,3]. One such defense mechanism is the CRISPR-Cas system, an adaptive, sequence-specific immune system present in about half of the bacterial and most of the archaeal genera [4,5,6,7,8]. In response to the evolutionary diversity of CRISPR-Cas systems, invaders such as phages have been observed to evolve in tandem to evade host defense mechanisms; anti-CRISPR genes are among the recently discovered mechanisms [1, 2, 14,15,16,17].

