Abstract

The S. pyogenes (Sp) Cas9 endonuclease is an important gene-editing tool. SpCas9 is directed to target sites based on complementarity to a complexed single-guide RNA (sgRNA). However, SpCas9-sgRNA also binds and cleaves genomic off-targets with only partial complementarity. To date, we lack the ability to predict cleavage and binding activity quantitatively, and rely on binary classification schemes to identify strong off-targets. We report a quantitative kinetic model that captures the SpCas9-mediated strand-replacement reaction in free-energy terms. The model predicts binding and cleavage activity as a function of time, target, and experimental conditions. Trained and validated on high-throughput bulk-biochemical data, our model predicts the intermediate R-loop state recently observed in single-molecule experiments, as well as the associated conversion rates. Finally, we show that our quantitative activity predictor can be reduced to a binary off-target classifier that outperforms the established state-of-the-art. Our approach is extensible, and can characterize any CRISPR-Cas nuclease – benchmarking natural and future high-fidelity variants against SpCas9; elucidating determinants of CRISPR fidelity; and revealing pathways to increased specificity and efficiency in engineered systems.

Highlights

  • The S. pyogenes (Sp) Cas9 endonuclease is an important gene-editing tool

  • The reaction starts when the Cas9-single-guide RNA (sgRNA) ribonucleoprotein complex leaves the solution state and binds, via protein-DNA interactions, to a 3-nt protospacer adjacent motif (PAM) DNA sequence, canonically 5’-NGG-3’[44,45]

  • Binding to the PAM sequence opens the DNA double helix and allows the first base of the target sequence to hybridize with the sgRNA[44,45], forming the first R-loop state
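
The PAM requirement described in the highlights above can be illustrated with a short scan for candidate sites. This is a minimal sketch, not the authors' pipeline: the function name `find_pam_sites`, the 20-nt protospacer length, and the restriction to a single strand are assumptions made for illustration.

```python
import re

def find_pam_sites(dna, guide_len=20):
    """List candidate sites on the given strand (illustrative helper).

    Returns (protospacer_start, protospacer, pam) for every 5'-NGG-3' PAM
    that has a full-length protospacer immediately upstream of it.
    The 20-nt default length is an assumption, not taken from the text.
    """
    dna = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. in "GGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start >= guide_len:
            protospacer = dna[pam_start - guide_len:pam_start]
            sites.append((pam_start - guide_len, protospacer, match.group(1)))
    return sites
```

Calling `find_pam_sites` on a sequence returns only sites with enough upstream bases for a complete protospacer. A genome-wide search would also have to scan the reverse complement, which this sketch leaves out.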


Summary

Introduction

The S. pyogenes (Sp) Cas9 endonuclease is an important gene-editing tool. SpCas9 is directed to target sites based on complementarity to a complexed single-guide RNA (sgRNA). Strong off-target sites are identified in silico by a growing set of tools. These tools use bioinformatics[20,21], machine learning[22,23], or heuristic[12,14,24,25] approaches to rank genomic sites based on distinctive off-target activity scores. Though such models can identify strong off-targets, they are not quantitative and cannot assess activity on the many lesser off-targets; nor can they predict how activity changes with exposure time and enzyme concentration, even though these parameters are frequently exploited to limit off-target activity in cells[26]. Our characterization of Cas9 supports the notion that observed differences in binding and cleavage activities[32–41] stem from a relatively long-lived DNA-bound RNA-DNA hybrid (R-loop) intermediate. This R-loop intermediate was recently observed directly in single-molecule experiments[42], and our model predicts both its location and its conversion rates.
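
To make the idea of a kinetic model on a free-energy landscape concrete, the sketch below treats the reaction as a linear chain of states: unbound, PAM-bound, R-loops of length 1 to N, and cleaved. It is a toy illustration, not the published model. Every parameter value (`gain`, `penalty`, `pam_energy`, `k_on`, `k_f`, `k_cat`) is invented for the example, the use of detailed balance to set backward rates is an assumption of this sketch, and sequences are assumed to be written PAM-proximal base first.

```python
import math

def landscape(guide, target, gain=-0.5, penalty=4.0, pam_energy=-1.0):
    """Free energies (in kT) of the PAM-bound state and R-loop lengths 1..N.

    Each match lowers the energy by `gain`, each mismatch raises it by
    `penalty`. All values are illustrative, not fitted parameters.
    """
    energies = [pam_energy]
    for g, t in zip(guide, target):
        energies.append(energies[-1] + (gain if g == t else penalty))
    return energies

def simulate(guide, target, t_end=50.0, dt=0.005, k_on=0.5, k_f=1.0, k_cat=1.0):
    """Integrate the master equation with explicit Euler steps.

    Returns (unbound, bound_occupancies, cleaved) at time t_end.
    """
    energies = landscape(guide, target)
    n = len(energies)  # bound states: PAM (index 0) and R-loop lengths 1..N
    # Assumed detailed balance: k_b[i] / k_f = exp(F[i] - F[i-1]).
    k_b = [0.0] + [k_f * math.exp(energies[i] - energies[i - 1]) for i in range(1, n)]
    # Unbinding from the PAM state; the solution state is the reference (F = 0).
    k_off = k_on * math.exp(energies[0])

    unbound, p, cleaved = 1.0, [0.0] * n, 0.0
    for _ in range(int(round(t_end / dt))):
        flux_bind = k_on * unbound - k_off * p[0]  # net flux solution -> PAM
        # flux[i] is the net flux from bound state i to bound state i + 1.
        flux = [k_f * p[i] - k_b[i + 1] * p[i + 1] for i in range(n - 1)]
        flux_cut = k_cat * p[-1]  # irreversible cleavage from the full R-loop
        inflow = [flux_bind] + flux
        outflow = flux + [flux_cut]
        p = [p[i] + dt * (inflow[i] - outflow[i]) for i in range(n)]
        unbound -= dt * flux_bind
        cleaved += dt * flux_cut
    return unbound, p, cleaved
```

In this toy landscape a mismatch adds a barrier that raises the backward rate at that position, so probability accumulates in the R-loop states before the mismatch and cleavage slows down. Comparing `simulate(guide, guide)` with `simulate(guide, mismatched_target)` at several values of `t_end` or `k_on` shows how, in such a model, activity depends on exposure time and effective enzyme concentration.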
