Abstract

Background: An increasing number of Cas9 variants with higher specificity are being developed to avoid the off-target effect, which has generated a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while deep learning methods often lack interpretability, forcing researchers to trade off accuracy against interpretability. A method is needed that matches deep learning-based methods in performance while offering interpretability comparable to that of conventional machine learning methods.

Results: To overcome these problems, we propose AttCRISPR, an intrinsically interpretable deep learning method for predicting on-target activity. The advantage of AttCRISPR lies in using an ensemble learning strategy to stack available encoding-based and embedding-based methods with strong interpretability. Compared with state-of-the-art methods on the WT-SpCas9, eSpCas9(1.1), and SpCas9-HF1 datasets, AttCRISPR achieves average Spearman values of 0.872, 0.867, and 0.867, respectively, on several public datasets, outperforming these methods. Furthermore, AttCRISPR has good interpretability thanks to two attention modules, one spatial and one temporal. Through these modules, we can understand the decisions made by AttCRISPR at both global and local levels without additional post hoc explanation techniques.

Conclusion: With the trained models, we reveal the preference for each position-dependent nucleotide on the sgRNA (short guide RNA) sequence in each dataset at a global level. At a local level, we show that the interpretability of AttCRISPR can guide researchers in designing sgRNAs with higher activity.

Highlights

  • An increasing number of clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein 9 (Cas9) variants with higher specificity are being developed to avoid the off-target effect, which has generated a significant volume of experimental data

  • How can researchers understand the decisions made by AttCRISPR locally and globally, based on attention mechanisms?

  • We apply attention modules in both the spatial and temporal parts of AttCRISPR, and design two experiments, combined with earlier reports, to show that attention mechanisms can help researchers understand the decisions made by the model. This makes it easy to optimize low-activity sgRNAs without an exhaustive search
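The last highlight says that attention can help optimize a low-activity sgRNA without an exhaustive search. One way to read this, sketched below in Python, is to restrict candidate substitutions to the few positions with the largest attention weights rather than trying every position. This is an illustrative assumption, not the authors' procedure: the per-position weights, the scoring function `score_fn`, and the cutoff `k` are all hypothetical stand-ins for outputs of a trained model.

```python
NUCLEOTIDES = "ACGT"


def top_positions(weights, k):
    """Indices of the k largest weights, highest first."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]


def candidate_mutants(seq, positions):
    """All single-nucleotide substitutions at the given positions only."""
    mutants = []
    for pos in positions:
        for base in NUCLEOTIDES:
            if base != seq[pos]:
                mutants.append(seq[:pos] + base + seq[pos + 1:])
    return mutants


def best_mutant(seq, weights, score_fn, k=3):
    """Return the highest-scoring sequence among the original and its
    single substitutions at the k most heavily weighted positions.

    weights  -- hypothetical per-position attention weights for seq
    score_fn -- hypothetical callable mapping a sequence to predicted activity
    """
    candidates = candidate_mutants(seq, top_positions(weights, k))
    return max(candidates + [seq], key=score_fn)
```

For a sequence of length n, this scores 3k candidates instead of the 3n needed to try every single substitution, which is where the saving over an exhaustive search would come from under these assumptions.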

Summary

Introduction

Xiao et al. BMC Bioinformatics (2021) 22:589

An increasing number of Cas variants with higher specificity are being developed to avoid the off-target effect, which has generated a significant volume of experimental data. However, several disadvantages have hindered the further clinical application of the CRISPR/Cas systems. One of these disadvantages is the unexpected insertions and deletions caused by the off-target effect [4,5,6,7]. One solution is to engineer CRISPR/Cas systems with higher specificity. The activity of the chosen sgRNA sequence determines the efficiency of genome editing, so it is worthwhile to develop an efficient approach to predict sgRNA activity and even to guide sgRNA design.
