Abstract

The CRISPR/Cas9-sgRNA system has recently become a popular tool for genome editing and an active topic in medical research. In this system, the Cas9 protein is directed to a desired genomic location and cleaves the target DNA sequence that is complementary to a 20-nucleotide guide sequence within the sgRNA. Considerable experimental effort, ranging from in vivo selection to in silico modeling, has gone into the efficient design of sgRNAs for the CRISPR/Cas9 system. In this article, we present a novel tool, called CRISPRpred, for efficient in silico prediction of sgRNA on-target activity, based on a Support Vector Machine (SVM) model. For our experiments, we used a benchmark dataset of 17 genes and 5310 guide sequences, of which only 20% are true values. CRISPRpred achieves an Area Under the Receiver Operating Characteristic Curve (AUROC-Curve), an Area Under the Precision-Recall Curve (AUPR-Curve) and a maximum Matthews Correlation Coefficient (MCC) of 0.85, 0.56 and 0.48, respectively. Our tool shows an approximately 5% improvement in AUPR-Curve and, after analyzing all evaluation metrics, we find that CRISPRpred outperforms the current state-of-the-art. CRISPRpred is flexible enough to extract relevant features and use them in a learning algorithm. The source code of the entire software, together with the relevant dataset, is available at https://github.com/khaled-buet/CRISPRpred.
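The abstract reports a maximum MCC without saying how the maximum is taken. The sketch below is one plausible reading, not the authors' evaluation code: it assumes the classifier emits a real-valued score per guide and that the maximum is taken over all decision thresholds. The function names `mcc` and `max_mcc` are hypothetical.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def max_mcc(labels: list, scores: list) -> float:
    """Largest MCC over all thresholds (assumption: predict 1 if score >= t)."""
    best = -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(labels, scores) if s < t and y == 1)
        tn = sum(1 for y, s in zip(labels, scores) if s < t and y == 0)
        best = max(best, mcc(tp, fp, tn, fn))
    return best


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.8, 0.2, 0.1]
    print(max_mcc(labels, scores))
```

MCC accounts for all four confusion-matrix cells, which makes it informative on an imbalanced benchmark such as the one described, where only 20% of the guides are true values.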

Highlights

  • Genome-editing technology has become very popular in recent years and has attracted considerable attention from the scientific community [1]

  • We compare the experimental results of CRISPRpred with those of Azimuth, which is currently the state-of-the-art, and show the effectiveness and efficacy of the newly introduced tool

  • It has been observed in the literature that the mutagenic performance of the Clustered Regularly Inter-spaced Short Palindromic Repeats (CRISPR)/Cas9 system differs significantly with small changes in single-guide RNAs (sgRNAs) [8]

  • We have focused on both position-specific sequences of adjacent nucleotides and secondary structures to construct features from sgRNAs
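As a rough illustration of what position-specific features over adjacent nucleotides could look like, the sketch below one-hot encodes every k-mer at every position of a guide sequence. It is a minimal sketch under stated assumptions, not the CRISPRpred feature extractor: the function name `position_specific_features`, the feature naming scheme, and the default `k = 2` are all hypothetical, and secondary-structure features are not covered.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def position_specific_features(guide: str, k: int = 2) -> dict:
    """Binary indicator for each possible k-mer at each position of the guide.

    Assumption: feature "pos3_AG" is 1 when the k-mer starting at
    (1-based) position 3 is "AG", and 0 otherwise.
    """
    guide = guide.upper()
    if set(guide) - set(NUCLEOTIDES):
        raise ValueError("guide must contain only A, C, G, T")
    features = {}
    for pos in range(len(guide) - k + 1):
        window = guide[pos:pos + k]
        for kmer in map("".join, product(NUCLEOTIDES, repeat=k)):
            features[f"pos{pos + 1}_{kmer}"] = int(kmer == window)
    return features


if __name__ == "__main__":
    # Toy 20-nucleotide guide, invented for illustration only.
    feats = position_specific_features("ACGTACGTACGTACGTACGT", k=2)
    print(len(feats), sum(feats.values()))
```

For a 20-nucleotide guide and k = 2 this yields 19 positions times 16 dinucleotides, with exactly one active indicator per position; such a fixed-length binary vector is the kind of input an SVM can consume directly.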


Summary

Introduction

Genome-editing technology has become very popular in recent years and has attracted considerable attention from the scientific community [1]. The rapid growth in the number of development tools has clarified this interesting biological phenomenon and helps us obtain desirable biological systems. The Clustered Regularly Inter-spaced Short Palindromic Repeats (CRISPR) and their associated endonuclease genes (Cas9) have recently been demonstrated to be a revolutionary technology for genome editing in mammalian cells [2, 3]. CRISPR/Cas technology functions against viral infections or other types of horizontal gene transfer by cutting down foreign...

