Abstract

Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins constitute the CRISPR-Cas systems, which play a key role in the prokaryotic adaptive immune system against invasive foreign elements. In recent years, CRISPR-Cas systems have also been engineered to facilitate targeted gene editing in eukaryotic genomes. Cas proteins are an essential component of these systems: the effector module composed of Cas proteins is used to distinguish the type of CRISPR-Cas system, so effective prediction and identification of Cas proteins can help biologists infer the system type. Moreover, class 2 CRISPR-Cas systems are increasingly applied in genome editing, and the discovery of new Cas proteins will provide more candidates for this purpose. In this paper, we describe a web service named CASPredict (http://i.uestc.edu.cn/caspredict/cgi-bin/CASPredict.pl) for identifying Cas proteins. CASPredict first predicts Cas proteins with a support vector machine (SVM) using the optimal dipeptide composition and then annotates the function of the predicted Cas proteins using the hmmscan search algorithm. Ten-fold cross-validation showed that 84.84% of Cas proteins were correctly classified. CASPredict should be a useful tool for identifying Cas proteins, or can at least complement existing methods in this area.
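To make the dipeptide composition feature concrete, the following is a minimal Python sketch of how such a vector is commonly computed: the frequency of each of the 400 ordered amino acid pairs in a protein sequence. The function name `dipeptide_composition`, the handling of non-standard residues, and the normalisation by the number of valid pairs are assumptions made for illustration and are not taken from the CASPredict implementation.

```python
from itertools import product

# The 20 standard amino acids; all 400 ordered pairs form the feature space.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional list of dipeptide frequencies.

    Assumption: pairs containing non-standard residues (e.g. 'X') are
    skipped, and frequencies are normalised by the number of valid pairs.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    # Slide a window of width 2 along the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

A vector of this kind, computed for each protein, could then serve as the input to an SVM classifier; the selection of the optimal subset of these 400 features is a separate step.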

Highlights

  • The CRISPR-Cas systems consist of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) proteins (Hille et al, 2018)

  • We refer to the optimal features obtained through feature selection as the optimal dipeptide composition (ODPC), a selected subset of the full dipeptide composition (DPC)

  • In ten-fold cross-validation, separate models were built on the DPC and on the ODPC (the latter selected using the F-score and a support vector machine (SVM))
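The F-score ranking mentioned in the highlights can be sketched as follows. This section does not give the exact formula the authors used, so the code assumes the widely used two-class F-score (between-class separation of the feature means divided by the sum of the within-class sample variances). The names `f_scores` and `rank_features` are hypothetical, and each class is assumed to contain at least two samples.

```python
def f_scores(pos, neg):
    """Compute an F-score for every feature column.

    pos, neg: lists of equal-length feature vectors for the positive
    (Cas) and negative (non-Cas) classes. Assumed formula:
    ((mean_pos - mean_all)^2 + (mean_neg - mean_all)^2)
    / (var_pos + var_neg), using sample variances.
    """
    scores = []
    for j in range(len(pos[0])):
        p = [row[j] for row in pos]
        n = [row[j] for row in neg]
        mean_p = sum(p) / len(p)
        mean_n = sum(n) / len(n)
        mean_all = (sum(p) + sum(n)) / (len(p) + len(n))
        var_p = sum((x - mean_p) ** 2 for x in p) / (len(p) - 1)
        var_n = sum((x - mean_n) ** 2 for x in n) / (len(n) - 1)
        numerator = (mean_p - mean_all) ** 2 + (mean_n - mean_all) ** 2
        denominator = var_p + var_n
        # A constant feature carries no discriminative information.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


def rank_features(pos, neg):
    """Return feature indices sorted from most to least discriminative."""
    scores = f_scores(pos, neg)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

One plausible way to use such a ranking is to add features in ranked order, retrain the SVM at each step, and keep the subset with the best cross-validated accuracy; whether CASPredict follows exactly this procedure is not stated in this section.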

Introduction

The CRISPR-Cas systems consist of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) proteins (Hille et al, 2018). When these repeats were first observed, the lack of adequate DNA sequence data, especially for mobile genetic elements, made it nearly impossible to predict the biological function of these unusual repeated sequences (Ishino, Krupovic & Forterre, 2018). As genomics advanced, Barrangou et al (2007) experimentally demonstrated that CRISPR and the associated Cas genes work together against phages, confirming that the CRISPR-Cas system functions as a prokaryotic acquired immune system. Since then, CRISPR-Cas systems have become a research hotspot in the field of gene editing.
