Abstract

Background: RNA-binding proteins (RBPs) play crucial roles in post-transcriptional control of RNA. RBPs efficiently recognize specific RNA sequences after the RNA has been transcribed from DNA. To satisfy diverse functional requirements, RNA-binding proteins are composed of multiple blocks of RNA-binding domains (RBDs) presented in various structural arrangements to provide versatile functions. The ability to computationally predict RNA-binding residues in an RNA-binding protein can help biologists select candidate residues for site-directed mutagenesis in wet-lab experiments.

Results: The proposed prediction framework, named “ProteRNA”, combines an SVM-based classifier with conserved residue discovery by WildSpan to identify the residues that interact with RNA in an RNA-binding protein. Although these conserved residues can be either functionally or structurally conserved, they provide clues to the important residues in a protein sequence. On the independent testing dataset, ProteRNA delivers an overall accuracy of 89.78%, an MCC of 0.2628, an F-score of 0.3075, and an F0.5-score of 0.3546.

Conclusions: This article presents the design of a sequence-based predictor that aims to identify the RNA-binding residues in an RNA-binding protein by combining machine learning and pattern mining approaches. RNA-binding proteins have diverse functions when interacting with different categories of RNAs because they are composed of multiple copies of RNA-binding domains presented in various structural arrangements, which expands their functional repertoire. Furthermore, predicting RNA-binding residues in an RNA-binding protein can help biologists select candidate residues for site-directed mutagenesis in wet-lab experiments.
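The evaluation measures quoted in the abstract (accuracy, MCC, F-score, and F0.5-score) are standard functions of the per-residue confusion counts. The following is a minimal sketch of how such measures are commonly computed; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the article.

```python
import math


def binary_metrics(tp, fp, tn, fn, beta=1.0):
    """Compute accuracy, MCC, and F-beta from confusion counts.

    tp/fp/tn/fn are counts of residues (binding = positive class).
    beta=1.0 gives the usual F-score; beta=0.5 weights precision more
    heavily than recall.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    b2 = beta * beta
    f_denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / f_denom if f_denom else 0.0

    return {"accuracy": accuracy, "mcc": mcc, "f_beta": f_beta}


# Illustrative counts only (not the article's data).
example_f1 = binary_metrics(tp=30, fp=40, tn=900, fn=30, beta=1.0)
example_f05 = binary_metrics(tp=30, fp=40, tn=900, fn=30, beta=0.5)
```

Because binding residues are usually a small minority of all residues, accuracy can be high while MCC and the F-scores remain modest, which is why measures of this kind are reported alongside accuracy.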

Highlights

  • RNA-binding proteins (RBPs) play crucial roles in post-transcriptional control of RNA

  • To satisfy diverse functional requirements, RNA-binding proteins are composed of multiple blocks of RNA-binding domains (RBDs) presented in various structural arrangements to provide versatile functionality [1,2]

  • We propose the prediction framework “ProteRNA”, which combines a support vector machine (SVM)-based classifier using evolutionary profiles with conserved residue discovery by sequence conservation to identify RNA-interacting residues in an RNA-binding protein

Correspondence: ckhuang@ntu.edu.tw. Department of Engineering Science and Oceanic Engineering, National Taiwan University, Taipei, Taiwan, Republic of China.


Summary

Introduction

RNA-binding proteins (RBPs) play crucial roles in post-transcriptional control of RNA. Kumar et al. (2008) developed Pprint [23], which uses evolutionary profiles in the form of position-specific scoring matrices (PSSMs) together with amino acid composition, and adjusted the cutoff value of the SVM discrimination function to improve prediction performance. Spriggs et al. (2009) [26] developed PiRaNhA, which uses a support vector machine with a PSSM profile and three amino acid properties, namely interface propensity (IP), predicted solvent accessibility (pA), and hydrophobicity (H), to recognize RNA-binding residues [27]. Jeong et al. (2004) [28] applied an artificial neural network (ANN)-based method to amino acid sequence and predicted secondary structure information, and improved performance with post-processing procedures such as state-shifting and filtering isolated interacting residues out of the prediction. The ability to computationally predict RNA-binding residues in an RNA-binding protein can help biologists select candidate residues for site-directed mutagenesis in wet-lab experiments.
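Two of the ideas mentioned above, adjusting the cutoff on a classifier's per-residue score and filtering isolated predicted residues, can be illustrated with a short sketch. This is an assumption-laden illustration and not the procedure of Pprint, PiRaNhA, Jeong et al., or ProteRNA: the function names `apply_cutoff` and `drop_isolated`, the `flank` parameter, and the example scores are all hypothetical.

```python
from typing import List


def apply_cutoff(scores: List[float], cutoff: float = 0.0) -> List[bool]:
    """Label a residue as RNA-binding when its score exceeds the cutoff.

    Raising the cutoff trades recall for precision; lowering it does
    the opposite.
    """
    return [score > cutoff for score in scores]


def drop_isolated(labels: List[bool], flank: int = 1) -> List[bool]:
    """Clear positive predictions with no other positive within `flank`
    positions on either side (a simple isolated-residue filter)."""
    n = len(labels)
    filtered = list(labels)
    for i, is_binding in enumerate(labels):
        if not is_binding:
            continue
        lo = max(0, i - flank)
        hi = min(n, i + flank + 1)
        # Neighbours are taken from the unfiltered labels so the result
        # does not depend on the order of traversal.
        neighbours = labels[lo:i] + labels[i + 1:hi]
        if not any(neighbours):
            filtered[i] = False
    return filtered


# Hypothetical per-residue scores for a six-residue stretch.
scores = [-1.2, 0.4, -0.3, 0.9, 1.1, -0.5]
raw_labels = apply_cutoff(scores, cutoff=0.0)
clean_labels = drop_isolated(raw_labels, flank=1)
```

In this sketch the lone positive at the second position is removed because neither neighbour is predicted as binding, while the adjacent pair is kept. The rationale is that binding residues tend to occur in contiguous stretches, so a solitary positive is more likely to be a false alarm.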

