Abstract

Background: RNA-binding proteins interact with specific RNA molecules to regulate important cellular processes. It is therefore necessary to identify the RNA interaction partners in order to understand the precise functions of such proteins. Protein-RNA interactions are typically characterized using in vivo and in vitro experiments, but these may not detect all binding partners. Therefore, computational methods that capture the protein-dependent nature of such binding interactions could help to predict potential binding partners in silico.

Results: We have developed three methods to predict whether an RNA can interact with a particular RNA-binding protein using support vector machines and different features based on the sequence (the Oli method), the motif score (the OliMo method) and the secondary structure (the OliMoSS method). We applied these approaches to different experimentally derived datasets and compared the predictions with RNAcontext and RPISeq. Oli outperformed OliMoSS and RPISeq, confirming our protein-specific predictions and suggesting that tetranucleotide frequencies are appropriate discriminative features. Oli and RNAcontext were the most competitive methods in terms of the area under the curve. In a precision-recall curve analysis, Oli achieved higher precision values. On a second experimental dataset including real negative binding information, Oli outperformed RNAcontext with a precision of 0.73 vs. 0.59.

Conclusions: Our experiments showed that features based on primary sequence information are sufficiently discriminating to predict specific RNA-protein interactions. Sequence motifs and secondary structure information were not necessary to improve these predictions. Finally, we confirmed that protein-specific experimental data concerning RNA-protein interactions are valuable sources of information that can be used for the efficient training of models for in silico predictions. The scripts are available upon request to the corresponding author.

Highlights

  • RNA-binding proteins interact with specific RNA molecules to regulate important cellular processes

  • Given a specific RNA-binding protein (RBP), we applied a support vector machine (SVM) to discriminate binding from non-binding RNAs by describing each RNA sequence with a total of 525 features: the tetranucleotide frequencies and position-specific scoring matrix (PSSM) scores described above, plus three additional secondary structure properties and 256 features representing the accessibility of different tetranucleotides

  • Evaluation 1: Table 2 shows the performance of Oli, OliMo, OliMoSS, RNAcontext, RPISeq-SVM and RPISeq-Random Forest (RF) on each RBP in the AURA_dataset
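
The tetranucleotide frequency features mentioned in the highlights can be illustrated with a minimal Python sketch. It maps an RNA sequence to a vector of 256 values, one per possible tetranucleotide. The function name, the use of overlapping windows, the normalization by the number of counted windows and the handling of ambiguous bases are all assumptions made for illustration; the source does not state how the frequencies were computed.

```python
from itertools import product

ALPHABET = "ACGU"
# All 4^4 = 256 tetranucleotides in a fixed lexicographic order.
TETRAMERS = ["".join(p) for p in product(ALPHABET, repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(TETRAMERS)}


def tetranucleotide_frequencies(seq):
    """Return a 256-element list of relative tetranucleotide frequencies.

    Hypothetical helper: overlapping windows and normalization by the
    number of counted windows are assumptions, not the published method.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    counts = [0] * len(TETRAMERS)
    total = 0
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # skip windows containing ambiguous bases (e.g. N)
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence too short or fully ambiguous: return an all-zero vector.
        return [0.0] * len(TETRAMERS)
    return [c / total for c in counts]


features = tetranucleotide_frequencies("AUGGCUAGCUAGGAUCC")
print(len(features))
```

A vector of this kind, computed for each bound and unbound RNA of a given RBP, is the sort of input an SVM classifier could be trained on; the motif and structure features of OliMo and OliMoSS would be appended as further columns.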


Summary

Introduction

RNA-binding proteins interact with specific RNA molecules to regulate important cellular processes. It is necessary to identify the RNA interaction partners in order to understand the precise functions of such proteins. Protein-RNA interactions are typically characterized using in vivo and in vitro experiments, but these may not detect all binding partners. RBPs are involved in post-transcriptional regulation, RNA splicing, RNA stability and protein synthesis. This suggests that RBPs must interact with specific mRNA targets. A number of specific RBP target sites have been identified in the 3’-UTR [4]. The identification of RNA targets is interesting from a biological perspective because it provides insight into the precise functions of RBPs [2,6]. More accurate predictions of binding sites and the molecular characteristics of such interactions are highly informative [7].

