Abstract
RNA-protein interactions (RPIs) play critical roles in numerous fundamental biological processes, such as post-transcriptional gene regulation, viral assembly, cellular defence and protein synthesis. As the amount of available experimental RNA-protein binding data has increased rapidly owing to high-throughput sequencing methods, it is now possible to measure and understand RNA-protein interactions by computational methods. In this study, we integrate a sequence-based derived kernel with regularized least squares to perform prediction. The derived kernel exploits the contextual information around an amino acid or a nucleic acid as well as repetitive conserved motif information. We propose a novel machine learning method, called RPiRLS, to predict the interaction between any RNA and protein of known sequence. For the RPiRLS classifier, each protein sequence is written over the full alphabet of up to 20 amino acids, whereas for the RPiRLS-7G classifier, each protein sequence is represented using a 7-letter reduced alphabet based on the physicochemical properties of the amino acids. We evaluated both methods on a number of benchmark data sets and compared their performance with that of two recent state-of-the-art methods, RPI-Pred and IPMiner. On the non-redundant benchmark test sets extracted from the PRIDB, the RPiRLS method outperformed RPI-Pred and IPMiner in terms of accuracy, specificity and sensitivity. Further, RPiRLS achieved an accuracy of 92% on the prediction of lncRNA-protein interactions. The proposed method can also be extended to construct RNA-protein interaction networks. The RPiRLS web server is freely available at http://bmc.med.stu.edu.cn/RPiRLS.
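The two ingredients named in the abstract, a reduced protein alphabet and kernel regularized least squares, can be illustrated with a minimal sketch. Everything specific in it is an assumption rather than the authors' implementation: the 7-group mapping is the commonly used conjoint-triad style grouping (the abstract does not list the groups), a plain k-mer spectrum kernel stands in for the paper's derived kernel, and the function names, k-mer length and regularization value are hypothetical.

```python
import numpy as np
from collections import Counter

# ASSUMPTION: conjoint-triad style 7-group reduction; the abstract does not
# state which grouping RPiRLS-7G actually uses.
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_GROUP = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}

def reduce_protein(seq):
    """Rewrite a protein sequence over the 7-letter alphabet '0'..'6'."""
    return "".join(AA_TO_GROUP[aa] for aa in seq.upper() if aa in AA_TO_GROUP)

def kmer_counts(seq, k):
    """Count all overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(a, b, k=3):
    """Cosine-normalized k-mer spectrum kernel (a stand-in, not the paper's kernel)."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    dot = sum(v * cb[key] for key, v in ca.items())
    norm = np.sqrt(sum(v * v for v in ca.values()) * sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def pair_kernel(pair_a, pair_b, k=3):
    """Kernel between two (protein, RNA) pairs: product of the two sequence kernels."""
    (prot_a, rna_a), (prot_b, rna_b) = pair_a, pair_b
    k_prot = spectrum_kernel(reduce_protein(prot_a), reduce_protein(prot_b), k)
    k_rna = spectrum_kernel(rna_a.upper(), rna_b.upper(), k)
    return k_prot * k_rna

def fit_rls(pairs, labels, lam=0.1):
    """Kernel regularized least squares: solve (K + lam * I) c = y."""
    K = np.array([[pair_kernel(a, b) for b in pairs] for a in pairs])
    y = np.asarray(labels, dtype=float)
    return np.linalg.solve(K + lam * np.eye(len(pairs)), y)

def predict_rls(pairs, coef, query):
    """Score a query pair; a positive score is read as 'interacting'."""
    k_vec = np.array([pair_kernel(query, p) for p in pairs])
    return float(k_vec @ coef)

# Toy, made-up training data: +1 = interacting, -1 = non-interacting.
train_pairs = [("AGVAGVILFP", "ACGUACGU"), ("RKDERKDE", "GGGGCCCC")]
train_labels = [1, -1]
coef = fit_rls(train_pairs, train_labels)
score = predict_rls(train_pairs, coef, ("AGVAGVILFP", "ACGUACGU"))
```

The sign of `score` gives the predicted class. In practice the regularization parameter `lam` and the k-mer length would be chosen by cross-validation on real interaction data.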
Highlights
The interactions of proteins with other proteins, peptides, DNAs and RNAs govern most of the essential molecular functions
Regulated the HOXD locus in trans by interacting with PcG proteins [14]; several long non-coding RNAs (lncRNAs) were shown to be able to interact with AUF1, a protein linked to aging and cancer [15]; lncRNAs binding to JARID2 protein were essential for the recruitment of PRC2 to the chromatin [16]; lncRNA GAS5 inhibited hepatitis C virus replication by decoying HCV NS3 protein [17]
We propose a novel machine learning method, which we call RNA-protein interaction prediction based on regularized least squares (RPiRLS), to quantitatively predict the potential RNA-protein interactions
Summary
The interactions of proteins with other proteins, peptides, DNAs and RNAs govern most of the essential molecular functions. RNA-protein interactions (RPIs) have a critical influence on post-transcriptional gene regulation [1,2,3], viral assembly [4,5,6], cellular defence [7], protein synthesis [8,9] and various other fundamental biological processes [10,11]. LncRNAs normally function together with their interacting proteins [13]. The study of RPIs is therefore essential for understanding their functions. Compared to those of protein-protein interactions and DNA-protein interactions, current knowledge regarding RNA-protein interactions, especially lncRNA-protein