Abstract
RNA-binding proteins (RBPs) play a crucial role in the post-transcriptional regulation of RNAs. Identifying RBP binding sites is a key step toward understanding the biological mechanisms of post-transcriptional regulation. Although many computational methods have been developed for predicting RNA-protein binding sites, few studies consider k-mer embedding representations of RNA primary sequence and secondary structure specificities. In this paper, we develop a general deep learning framework, named deepRKE, to predict RNA-protein binding sites. deepRKE uses an unsupervised shallow two-layer neural network to automatically learn distributed representations of k-mers by taking their neighboring context into account. Compared with the conventional k-mer approach, distributed representations effectively capture the latent relationships and similarities between k-mers. The distributed representations of the sequences and secondary structures are fed into a convolutional neural network (CNN) and a bidirectional long short-term memory network (BLSTM) to discriminate RBP binding sites from unbound sites. We comprehensively evaluate deepRKE on two large-scale RBP binding site datasets, and the experimental results show that deepRKE achieves better performance than five competitive methods.
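To make the idea of learning k-mer representations from neighboring context more concrete, the following is a minimal sketch of the preprocessing such an approach typically needs: splitting an RNA sequence into overlapping k-mers and pairing each k-mer with its neighbors, in the style of skip-gram training data. The function names `kmers` and `context_pairs`, the choice of k = 3, the stride, and the window size are all illustrative assumptions and are not taken from deepRKE itself.

```python
from typing import List, Tuple


def kmers(seq: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split a sequence into overlapping k-mers with a sliding window.

    k and stride are illustrative defaults, not values from the paper.
    """
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def context_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Pair each k-mer with the k-mers at most `window` positions away.

    These (center, context) pairs are the kind of input a shallow
    skip-gram style embedding model is trained on.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical toy sequence, used only for illustration.
tokens = kmers("AUGGCUA", k=3)
pairs = context_pairs(tokens, window=1)
```

Under these assumptions, each k-mer acts like a word and each sequence like a sentence, so k-mers that tend to share neighbors would end up with similar embedding vectors. The same tokenization could in principle be applied to a secondary structure string.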