Abstract
Circular RNAs (circRNAs) are extensively expressed in cells and tissues, and play crucial roles in human diseases and biological processes. Recent studies have reported that circRNAs can function as RNA binding protein (RBP) sponges, while RBPs can in turn be involved in back-splicing. Interaction with RBPs is therefore considered an important factor in investigating the function of circRNAs, and it is necessary to understand the interaction mechanisms of circRNAs and RBPs, especially in human cancers. Here, we present a novel deep learning method to identify cancer-specific circRNA–RBP binding sites (CSCRSites), using only nucleotide sequences as input. In CSCRSites, an architecture with multiple convolution layers detects features in the raw circRNA sequence fragments, and binding sites are then identified through a fully connected layer with softmax output. The experimental results show that CSCRSites outperforms conventional machine learning classifiers and some representative deep learning methods on the benchmark data. In addition, the features learnt by CSCRSites are converted to sequence motifs, some of which match known human RNA motifs involved in human diseases, especially cancer. Therefore, as a deep learning-based tool, CSCRSites could contribute significantly to the functional analysis of cancer-associated circRNAs.
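The pipeline described in the abstract (raw sequence fragment, stacked convolution layers, fully connected layer, softmax) can be illustrated with a minimal NumPy sketch of the forward pass. This is not the CSCRSites implementation: the number of layers, filter counts, kernel width, the global max pooling step, and the randomly initialised weights are all assumptions chosen only to make the example run, and names such as `one_hot`, `conv1d_relu`, and `predict` are hypothetical.

```python
import numpy as np

NUCLEOTIDES = "ACGU"  # column order of the one-hot encoding (assumed)


def one_hot(seq):
    """Encode an RNA fragment as a (length, 4) matrix.

    T is treated as U; any other symbol (e.g. N) is left as an all-zero row.
    """
    seq = seq.upper().replace("T", "U")
    x = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq):
        j = NUCLEOTIDES.find(base)
        if j >= 0:
            x[i, j] = 1.0
    return x


def conv1d_relu(x, kernels, bias):
    """Valid 1D convolution followed by ReLU.

    x: (length, in_channels)
    kernels: (n_filters, width, in_channels)
    bias: (n_filters,)
    """
    n_filters, width, _ = kernels.shape
    out_len = x.shape[0] - width + 1
    out = np.empty((out_len, n_filters), dtype=np.float32)
    for t in range(out_len):
        window = x[t:t + width]  # (width, in_channels)
        out[t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def predict(seq, params):
    """Return [P(non-binding), P(binding)] for one sequence fragment."""
    h = one_hot(seq)
    for kernels, bias in params["conv"]:
        h = conv1d_relu(h, kernels, bias)
    pooled = h.max(axis=0)  # global max pooling over positions (assumption)
    logits = params["fc_w"] @ pooled + params["fc_b"]
    return softmax(logits)


# Random, untrained weights: the output probabilities carry no biological meaning.
rng = np.random.default_rng(0)
params = {
    "conv": [
        (rng.normal(0, 0.1, (16, 5, 4)).astype(np.float32), np.zeros(16, np.float32)),
        (rng.normal(0, 0.1, (32, 5, 16)).astype(np.float32), np.zeros(32, np.float32)),
    ],
    "fc_w": rng.normal(0, 0.1, (2, 32)).astype(np.float32),
    "fc_b": np.zeros(2, np.float32),
}
probs = predict("AUGGCUAGCUAGGCUAACGUAGCUAGGAUCC", params)
```

In a trained model of this shape, each first-layer kernel is a width-by-4 weight matrix over nucleotides, which is what makes it possible to read learned filters back as sequence motifs.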
Highlights
Circular RNAs are non-coding RNAs with covalently closed loop structures, which makes them more stable than most linear RNAs in cells [1]
In order to evaluate the performance of CSCRSites, it was compared with conventional machine learning classifiers and some existing representative deep learning-based methods for detecting RNA binding protein (RBP) binding sites using the benchmark dataset CSCRBS
Since the hyper-parameters of a deep learning model have a significant impact on its performance, we studied the different combinations of model settings and selected the model parameters with the best performance
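Selecting model settings by trying different combinations can be sketched as an exhaustive grid search. The grid values below and the scoring function `toy_evaluate` are hypothetical stand-ins, not the settings or results from the paper; in practice `evaluate` would train a model with the given configuration and return a validation metric.

```python
from itertools import product


def grid_search(grid, evaluate):
    """Try every combination in `grid` and return the best (config, score).

    grid: dict mapping a setting name to a list of candidate values
    evaluate: callable taking a config dict and returning a score (higher is better)
    """
    names = list(grid)
    best_score, best_cfg = float("-inf"), None
    for values in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, values))
        score = evaluate(cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score


# Hypothetical search space; the paper's actual candidate values are not given here.
grid = {
    "n_conv_layers": [1, 2, 3],
    "n_filters": [16, 32, 64],
    "kernel_width": [4, 8, 12],
}


def toy_evaluate(cfg):
    """Stand-in for 'train and validate'; NOT a real experimental result."""
    return (
        -abs(cfg["n_conv_layers"] - 2)
        - abs(cfg["n_filters"] - 32) / 32
        - abs(cfg["kernel_width"] - 8) / 8
    )


best_cfg, best_score = grid_search(grid, toy_evaluate)
```

The cost of a full grid grows multiplicatively with each added setting (27 configurations here), so larger searches usually evaluate only a subset of combinations.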
Summary
Circular RNAs (circRNAs) are non-coding RNAs with covalently closed loop structures, which makes them more stable than most linear RNAs in cells [1]. Although circRNAs were first identified more than twenty years ago, their biological functions remain largely unknown. High-throughput sequencing has revealed abundant and diverse circRNAs in tissue and organ development [3,4], including many tissue-specific [5] and cell-specific [6] circRNAs, which may play a role in various human disorders and biological processes [7]. Several databases have been built to support the study of circRNAs. For instance, circBase collects and unifies circRNA data sets and provides scripts to identify circRNAs in sequencing data [12]. CircRNADb provides detailed annotations of human circRNAs, including genomic information, exon splicing, genome sequence, internal ribosome entry site (IRES), open reading frame (ORF), and references [13].