Abstract
Accumulating evidence shows that circular RNAs (circRNAs) play significant roles in human health and in the occurrence and development of diseases. Biological researchers have identified disease-related circRNAs that could serve as potential biomarkers for clinical diagnosis, prognosis, and treatment. However, identifying circRNA–disease associations with traditional biological experiments remains expensive and time-consuming. In this study, we propose a novel method named MSFCNN for circRNA–disease association prediction, which applies two-layer convolutional neural networks to a feature matrix that fuses multiple similarity kernels and interaction features among circRNAs, miRNAs, and diseases. First, four circRNA similarity kernels and seven disease similarity kernels are constructed based on the biological or topological properties of circRNAs and diseases. Subsequently, the similarity kernel fusion method is used to integrate these kernels into one circRNA similarity kernel and one disease similarity kernel, respectively. Then, a feature matrix for each circRNA–disease pair is constructed by integrating the fused circRNA similarity kernel and the fused disease similarity kernel with interactions and features among circRNAs, miRNAs, and diseases. The features of circRNA–miRNA and disease–miRNA interactions are selected using principal component analysis. Finally, taking the constructed feature matrix as input, we use two-layer convolutional neural networks to predict circRNA–disease association labels and mine potential novel associations. Five-fold cross validation shows that our proposed model outperforms conventional machine learning methods, including support vector machine, random forest, and multilayer perceptron approaches. Furthermore, case studies of predicted circRNAs for specific diseases and the top predicted circRNA–disease associations are analyzed. The results show that the MSFCNN model could be an effective tool for mining potential circRNA–disease associations.
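The feature-construction steps described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes NumPy and random toy matrices, substitutes a plain element-wise mean for the similarity kernel fusion method (whose details are not given here), and uses hypothetical helper names (`fuse_kernels`, `pca_reduce`, `pair_features`). The layout of the per-pair feature vector is also an assumption.

```python
import numpy as np

def fuse_kernels(kernels):
    """Stand-in fusion (assumption): element-wise mean of same-shaped kernels."""
    return np.mean(np.stack(kernels), axis=0)

def pca_reduce(X, k):
    """Project the rows of X onto its top-k principal components via SVD."""
    Xc = X - X.mean(axis=0, keepdims=True)           # center each column
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def pair_features(c, d, circ_sim, dis_sim, circ_mir, dis_mir):
    """Hypothetical layout: similarity rows plus reduced miRNA features."""
    return np.concatenate([circ_sim[c], dis_sim[d], circ_mir[c], dis_mir[d]])

rng = np.random.default_rng(0)
n_circ, n_dis, n_mir = 6, 4, 10                      # toy sizes, not from the paper

def random_kernel(n):
    a = rng.random((n, n))
    return (a + a.T) / 2                             # symmetric toy similarity

# Four circRNA kernels and seven disease kernels, as in the abstract.
circ_sim = fuse_kernels([random_kernel(n_circ) for _ in range(4)])
dis_sim = fuse_kernels([random_kernel(n_dis) for _ in range(7)])

# Toy binary interaction matrices, reduced with PCA.
circ_mir = pca_reduce(rng.integers(0, 2, (n_circ, n_mir)).astype(float), k=3)
dis_mir = pca_reduce(rng.integers(0, 2, (n_dis, n_mir)).astype(float), k=3)

# Feature vector for the pair (circRNA 2, disease 1).
x = pair_features(2, 1, circ_sim, dis_sim, circ_mir, dis_mir)
print(x.shape)
```

In a full pipeline, one such vector (or a matrix built from the same pieces) would be produced for every circRNA–disease pair and passed to the classifier. The CNN itself and the cross-validation loop are omitted here.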
Highlights
Circular RNAs are a type of endogenous noncoding RNA with continuous covalently closed loop structures, which are produced by back-splicing or lariat events in genes (Barrett et al., 2015)
The results indicate that the MSFCNN model outperforms several conventional machine learning classifiers
Case studies of breast cancer, colorectal cancer, hepatocellular carcinoma, and acute myeloid leukemia indicate that MSFCNN could be an effective tool to infer potential circRNA–disease associations
Summary
Circular RNAs (circRNAs) are a type of endogenous noncoding RNA with continuous covalently closed loop structures, which are produced by back-splicing or lariat events in genes (Barrett et al., 2015). Lei et al. (2018) developed a path-weighted model to predict circRNA–disease associations based on circRNA semantic similarity and disease functional similarity. Several circRNAs can bind with the corresponding miRNAs and participate in multiple biological processes synchronously (Qu et al., 2018). Based on this theory, Fang and Lei (2019) used an improved random walk algorithm, named KRWRMC, to predict circRNA–miRNA associations. The CSCRSites model was proposed to predict cancer-specific protein binding sites on circRNAs based on CNNs. The features learned by the CSCRSites model are converted to sequence motifs, some of which are involved in human diseases (Wang Z. et al., 2019). Case studies of breast cancer, colorectal cancer, hepatocellular carcinoma, and acute myeloid leukemia indicate that MSFCNN could be an effective tool to infer potential circRNA–disease associations.