Abstract

Growing evidence indicates that circular RNAs (circRNAs) are widely involved in the occurrence and development of diseases. Identifying associations between circRNAs and diseases is crucial for exploring the pathogenesis of complex diseases and for improving their diagnosis and treatment. However, because the mechanisms linking circRNAs and diseases are complex, discovering new circRNA-disease associations through biological experiments is expensive and time-consuming. There is therefore an increasingly urgent need for computational methods that predict novel circRNA-disease associations. In this study, we propose GCNCDA, a computational method based on the deep learning algorithm Fast learning with Graph Convolutional Networks (FastGCN), to predict potential disease-associated circRNAs. Specifically, the method first forms a unified descriptor by fusing disease semantic similarity information with disease and circRNA Gaussian Interaction Profile (GIP) kernel similarity information, based on known circRNA-disease associations. The FastGCN algorithm is then used to extract the high-level features contained in the fused descriptor. Finally, new circRNA-disease associations are predicted by the Forest by Penalizing Attributes (Forest PA) classifier. In 5-fold cross-validation on the circR2Disease benchmark dataset, GCNCDA achieved 91.2% accuracy and 92.78% sensitivity, with an AUC of 90.90%. GCNCDA is also competitive when compared with different classifier models, different feature extraction models and other state-of-the-art methods. Furthermore, we conducted case studies on breast cancer, glioma and colorectal cancer, in which 16, 15 and 17 of the top 20 candidate circRNAs with the highest prediction scores, respectively, were confirmed by relevant literature and databases. These results suggest that GCNCDA can effectively predict potential circRNA-disease associations and provide highly credible candidates for biological experiments.
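
As an illustration of the GIP kernel similarity named in the abstract, the following is a minimal Python sketch. It assumes the commonly used Gaussian interaction profile definition, in which the similarity of two entities is exp(-gamma * ||a_i - a_j||^2) over their rows of a binary association matrix, with gamma normalized by the mean squared row norm. The abstract does not state the formula, so this definition, the function name `gip_kernel`, the parameter `gamma_prime` and the toy matrix are all assumptions and not the authors' implementation.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Assumed definition (not taken from the abstract):
        K[i, j] = exp(-gamma * ||p_i - p_j||^2)
        gamma   = gamma_prime / mean_i(||p_i||^2)
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative rounding errors
    return np.exp(-gamma * sq_dists)


# Hypothetical toy association matrix: 3 circRNAs (rows) x 4 diseases (columns)
A = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
])

circ_sim = gip_kernel(A)       # circRNA-circRNA similarity, shape (3, 3)
disease_sim = gip_kernel(A.T)  # disease-disease similarity, shape (4, 4)
```

Both similarity matrices come from the same association matrix: rows give circRNA profiles and columns (the transpose) give disease profiles. How these matrices are fused with disease semantic similarity into the unified descriptor is not specified in the abstract and is not sketched here.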

Highlights

  • As a new type of endogenous non-coding RNA, circular RNA has a closed-loop structure without a 5′ cap or a 3′ polyadenylated tail [1,2,3]

  • In case studies on breast cancer, glioma and colorectal cancer, 16, 15 and 17 of the top 20 candidate circular RNAs (circRNAs) with the highest prediction scores, respectively, were confirmed by relevant literature and databases

  • GCNCDA is evaluated with standard criteria: accuracy (Accu.), sensitivity (Sen.), precision (Prec.), F1-score (F1) and Matthews correlation coefficient (MCC)
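
The evaluation criteria listed above can all be computed from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical and are not results from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""

    def safe_div(num, den):
        # Report 0.0 when a metric is undefined (zero denominator).
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    sensitivity = safe_div(tp, tp + fn)  # also called recall
    precision = safe_div(tp, tp + fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only
metrics = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC uses all four confusion-matrix cells, so it stays informative when the positive and negative classes are unbalanced, whereas accuracy alone can be misleading in that case.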


Summary

Introduction

As a new type of endogenous non-coding RNA, circular RNA (circRNA) has a closed-loop structure without a 5′ cap or a 3′ polyadenylated tail [1,2,3]. As early as 1971, researchers discovered a viroid genome in potatoes composed of single-stranded, closed RNA molecules [4]. In 1995, researchers [6] found that the mouse sex-determining gene Sry produces circular transcripts. These findings did not attract much attention at the time. It was not until 2012 that Salzman et al. [7], with the help of high-throughput sequencing technology, reported about 80 circRNAs for the first time. A large number of circRNA molecules have been identified

