Abstract

BackgroundThe existing studies show that circRNAs can be used as a biomarker of diseases and play a prominent role in the treatment and diagnosis of diseases. However, the relationships between the vast majority of circRNAs and diseases are still unclear, and more experiments are needed to study the mechanism of circRNAs. Nowadays, some scholars use the attributes between circRNAs and diseases to study and predict their associations. Nonetheless, most of the existing experimental methods use less information about the attributes of circRNAs, which has a certain impact on the accuracy of the final prediction results. On the other hand, some scholars also apply experimental methods to predict the associations between circRNAs and diseases. But such methods are usually expensive and time-consuming. Based on the above shortcomings, follow-up research is needed to propose a more efficient calculation-based method to predict the associations between circRNAs and diseases.ResultsIn this study, a novel algorithm (method) is proposed, which is based on the Graph Convolutional Network (GCN) constructed with Random Walk with Restart (RWR) and Principal Component Analysis (PCA) to predict the associations between circRNAs and diseases (CRPGCN). In the construction of CRPGCN, the RWR algorithm is used to improve the similarity associations of the computed nodes with their neighbours. After that, the PCA method is used to dimensionality reduction and extract features, it makes the connection between circRNAs with higher similarity and diseases closer. Finally, The GCN algorithm is used to learn the features between circRNAs and diseases and calculate the final similarity scores, and the learning datas are constructed from the adjacency matrix, similarity matrix and feature matrix as a heterogeneous adjacency matrix and a heterogeneous feature matrix.ConclusionsAfter 2-fold cross-validation, 5-fold cross-validation and 10-fold cross-validation, the area under the ROC curve of the CRPGCN is 0.9490, 0.9720 and 0.9722, respectively. The CRPGCN method has a valuable effect in predict the associations between circRNAs and diseases.

Highlights

  • The existing studies show that Circular RNA (circRNA) can be used as a biomarker of diseases and play a prominent role in the treatment and diagnosis of diseases

  • Evaluation method and metrices The Receiver operating characteristic (ROC) curve is drawn based on True positive rate (TPR) and False positive rate (FPR)

  • The experiment used a variety of methods to assess performance, including recall (Recall), F1 score (F1), accuracy (ACC), Matthew correlation coefficient (MCC), area under the receiver operating characteristic curve (AUC) and area under precision-recall curve (AUPR)

Read more

Summary

Introduction

The existing studies show that circRNAs can be used as a biomarker of diseases and play a prominent role in the treatment and diagnosis of diseases. The relationships between the vast majority of circRNAs and diseases are still unclear, and more experiments are needed to study the mechanism of circRNAs. Nowadays, some scholars use the attributes between circRNAs and diseases to study and predict their associations. Some scholars apply experimental methods to predict the associations between circRNAs and diseases. Such methods are usually expensive and time-consuming. Based on the above shortcomings, follow-up research is needed to propose a more efficient calculation-based method to predict the associations between circRNAs and diseases. It is more necessary to study the relationships between circRNAs and diseases based on computational methods

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call