Abstract

Recent research has found that tumor cell invasion, proliferation, and other biological processes are controlled by circular RNA (circRNA). Understanding the association between circRNAs and diseases is an important way to explore the pathogenesis of complex diseases and promote disease-targeted therapy. Most methods based on the analysis of high-throughput expression data, such as k-mer and PSSM, tend to assume that functionally similar nucleic acids lack direct linear homology; they disregard positional information and quantify only nonlinear sequence relationships. However, in many complex diseases, the nonlinear sequence relationships of pathogenic and ordinary nucleic acids differ little. Analyzing how positional information is expressed can therefore help to predict the complex associations between circRNAs and diseases. To fill this gap, we propose a new method, named iCDA-CGR, to predict circRNA-disease associations. In particular, we introduce circRNA sequence information and, for the first time in a circRNA-disease prediction model, quantify the nonlinear sequence relationship of circRNAs with Chaos Game Representation (CGR) technology based on biological sequence position information. In the cross-validation experiment, our method achieved an AUC of 0.8533, which was significantly higher than that of other existing methods. In validation on the independent data sets circ2Disease, circRNADisease and CRDD, the prediction accuracy of iCDA-CGR reached 95.18%, 90.64% and 95.89%, respectively. Moreover, in the case studies, 19 of the top 30 circRNA-disease associations predicted by iCDA-CGR on the circRDisease dataset were confirmed by newly published literature. These results demonstrate that iCDA-CGR has outstanding robustness and stability, and can provide highly credible candidates for biological experiments.
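To illustrate what a position-aware encoding such as CGR captures, the following is a minimal sketch of the classic Chaos Game Representation for a nucleotide sequence. It is not the iCDA-CGR implementation: the corner assignment, the starting point at the centre of the unit square, and the function name `cgr_points` are assumptions made here for illustration only.

```python
from typing import List, Tuple

# Assumed corner layout on the unit square (a common convention, not
# necessarily the one used by iCDA-CGR).
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(sequence: str) -> List[Tuple[float, float]]:
    """Return the CGR trajectory of a nucleotide sequence.

    Starting from the centre of the unit square, each nucleotide moves
    the current point halfway towards that nucleotide's corner, so every
    point depends on the order of all preceding nucleotides.
    """
    x, y = 0.5, 0.5  # assumed starting point: centre of the square
    points: List[Tuple[float, float]] = []
    # RNA uses U where DNA uses T; treat them as the same corner.
    for base in sequence.upper().replace("U", "T"):
        if base not in CORNERS:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points
```

Under these assumptions, `cgr_points("AC")` and `cgr_points("CA")` end at different coordinates even though both sequences have the same nucleotide composition, which is the kind of positional information that a plain composition count does not retain.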

Highlights

  • Circular RNA is a type of non-coding RNA without a 5’ end cap or a 3’ end poly(A) tail [1]

  • Understanding the association between circular RNA (circRNA) and diseases is an important step to explore the pathogenesis of complex diseases and promote disease-targeted therapy

  • The positional information of circRNA sequences is introduced into a circRNA-disease association prediction model for the first time



Introduction

Circular RNA (circRNA) is a type of non-coding RNA without a 5’ end cap or a 3’ end poly(A) tail [1]. Since circRNA was discovered in RNA viruses 40 years ago, more than 100,000 circRNAs have been found in cells [2]. With the rapid development of RNA sequencing (RNA-seq) technology and bioinformatics, more and more studies have shown that circRNA plays an important role in many cell activities, including affecting arteriosclerosis, participating in the regulation of mRNA expression, and regulating alternative splicing [3,4,5,6,7,8]. Zhou et al. found that circRNA_010567 suppresses miR-141 through targeting TGF-beta to promote myocardial fibrosis [9]. The high cost and long cycle of traditional experimental methods prevent them from verifying associations between circRNAs and diseases on a large scale. To address this problem, computational methods have emerged [12,13,14,15,16].

