Abstract

Recent studies suggest that circular RNAs (circRNAs) are closely related to the occurrence and development of human diseases and hold considerable promise as diagnostic markers. However, identifying circRNA-disease associations through biological experiments is usually time-consuming and labor-intensive, and is constrained by experimental environment and conditions. In this study, we propose NMFCDA, a novel computational framework that combines a randomization-based neural network, Pseudoinverse Learning (PIL), with Non-negative Matrix Factorization (NMF) to predict circRNA-disease associations. The model first fuses circRNA natural language sequence information, disease semantic information, and the Gaussian interaction profile (GIP) kernel similarity of circRNAs and diseases into a unified matrix. It then applies the NMF algorithm to extract key features, and finally uses randomization-based PIL to search for the global optimal solution and predict circRNA-disease associations. On the benchmark data set circR2Disease, NMFCDA achieved a prediction accuracy of 92.56% and an AUC of 0.9278, significantly higher than other classifier models and previously published methods. Furthermore, 26 of the 30 disease-associated circRNAs with the highest prediction scores were confirmed by the relevant literature. These results indicate that NMFCDA can serve as a useful prediction tool, providing a theoretical basis and reliable circRNA candidates for biological experiments.
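
The three stages named in the abstract (similarity fusion, NMF feature extraction, and a randomized network solved by pseudoinverse) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy random association matrix, uses only GIP kernel similarity (sequence and semantic information are omitted), uses standard multiplicative-update NMF, and stands in for PIL with a generic single-hidden-layer network whose random hidden weights are fixed and whose output weights are solved with the Moore-Penrose pseudoinverse. All function names, sizes, and hyperparameters are hypothetical.

```python
import numpy as np

def gip_kernel(profiles):
    """GIP kernel similarity between the rows of a binary association matrix."""
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_norm = sq_norms.mean()
    # Bandwidth normalised by the mean squared profile norm (guarded against 0).
    gamma = 1.0 / mean_norm if mean_norm > 0 else 1.0
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor non-negative V ~ W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank))
    H = rng.random((rank, V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def fit_random_pinv(X, y, n_hidden=64, seed=0):
    """Assumed stand-in for PIL: random hidden layer, pseudoinverse output weights."""
    rng = np.random.default_rng(seed)
    W_in = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    hidden = np.tanh(X @ W_in + b)
    W_out = np.linalg.pinv(hidden) @ y  # least-squares solution, no iteration
    return W_in, b, W_out

def predict(X, params):
    W_in, b, W_out = params
    return np.tanh(X @ W_in + b) @ W_out

# Toy data (hypothetical): 8 circRNAs x 5 diseases, random known associations.
rng = np.random.default_rng(42)
A = (rng.random((8, 5)) < 0.3).astype(float)

circ_sim = gip_kernel(A)    # circRNA-circRNA similarity, shape (8, 8)
dis_sim = gip_kernel(A.T)   # disease-disease similarity, shape (5, 5)

# One fused descriptor per circRNA-disease pair: both similarity rows concatenated.
pairs = [(i, j) for i in range(A.shape[0]) for j in range(A.shape[1])]
fused = np.array([np.concatenate([circ_sim[i], dis_sim[j]]) for i, j in pairs])
labels = np.array([A[i, j] for i, j in pairs])

features, _ = nmf(fused, rank=4)             # low-rank non-negative features
params = fit_random_pinv(features, labels)   # closed-form output weights
scores = predict(features, params)           # one association score per pair
```

In this sketch the scores are computed on the same pairs used for fitting, so they only demonstrate the data flow. A real evaluation would use held-out pairs, for example with cross-validation.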
