Abstract

Studies have found that long non-coding RNAs (lncRNAs) play important roles in many human biological processes, so it is critical to explore potential lncRNA–disease associations, especially for cancer-associated lncRNAs. However, traditional biological experiments are costly and time-consuming, which makes effective computational models highly valuable. We developed MHRWRLDA, a random walk with restart algorithm on multiplex and heterogeneous networks of lncRNAs and diseases, to predict lncRNA–disease associations. First, multiple disease similarity networks were constructed by using different approaches to calculate similarity scores between diseases, and multiple lncRNA similarity networks were constructed in the same way from similarity scores between lncRNAs. Then, a multiplex and heterogeneous network was constructed by integrating these disease and lncRNA similarity networks with the known lncRNA–disease associations, and a random walk with restart was performed on it to predict lncRNA–disease associations. Leave-one-out cross-validation (LOOCV) yielded an area under the curve (AUC) of 0.68736, an improvement over the classical algorithms of recent years. Finally, we confirmed by literature mining a few novel predicted lncRNAs associated with specific diseases such as colon cancer. In summary, MHRWRLDA contributes to the prediction of lncRNA–disease associations.
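The pipeline described in the abstract can be illustrated with a small sketch. This is not the MHRWRLDA implementation: it simplifies the multiplex structure by averaging the similarity layers into a single network per node type (a true multiplex walk would also jump between layers), and every matrix value, function name, and the restart probability of 0.7 are assumptions chosen for illustration only.

```python
def average_layers(layers):
    """Collapse same-sized similarity matrices into one by averaging (a simplification)."""
    n = len(layers[0])
    return [[sum(layer[i][j] for layer in layers) / len(layers) for j in range(n)]
            for i in range(n)]


def build_heterogeneous(lnc_sim, dis_sim, assoc):
    """Stack [[lnc_sim, assoc], [assoc^T, dis_sim]] into one adjacency matrix."""
    n_l, n_d = len(lnc_sim), len(dis_sim)
    top = [lnc_sim[i] + assoc[i] for i in range(n_l)]
    bottom = [[assoc[i][j] for i in range(n_l)] + dis_sim[j] for j in range(n_d)]
    return top + bottom


def column_normalize(matrix):
    """Divide each column by its sum so that it becomes a transition distribution."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(transition, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * seed until p stops changing."""
    n = len(seed)
    p = seed[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(transition[i][j] * p[j] for j in range(n))
               + restart * seed[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy data (hypothetical): 3 lncRNAs, 2 diseases, two similarity layers each.
lnc_layers = [
    [[0, 0.2, 0.8], [0.2, 0, 0.1], [0.8, 0.1, 0]],
    [[0, 0.4, 0.6], [0.4, 0, 0.3], [0.6, 0.3, 0]],
]
dis_layers = [
    [[0, 0.5], [0.5, 0]],
    [[0, 0.3], [0.3, 0]],
]
# Known associations: lncRNA 0 with disease 0, lncRNA 1 with disease 1.
assoc = [[1, 0], [0, 1], [0, 0]]

adjacency = build_heterogeneous(average_layers(lnc_layers),
                                average_layers(dis_layers), assoc)
transition = column_normalize(adjacency)

# Seed the walk at disease 0, which sits at index 3 (after the 3 lncRNAs).
seed = [0.0, 0.0, 0.0, 1.0, 0.0]
scores = random_walk_with_restart(transition, seed)
lnc_scores = scores[:3]  # steady-state probabilities used to rank candidate lncRNAs
```

The steady-state probability of each lncRNA node serves as its predicted association score for the seed disease. Averaging the layers keeps the sketch short; it discards the per-layer structure that a multiplex walk preserves.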

Highlights

  • Numerous studies have indicated that protein-coding genes account for less than 2% of the human genome (Crick et al, 1961; Yanofsky, 2007)

  • The establishment of an effective computational model to predict the association between long non-coding RNAs (lncRNAs) and diseases can save time and money spent in biological experiments (Yao et al, 2019; Yan et al, 2020)

  • To evaluate the performance of MHRWRLDA, the receiver operating characteristic (ROC) curve was drawn by calculating TPR and FPR according to different thresholds
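The ROC evaluation mentioned above can be sketched as follows. The scores and labels are made-up toy values, not results from the study, and the function names are hypothetical. Each distinct score is used as a threshold, TPR and FPR are computed at that threshold, and the AUC is taken as the trapezoidal area under the resulting points.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per distinct score threshold, starting at (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; everything scoring >= threshold is predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under a ROC curve given as (FPR, TPR) points sorted by FPR."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example (hypothetical): predicted scores and true labels (1 = known association).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 0, 1, 1, 0, 0]

points = roc_points(toy_scores, toy_labels)
roc_area = auc(points)
```

In a leave-one-out setting, each known association would be held out in turn and scored against the unlabeled candidates before the curve is drawn; that loop is omitted here.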


Summary

Introduction

Numerous studies have indicated that protein-coding genes account for less than 2% of the human genome (Crick et al, 1961; Yanofsky, 2007). There are many non-translatable RNAs, called non-coding RNAs (ncRNAs), which were long considered transcriptional noise (Zhang et al, 2017; Xu et al, 2020). Long non-coding RNAs (lncRNAs), whose lengths are greater than 200 nucleotides, are an important class of ncRNAs (Mercer et al, 2009). There is increasing evidence that lncRNAs play key roles in many important biological processes and diseases (Akerman et al, 2017; Wang et al, 2019; Peng et al, 2020). For example, lncRNAs associated with tumor immune invasion in non-small cell lung cancer (NSCLC) are valuable for improving clinical efficacy and immunotherapy; compared with normal controls, the expression of gabpb1-it was significantly downregulated in NSCLC. The establishment of an effective computational model to predict the association between lncRNAs and diseases can save time and money spent in biological experiments (Yao et al, 2019; Yan et al, 2020).

