Abstract

Accumulating evidence indicates that lncRNAs play an important role in various complex human diseases. However, the number of known disease-related lncRNAs remains comparatively small, and experimental identification is time-consuming and labor-intensive. Developing computational methods for inferring potential associations between lncRNAs and diseases has therefore become an active research topic; such methods can help researchers explore complex human diseases at the molecular level and improve disease diagnosis, therapy, prognosis and prevention. In this paper, we propose TPGLDA, a novel method for predicting lncRNA-disease associations via a lncRNA-disease-gene tripartite graph, which integrates gene-disease associations with lncRNA-disease associations. Compared with previous studies, TPGLDA better delineates the heterogeneity of coding and non-coding gene-disease associations and can effectively identify potential lncRNA-disease associations. Under leave-one-out cross validation, TPGLDA achieves an AUC value of 93.9%, which demonstrates its good predictive performance. Moreover, the top 5 ranked predictions for lung cancer, hepatocellular carcinoma and ovarian cancer are manually confirmed using relevant databases and the literature, providing convincing evidence of the good performance and potential value of TPGLDA in identifying potential lncRNA-disease associations. Matlab and R code for TPGLDA is available at https://github.com/USTC-HIlab/TPGLDA.

Highlights

  • Long non-coding RNAs are a new class of transcripts longer than 200 nt[1,2,3] that have been implicated in a number of normal physiological processes at every stage of life, from embryonic development and cell fate determination to physiological homoeostasis of entire organisms[4].

  • Inferring potential associations between long non-coding RNAs (lncRNAs) and diseases can help us understand the pathogenesis of complex diseases at the molecular level and can benefit biomarker identification for disease diagnosis, therapy, prognosis and monitoring[5].

  • Despite the success of the aforementioned methods, another important factor in inferring potential lncRNA-disease associations is that coding and non-coding genes often cooperate in human diseases, as demonstrated in many previous studies[19,20,21,22]. For example, Sahu et al.[23] demonstrate that the coding gene TAF1D and the lncRNA SNHG1 are highly co-expressed in neuroblastoma.


Summary

Results

Note that Yang’s method[22] is not assessed here, as it requires the node degree of each candidate to be ≥2. As a result, both LRLSLDA and KRWRH achieve reliable performance, with AUC values of 82.2% and 83.8%, respectively, while TPGLDA improves on both with an AUC value of 93.9%. We also report the corresponding recall rate (Fig. 4), which measures the number of known lncRNA-disease association pairs that are correctly identified within the top k candidates of the ranking lists[31,32]. We further report a Friedman rank sum test on our dataset to show the statistical significance of the performance improvement of TPGLDA (Supplementary Table S6). These examinations demonstrate that TPGLDA has the practical ability to predict various potential lncRNA-disease associations. Considering the fact that the number of disease-related genes is an order of LncRNA
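
The recall measure described above can be illustrated with a minimal Python sketch. It is not taken from the TPGLDA code: it assumes that each known lncRNA-disease pair has been held out in turn and that its rank in the resulting candidate list is already available. The function names `hits_at_k` and `recall_at_k` and the rank values are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def hits_at_k(held_out_ranks: Sequence[int], k: int) -> int:
    """Count held-out known pairs ranked within the top k (ranks are 1-based)."""
    return sum(1 for rank in held_out_ranks if rank <= k)


def recall_at_k(held_out_ranks: Sequence[int], k: int) -> float:
    """Fraction of held-out known pairs recovered within the top k candidates."""
    if not held_out_ranks:
        return 0.0
    return hits_at_k(held_out_ranks, k) / len(held_out_ranks)


# Hypothetical ranks of five held-out known associations (illustrative only).
example_ranks = [1, 3, 7, 12, 40]
for k in (5, 10, 50):
    print(k, hits_at_k(example_ranks, k), recall_at_k(example_ranks, k))
```

Sweeping k over a range of cutoffs gives a recall curve of the kind that Fig. 4 is described as reporting; whether the paper plots counts or fractions is not stated in this summary, so the sketch returns both.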

BANCR unconfirmed
Materials and Methods
Additional Information