Abstract

Long non-coding RNAs (long ncRNAs, lncRNAs) of all kinds are not translated into proteins, yet they have been implicated in a range of cell developmental processes and diseases. Inferring disease-associated lncRNAs by computational methods can help to understand the pathogenesis of diseases, but current computational methods have not yet achieved remarkable predictive performance, owing to problems such as the inaccurate construction of similarity networks and inadequate numbers of known lncRNA–disease associations. In this research, we propose a lncRNA–disease association inference method based on integrated space projection scores (LDAI-ISPS), composed of the following key steps: changing the Boolean network of known lncRNA–disease associations into weighted networks by combining all the global information (e.g., disease semantic similarities, lncRNA functional similarities, and known lncRNA–disease associations); and obtaining the space projection scores via vector projections of the weighted networks to form the final prediction scores without biases. Leave-one-out cross-validation (LOOCV) results showed that, compared with other methods, LDAI-ISPS achieved higher accuracy, with area-under-the-curve (AUC) values of 0.9154 for inferring diseases, 0.8865 for inferring new lncRNAs (whose associations with diseases are unknown), and 0.7518 for inferring isolated diseases (whose associations with lncRNAs are unknown). A case study also confirmed the predictive performance of LDAI-ISPS as an aid to traditional biological experiments in inferring potential lncRNA–disease associations and isolated diseases.
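The LOOCV evaluation mentioned above can be illustrated with a minimal sketch. The abstract does not spell out the exact protocol, so the following assumes a common setup: each known association is hidden in turn, the model is re-scored, and the held-out pair is ranked against all pairs with no known association. The names `auc_from_scores`, `loocv_auc`, and `score_fn` are hypothetical and not taken from the paper.

```python
def auc_from_scores(held_out_scores, candidate_scores):
    """Rank-based AUC: probability that a held-out known association
    outscores a candidate (unknown) pair, counting ties as one half."""
    wins = 0.0
    for p in held_out_scores:
        for c in candidate_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(held_out_scores) * len(candidate_scores))


def loocv_auc(assoc, score_fn):
    """Leave-one-out cross validation over known associations.

    assoc    : Boolean lncRNA x disease matrix (lists of 0/1).
    score_fn : hypothetical predictor; takes a masked association
               matrix and returns a score matrix of the same shape.
    Assumption: the per-fold AUCs are averaged into one value.
    """
    known = [(i, j) for i, row in enumerate(assoc)
             for j, a in enumerate(row) if a == 1]
    unknown = [(i, j) for i, row in enumerate(assoc)
               for j, a in enumerate(row) if a == 0]
    fold_aucs = []
    for i, j in known:
        masked = [row[:] for row in assoc]  # copy, then hide one association
        masked[i][j] = 0
        scores = score_fn(masked)
        candidates = [scores[r][c] for r, c in unknown]
        fold_aucs.append(auc_from_scores([scores[i][j]], candidates))
    return sum(fold_aucs) / len(fold_aucs)
```

Any predictor that maps a masked association matrix to a score matrix can be passed as `score_fn`; a predictor that carries no information beyond the masked matrix itself yields an AUC of 0.5 under this scheme.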

Highlights

  • Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides that are not translated into protein and are found widely across all kinds of organisms [1,2]

  • Considering the above limitations, we proposed a novel lncRNA–disease association inference method based on space projections of integrated networks (LDAI-ISPS) that consists of four steps: step one, reconstruct the integrated disease similarity network by integrating information from multiple networks; step two, change the Boolean network of known, experimentally verified associations into a weighted network so that associations between lncRNAs and diseases can be inferred more accurately; step three, use vector projections of the vectors taken from the networks of the first two steps to construct space projection scores; step four, obtain the final prediction results by integrating the two kinds of space projection scores

  • Predicting lncRNA–disease associations helps to explore the complex pathogenesis of diseases; because traditional biological methods are tedious and time-consuming, many computational methods have emerged in recent years for inferring lncRNA–disease associations on a large scale
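The projection and integration steps in the highlights (steps three and four) can be sketched as follows. The exact LDAI-ISPS formulas are not given in this summary, so this is only an illustration under stated assumptions: each score is a scalar projection of a similarity vector onto a vector of the weighted association network, and the two projection scores are averaged. The function names and the equal-weight average are hypothetical.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def scalar_projection(a, b):
    """Length of the projection of vector a onto vector b
    (defined as 0 when b is the zero vector)."""
    norm_b = math.sqrt(dot(b, b))
    return dot(a, b) / norm_b if norm_b > 0 else 0.0


def column(matrix, j):
    return [row[j] for row in matrix]


def projection_scores(weights, lnc_sim, dis_sim):
    """Combine two space projection scores for every lncRNA-disease pair.

    weights : weighted lncRNA x disease association matrix (step two).
    lnc_sim : lncRNA x lncRNA similarity matrix.
    dis_sim : disease x disease similarity matrix (step one).
    Assumption: the final score is the plain average of the two projections.
    """
    n_lnc, n_dis = len(weights), len(weights[0])
    scores = [[0.0] * n_dis for _ in range(n_lnc)]
    for i in range(n_lnc):
        for j in range(n_dis):
            # lncRNA space: similarities of lncRNA i projected onto
            # the association profile of disease j
            lnc_space = scalar_projection(lnc_sim[i], column(weights, j))
            # disease space: similarities of disease j projected onto
            # the association profile of lncRNA i
            dis_space = scalar_projection(dis_sim[j], weights[i])
            scores[i][j] = 0.5 * (lnc_space + dis_space)
    return scores
```

Under these assumptions, a pair scores highly when the lncRNA is similar to lncRNAs already linked to the disease and the disease is similar to diseases already linked to the lncRNA, which is how a pair with no known association can still receive a nonzero score.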

Summary

Introduction

Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides that are not translated into protein and are found widely across all kinds of organisms [1,2]. Lan et al. [31] used a bagging support vector machine (SVM) to predict lncRNA–disease associations based on multiple biological data resources fused by the matrix geometric mean. Randomly selecting unknown associations as negative samples is unreasonable and will seriously affect the accuracy of the prediction results, because an unknown association is not necessarily one that does not exist. To overcome the difficulty of obtaining accurate negative samples, Chen et al. [32] proposed a semi-supervised learning framework (LRLSLDA) based on Laplacian Regularized Least Squares, which does not need negative samples. Considering that accurate similarity network construction helps to improve prediction accuracy, Chen et al. [33] used the hypergeometric distribution to infer lncRNA–disease associations (HGLDA) without relying on known, experimentally verified lncRNA–disease associations, but HGLDA cannot be used for isolated diseases and new lncRNAs.

