Abstract

Background: A large number of experimental studies show that the mutation and regulation of long non-coding RNAs (lncRNAs) are associated with various human diseases. Accurate prediction of lncRNA-disease associations can provide a new perspective for the diagnosis and treatment of diseases. The main function of many lncRNAs is still unclear and using traditional experiments to detect lncRNA-disease associations is time-consuming.

Results: In this paper, we develop a novel and effective method for the prediction of lncRNA-disease associations using network feature similarity and gradient boosting (LDNFSGB). In LDNFSGB, we first construct a comprehensive feature vector to effectively extract the global and local information of lncRNAs and diseases through considering the disease semantic similarity (DISSS), the lncRNA function similarity (LNCFS), the lncRNA Gaussian interaction profile kernel similarity (LNCGS), the disease Gaussian interaction profile kernel similarity (DISGS), and the lncRNA-disease interaction (LNCDIS). Particularly, two methods are used to calculate the DISSS (LNCFS) for considering the local and global information of disease semantics (lncRNA functions) respectively. An autoencoder is then used to reduce the dimensionality of the feature vector to obtain the optimal feature parameter from the original feature set. Furthermore, we employ the gradient boosting algorithm to obtain the lncRNA-disease association prediction.

Conclusions: In this study, hold-out, leave-one-out cross-validation, and ten-fold cross-validation methods are implemented on three publicly available datasets to evaluate the performance of LDNFSGB. Extensive experiments show that LDNFSGB dramatically outperforms other state-of-the-art methods. The case studies on six diseases, including cancers and non-cancers, further demonstrate the effectiveness of our method in real-world applications.
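To illustrate one of the similarity measures named above, the following is a minimal sketch of a Gaussian interaction profile kernel similarity (as in LNCGS and DISGS). It assumes the commonly used definition, in which the similarity of two entities is exp(-gamma * squared distance between their binary interaction profiles) and gamma is a bandwidth normalized by the mean squared profile norm. The function names, the `gamma_prime` parameter, and the toy association matrix are hypothetical and are not taken from the paper, whose exact normalization may differ.

```python
import math


def gip_kernel_similarity(profiles, gamma_prime=1.0):
    """Pairwise Gaussian interaction profile kernel similarity.

    `profiles` is a list of binary interaction profiles (one row per entity).
    `gamma_prime` is an assumed bandwidth parameter, not a value from the paper.
    """
    n = len(profiles)
    # Squared Euclidean norm of each interaction profile.
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Normalize the bandwidth by the average profile norm.
    gamma = gamma_prime / mean_sq_norm

    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = sim[j][i] = math.exp(-gamma * dist_sq)
    return sim


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


# Toy lncRNA-disease association matrix (rows: lncRNAs, columns: diseases).
associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]

lnc_gs = gip_kernel_similarity(associations)             # lncRNA-lncRNA similarity
dis_gs = gip_kernel_similarity(transpose(associations))  # disease-disease similarity
```

In this toy example, lncRNAs with identical association profiles receive a similarity of 1, and the similarity decays as the profiles diverge. The same function yields the disease-side similarity when applied to the transposed matrix.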

Highlights

  • Accumulating evidence shows that protein-coding genes account for only ∼2 percent of the human genome, and the remaining ∼98 percent is classified as non-coding [1]

  • To construct the best similarity features, we run a comparative experiment on the LncRNADisease dataset and analyze the results of LDNFSGB under different feature vectors

Summary

Introduction

A large number of experimental studies show that the mutation and regulation of long non-coding RNAs (lncRNAs) are associated with various human diseases. LncRNAs play an increasingly important role in fundamental biological processes such as translational regulation, cell cycle regulation, epigenetic regulation, splicing, differentiation, and immune response [5]. For example, loss of the lncRNA HOTAIR can inhibit cancer invasion, especially in cells with excessive PRC2 activity. These findings suggest that lncRNAs have a positive role in regulating the epigenome of cancer and may be an important target for cancer diagnosis and treatment [10]. An effective and efficient computational model for predicting lncRNA-disease associations is therefore essential [12, 15].
