Abstract

Background: Numerous studies on the roles of long non-coding RNAs (lncRNAs) in the occurrence, development and prognosis of various human diseases have drawn substantial attention. Since only a tiny portion of lncRNA-disease associations have been properly annotated, an increasing number of computational methods have been proposed for predicting potential lncRNA-disease associations. However, traditional prediction models lack the ability to precisely extract features of biomolecules, so a model that can identify potential lncRNA-disease associations both efficiently and accurately is urgently needed.

Results: In this study, we proposed a novel model, SVDNVLDA, which obtains the linear and non-linear features of lncRNAs and diseases with Singular Value Decomposition (SVD) and node2vec, respectively. The integrated features were constructed by concatenating the linear and non-linear features of each entity, which could effectively enhance the semantics contained in the final representations. An XGBoost classifier was then employed to identify potential lncRNA-disease associations.

Conclusions: We propose a novel model to predict lncRNA-disease associations. This model is expected to identify potential relationships between lncRNAs and diseases and to help explore disease mechanisms at the lncRNA molecular level.
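
The feature integration described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `integrate` and `pair_features` are hypothetical, the random arrays stand in for real SVD and node2vec outputs, and representing a lncRNA-disease pair by concatenating the two entity vectors is an assumption about how samples are built.

```python
import numpy as np


def integrate(linear, nonlinear):
    """Concatenate per-entity linear and non-linear features column-wise."""
    if linear.shape[0] != nonlinear.shape[0]:
        raise ValueError("both feature matrices must have one row per entity")
    return np.hstack([linear, nonlinear])


def pair_features(lnc_feats, dis_feats, pairs):
    """Build one sample vector per (lncRNA index, disease index) pair.

    Assumption: a pair is represented by the lncRNA vector followed by
    the disease vector.
    """
    idx = np.asarray(pairs)
    return np.hstack([lnc_feats[idx[:, 0]], dis_feats[idx[:, 1]]])


# Placeholder inputs (random, not real data): 4 lncRNAs and 2 diseases,
# with 3 linear (SVD-like) and 5 non-linear (node2vec-like) dimensions.
rng = np.random.default_rng(1)
lnc = integrate(rng.normal(size=(4, 3)), rng.normal(size=(4, 5)))
dis = integrate(rng.normal(size=(2, 3)), rng.normal(size=(2, 5)))
X = pair_features(lnc, dis, [(0, 1), (3, 0)])
print(X.shape)
```

A matrix such as `X`, together with known association labels, is the kind of input a gradient-boosted classifier such as XGBoost would be trained on; the classifier itself is left out to keep the sketch dependency-free.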

Highlights

  • Numerous studies on the roles of long non-coding RNAs in the occurrence, development and prognosis of various human diseases have drawn substantial attention

  • We propose an integrated feature extraction model, the Singular Value Decomposition (SVD) and node2vec based lncRNA-disease association prediction model (SVDNVLDA), to predict potential long non-coding RNA (lncRNA)-disease associations

  • After obtaining the linear feature matrices U and V^T with SVD, we found a huge decay gap, from 10^-1 to 10^-14, between the 173rd and the 174th dimensions of the importance matrix (Additional file 1)
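
The decay gap mentioned in the last highlight suggests a natural truncation point for the SVD features. The sketch below shows one way such a cut-off could be located automatically. It is an illustration under assumptions rather than the paper's procedure: the function name `svd_linear_features` and the `gap_ratio` threshold are hypothetical, and the toy matrix is synthetic.

```python
import numpy as np


def svd_linear_features(assoc, gap_ratio=1e6):
    """Truncate an SVD at the first sharp drop in the singular values.

    assoc: (n_lncrna, n_disease) association matrix.
    gap_ratio: hypothetical threshold; a drop by more than this factor
    between consecutive singular values marks the cut-off.
    Returns (lncRNA features, disease features, chosen rank k).
    """
    u, s, vt = np.linalg.svd(assoc, full_matrices=False)
    k = len(s)  # keep every dimension if no sharp drop is found
    for i in range(len(s) - 1):
        if s[i + 1] <= 0.0 or s[i] / s[i + 1] > gap_ratio:
            k = i + 1
            break
    # Rows of u[:, :k] describe lncRNAs; rows of vt[:k, :].T describe diseases.
    return u[:, :k], vt[:k, :].T, k


# Synthetic rank-3 matrix (not real association data).
rng = np.random.default_rng(0)
toy = rng.normal(size=(8, 3)) @ rng.normal(size=(3, 6))
lnc_feats, dis_feats, k = svd_linear_features(toy)
print(k, lnc_feats.shape, dis_feats.shape)
```

Whether the retained columns should also be scaled by the singular values is a design choice that the text here does not specify, so the sketch returns them unscaled.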


Summary

Introduction

Numerous studies on the roles of long non-coding RNAs (lncRNAs) in the occurrence, development and prognosis of various human diseases have drawn substantial attention. Traditional prediction models lack the ability to precisely extract features of biomolecules, so a model that can identify potential lncRNA-disease associations both efficiently and accurately is urgently needed. Most non-coding genes are transcribed into non-coding RNAs (ncRNAs). As their name implies, ncRNAs cannot be directly translated into proteins, so for decades they were often regarded as the "noise" of genome transcription, without any biological function. MALAT1, also known as NEAT2, was found to be upregulated in non-small cell lung cancer tissues and could serve as an early prognostic biomarker [15]; lncRNA HOTAIR has been explored as a potential biomarker for detecting hepatocellular carcinoma relapse [16].
