A representation learning model based on variational inference and graph autoencoder for predicting lncRNA-disease associations

Zhuangwei Shi, Xiongwen Quan, Yanbin Yin, Han Zhang, Chen Jin

doi:10.1186/s12859-021-04073-z

Open Access

https://doi.org/10.1186/s12859-021-04073-z

Abstract

Background: Numerous studies have demonstrated that long non-coding RNAs (lncRNAs) are associated with many human diseases. Predicting potential lncRNA-disease associations is therefore crucial for disease prognosis, diagnosis and therapy. Dozens of machine learning and deep learning algorithms have been applied to this problem, yet it remains challenging to learn efficient low-dimensional representations from high-dimensional features of lncRNAs and diseases in order to predict unknown lncRNA-disease associations accurately.

Results: We proposed an end-to-end model, VGAELDA, which integrates variational inference and graph autoencoders for lncRNA-disease association prediction. VGAELDA contains two kinds of graph autoencoders. Variational graph autoencoders (VGAE) infer representations from the features of lncRNAs and diseases respectively, while graph autoencoders propagate labels via known lncRNA-disease associations. These two kinds of autoencoders are trained alternately using the variational expectation maximization algorithm. Combining VGAE-based graph representation learning with alternate training via variational inference strengthens the capability of VGAELDA to capture efficient low-dimensional representations from high-dimensional features, and hence improves the robustness and precision of predicting unknown lncRNA-disease associations. Further analysis shows that the co-training framework of lncRNA and disease designed for VGAELDA solves a geometric matrix completion problem, capturing efficient low-dimensional representations via a deep learning approach.

Conclusion: Cross validations and numerical experiments illustrate that VGAELDA outperforms the current state-of-the-art methods in lncRNA-disease association prediction. Case studies indicate that VGAELDA is capable of detecting potential lncRNA-disease associations. The source code and data are available at https://github.com/zhanglabNKU/VGAELDA.
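The feature-inference step described in the abstract can be illustrated with a minimal sketch of a generic variational graph autoencoder forward pass. This is not the authors' implementation (see the linked repository for that); it assumes the commonly used VGAE layout of a two-layer graph convolutional encoder, a Gaussian latent with the reparameterization trick, and an inner-product decoder. The function names, the toy graph, and the layer sizes are all hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops so every degree is >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def vgae_forward(adj, features, params, rng):
    """One forward pass of a generic VGAE (assumed layout, not VGAELDA itself)."""
    a_norm = normalize_adjacency(adj)
    # Shared first graph-convolution layer with ReLU.
    hidden = np.maximum(a_norm @ features @ params["w0"], 0.0)
    # Two heads: mean and log standard deviation of q(z | X, A).
    mu = a_norm @ hidden @ params["w_mu"]
    log_sigma = a_norm @ hidden @ params["w_sigma"]
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    z = mu + np.exp(log_sigma) * rng.standard_normal(mu.shape)
    # Inner-product decoder: edge probability = sigmoid(z_i . z_j).
    adj_prob = 1.0 / (1.0 + np.exp(-(z @ z.T)))
    return z, mu, log_sigma, adj_prob


def kl_to_standard_normal(mu, log_sigma):
    """KL(q(z) || N(0, I)), summed over all nodes and latent dimensions."""
    return -0.5 * np.sum(1.0 + 2.0 * log_sigma - mu**2 - np.exp(2.0 * log_sigma))


# Hypothetical toy input: a 4-node path graph with 6 random features per node.
rng = np.random.default_rng(0)
adj = np.array(
    [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float
)
features = rng.standard_normal((4, 6))
params = {
    "w0": 0.1 * rng.standard_normal((6, 5)),       # 6 features -> 5 hidden units
    "w_mu": 0.1 * rng.standard_normal((5, 2)),     # 5 hidden -> 2 latent dims
    "w_sigma": 0.1 * rng.standard_normal((5, 2)),
}
z, mu, log_sigma, adj_prob = vgae_forward(adj, features, params, rng)
kl = kl_to_standard_normal(mu, log_sigma)
print(z.shape, adj_prob.shape, float(kl))
```

In a trainable version, the weights would be updated to maximize the reconstruction likelihood of the graph minus this KL term. The alternate training with the label-propagation autoencoder described in the abstract is not shown here.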

Highlights

Long non-coding RNAs (lncRNAs) are RNAs longer than 200 nucleotides that lack protein-coding function, yet they can still influence a series of biological processes, such as gene transcription, cell apoptosis, hormonal regulation, and immune response.
VGAELDA has the following advantages. (i) The variational graph autoencoder (VGAE) is well suited to inferring low-dimensional representations from high-dimensional features in a graph, and these representations better depict similarities and dependencies among nodes. This significantly enhances the robustness and precision of prediction without handcrafted feature similarities. (ii) VGAELDA implements the variational expectation maximization (EM) algorithm as a representation learning framework, by training the feature inference autoencoder and the label propagation autoencoder alternately. (iii) VGAELDA provides a useful solution to the geometric matrix completion problem via deep learning, because autoencoders tend to minimize the rank of their outputs, and we suggest that manifold regularization can be obtained via the alternate training of two graph autoencoders. (iv) VGAELDA implements an efficient way to integrate information from the lncRNA space and the disease space.
Experiments illustrate that VGAELDA is superior to the current state-of-the-art methods, and case studies on several diseases illustrate the capability of VGAELDA to detect new lncRNA-disease associations.
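For readers unfamiliar with the geometric matrix completion problem mentioned in advantage (iii), one commonly used formulation is sketched below. The symbols are assumptions introduced here for illustration and may differ from the objective stated in the paper: $Y$ is the observed lncRNA-disease association matrix, $\Omega$ the set of observed entries, $P_\Omega$ the projection onto those entries, $L_r$ and $L_c$ the graph Laplacians of the row (lncRNA) and column (disease) graphs, and $\mu$, $\gamma_r$, $\gamma_c$ non-negative trade-off weights.

```latex
\min_{X} \; \|X\|_{*}
  + \frac{\mu}{2} \left\| P_{\Omega}(X - Y) \right\|_F^2
  + \gamma_r \operatorname{tr}\!\left( X^{\top} L_r X \right)
  + \gamma_c \operatorname{tr}\!\left( X L_c X^{\top} \right)
```

The nuclear norm $\|X\|_{*}$ is the usual convex surrogate for rank, which corresponds to the low-rank tendency of autoencoder outputs noted above. The two trace terms are the manifold regularizers: they penalize predictions that vary sharply between neighboring lncRNAs or neighboring diseases.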

Summary

Introduction

LncRNAs are RNAs longer than 200 nucleotides that lack protein-coding function, yet they can still influence a series of biological processes, such as gene transcription, cell apoptosis, hormonal regulation, and immune response. LncRNAs are closely linked to many human diseases [1,2,3]. It is essential to predict potential lncRNA-disease associations for disease prevention, detection, diagnosis and treatment. Only a small number of lncRNA-disease associations have been discovered so far, so computational approaches that can predict further potential associations are highly desirable. Computational methods, especially machine learning algorithms, are more time-efficient and cost-effective than experimental methods for detecting potential lncRNA-disease associations. Dozens of machine learning and deep learning algorithms have been applied to this problem, yet it remains challenging to learn efficient low-dimensional representations from high-dimensional features of lncRNAs and diseases in order to predict unknown lncRNA-disease associations accurately.

Methods

Results

Discussion

Conclusion

Journal: BMC Bioinformatics
Publication Date: Mar 21, 2021
Citations: 60
License type: open-access
