Edge2vec: Representation learning using edge semantics for biomedical knowledge discovery

Zheng Gao, David Wild, Jeremy Yang, Xiaozhong Liu, Christopher Gessner, Satoshi Tsutsui, Gang Fu, Qi Yu, Brian Foote, Ying Ding, Chunping Ouyang

doi:10.1186/s12859-019-2914-2

Abstract

Background: Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology to richly heterogeneous graphs and knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and knowledge discovery in real-world biomedical problems.

Results: In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embeddings on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by incorporating edge types into node embedding learning in heterogeneous graphs, edge2vec significantly outperforms state-of-the-art models on all three tasks.

Conclusions: We propose this method for its added value relative to existing graph analytical methodology, and for its applicability in the real-world context of biomedical knowledge discovery.

Highlights

Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining knowledge graphs
Previous graph embedding models were designed for homogeneous networks, which means that they do not explicitly encode information related to the types of nodes and edges in a heterogeneous network
We develop an EM model to train a transition matrix via random walks on a heterogeneous graph as a unified framework and employ a stochastic gradient descent (SGD) method to learn node embeddings in an efficient manner
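
To make the role of an edge-type transition matrix more concrete, the following is a minimal sketch of a random walk whose next step is weighted by the type of the edge just traversed. It is an illustration under stated assumptions, not the authors' implementation: the toy graph, the matrix values, and the function name `edge_type_walk` are all hypothetical, and the matrix is fixed by hand here rather than trained by EM.

```python
import random

# Toy heterogeneous graph: node -> list of (neighbor, edge_type) pairs.
# Node names and edge types are illustrative, not taken from the paper's data.
GRAPH = {
    "geneA": [("drugX", "inhibition"), ("geneB", "co-expression")],
    "geneB": [("geneA", "co-expression"), ("diseaseY", "regulation")],
    "drugX": [("geneA", "inhibition")],
    "diseaseY": [("geneB", "regulation")],
}

# Hypothetical edge-type transition matrix: TRANSITION[prev][nxt] is the
# weight of following an edge of type `nxt` right after one of type `prev`.
# The paper learns such a matrix with EM; these values are made up.
TRANSITION = {
    "co-expression": {"co-expression": 0.2, "inhibition": 0.5, "regulation": 0.3},
    "inhibition": {"co-expression": 0.6, "inhibition": 0.1, "regulation": 0.3},
    "regulation": {"co-expression": 0.4, "inhibition": 0.4, "regulation": 0.2},
}


def edge_type_walk(graph, transition, start, length, rng):
    """Return a walk of at most `length` nodes, biased by edge-type transitions."""
    walk = [start]
    prev_type = None  # no edge has been traversed before the first step
    while len(walk) < length:
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break
        if prev_type is None:
            # First step: no previous edge type, so choose uniformly.
            weights = [1.0] * len(neighbors)
        else:
            # Later steps: weight each candidate edge by the matrix entry
            # for (previous edge type, candidate edge type).
            weights = [transition[prev_type][etype] for _, etype in neighbors]
        if sum(weights) <= 0:
            break  # every outgoing edge type is forbidden after prev_type
        node, prev_type = rng.choices(neighbors, weights=weights, k=1)[0]
        walk.append(node)
    return walk


if __name__ == "__main__":
    print(edge_type_walk(GRAPH, TRANSITION, "geneA", 6, random.Random(0)))
```

Walks produced this way could then serve as training sequences for a skip-gram style embedding model optimized with SGD; that step is omitted here to keep the sketch short.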

Summary

Introduction

Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining knowledge graphs. The metapath2vec approach has several drawbacks: 1) domain knowledge is required to define metapaths, and those mentioned in [7] are symmetric paths, which are unrealistic in many applications; 2) metapath2vec considers only node types, not edge types; and 3) metapath2vec can use only one metapath at a time to generate random walks, so it cannot consider all metapaths simultaneously during a walk. On another related track, which might be termed biomedical data science (BMDS), previous work has employed KG embedding and ML methodology with a focus on applicability and applications such as compound-target bioactivity [8, 9] and disease-associated gene prioritization [10].
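
The drawbacks listed above can be seen in a simplified sketch of a metapath-guided walk. This is a rough illustration of the general idea, not metapath2vec's actual code: the toy graph, the node-type labels, and the function name `metapath_walk` are hypothetical. Note that the walk is driven by a single, hand-specified, symmetric node-type pattern, and that edge types are never consulted.

```python
import random

# Illustrative node types and adjacency lists; none of this is real data.
NODE_TYPE = {"drugX": "drug", "drugZ": "drug", "geneA": "gene", "geneB": "gene"}
NEIGHBORS = {
    "drugX": ["geneA"],
    "drugZ": ["geneA"],
    "geneA": ["drugX", "drugZ", "geneB"],
    "geneB": ["geneA"],
}


def metapath_walk(neighbors, node_type, metapath, start, length, rng):
    """Walk that only visits nodes whose types follow one symmetric metapath.

    `metapath` is a node-type pattern such as ["drug", "gene", "drug"]; its
    last type equals its first, so the pattern can repeat indefinitely.
    """
    if metapath[0] != metapath[-1]:
        raise ValueError("this sketch assumes a symmetric metapath")
    if node_type[start] != metapath[0]:
        raise ValueError("start node does not match the metapath's first type")
    cycle = metapath[:-1]  # drop the repeated endpoint to get one period
    walk = [start]
    while len(walk) < length:
        wanted = cycle[len(walk) % len(cycle)]
        # Only the node type is checked; the kind of edge is ignored.
        candidates = [n for n in neighbors[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk


if __name__ == "__main__":
    path = ["drug", "gene", "drug"]
    print(metapath_walk(NEIGHBORS, NODE_TYPE, path, "drugX", 7, random.Random(0)))
```

In this toy graph the gene-gene link between `geneA` and `geneB` is never followed, because the chosen metapath has no gene-to-gene step; covering it would require defining and running a second metapath separately.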


Journal: BMC Bioinformatics
Publication Date: Jun 10, 2019
Citations: 43
License type: open-access


