Abstract

Protein–protein interactions (PPIs) are essential for most biological processes. However, current PPI networks are noisy, sparse and incomplete, which limits our ability to understand the cell at the system level from the PPI network. Predicting novel (missing) links in noisy PPI networks is an essential computational approach for automatically expanding the human interactome and for identifying biologically legitimate but undetected interactions as candidates for experimental determination of PPIs, which is both expensive and time-consuming. Recently, graph convolutional networks (GCNs), which employ a 1-hop neighborhood aggregation procedure, have proven effective in modeling graph-structured data and have emerged as a powerful architecture for learning node or graph representations. In this paper, we propose a novel node (protein) embedding method that combines GCN and PageRank. PageRank can significantly improve the GCN’s aggregation scheme, which on its own has difficulty extending to and exploring the topological information in the higher-order neighborhoods of each node. Building on this node embedding model, we develop a higher-order GCN variational auto-encoder (HO-VGAE) architecture, which learns a joint node representation of higher-order local and global PPI network topology for novel protein interaction prediction. Notably, our method is based exclusively on network topology, with no protein attributes or extra biological features used. Extensive computational validation on the PPI prediction task demonstrates that our method, without leveraging any additional biological information, shows competitive performance, outperforming all existing graph embedding-based link prediction methods in both accuracy and robustness.

Highlights

  • Protein-protein interactions (PPIs) are crucial in almost every process in a cell

  • We propose a novel node embedding method that combines graph convolutional networks (GCN) and PageRank. PageRank can significantly improve the GCN’s aggregation scheme, which on its own has difficulty extending to and exploring topological information across the higher-order neighborhoods of each node. Building on this node embedding model, we develop an adaptation of the variational graph auto-encoder (VGAE) [29], called HO-VGAE, for novel PPI prediction. It relies only on network topology and uses no protein attributes or extra biological information

  • We present a graph embedding-based computational method that can effectively predict missing links in noisy and incomplete PPI networks, with no additional biological information involved
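The contrast between 1-hop GCN aggregation and PageRank-based higher-order aggregation can be illustrated with a minimal NumPy sketch. It is not the HO-VGAE model itself: it uses the standard closed-form personalized PageRank matrix as one plausible way to combine the two ideas, and the toy graph, the function names and the teleport probability `alpha=0.15` are all assumptions made for illustration.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1)                  # node degrees (including self-loop)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def ppr_matrix(adj, alpha=0.15):
    """Closed-form personalized PageRank: alpha * (I - (1 - alpha) * A_norm)^-1.

    alpha is an assumed teleport probability, not a value from the paper.
    """
    n = adj.shape[0]
    a_norm = normalized_adjacency(adj)
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * a_norm)

# Toy path graph 0 - 1 - 2 - 3 (hypothetical, stands in for a PPI network).
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Topology-only setting: identity features, no protein attributes.
features = np.eye(4)

one_hop = normalized_adjacency(adj) @ features   # single GCN-style aggregation
higher_order = ppr_matrix(adj) @ features        # PageRank-weighted aggregation
```

In this sketch, `one_hop[0, 3]` is zero because nodes 0 and 3 are three hops apart, while `higher_order[0, 3]` is positive: the PageRank weighting lets information from distant nodes reach each node in a single propagation step.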

Introduction

Protein-protein interactions (PPIs) are crucial in almost every process in a cell. Understanding PPIs is essential for distinguishing normal from diseased physiological states of the cell. Knowledge of PPIs can significantly facilitate function prediction for uncharacterized proteins and drug design [1, 2]. The totality of PPIs in a cell or an organism is usually represented as a PPI network.
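As a small illustration of the network representation, the sketch below builds an undirected adjacency matrix from a list of interacting protein pairs. The protein identifiers `P1` to `P4` and the interactions between them are hypothetical placeholders, not real data.

```python
import numpy as np

# Hypothetical interaction list; each pair is one undirected PPI.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]

# Map each protein to a row/column index.
proteins = sorted({p for pair in edges for p in pair})
index = {p: i for i, p in enumerate(proteins)}

# Symmetric 0/1 adjacency matrix: entry (i, j) is 1 if proteins i and j interact.
adj = np.zeros((len(proteins), len(proteins)))
for a, b in edges:
    adj[index[a], index[b]] = 1.0
    adj[index[b], index[a]] = 1.0

# Degree of each protein, i.e. its number of known interaction partners.
degree = adj.sum(axis=1)
```

Predicting a missing interaction then amounts to scoring the zero entries of `adj` and ranking them as candidate links.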
