Completing sparse and disconnected protein-protein network by deep learning

Lei Huang,Li Liao,Cathy H Wu

doi:10.1186/s12859-018-2112-7


Open Access

https://doi.org/10.1186/s12859-018-2112-7


Journal: BMC Bioinformatics
Publication Date: Mar 22, 2018
Citations: 19
License type: open-access

Affiliation: University of Delaware

Abstract

Background: Protein-protein interaction (PPI) prediction remains a central task in systems biology for achieving a better and more holistic understanding of cellular and intracellular processes. Recently, an increasing number of computational methods have shifted from pair-wise prediction to network-level prediction. Many of the existing network-level methods predict PPIs under the assumption that the training network is connected. However, this assumption greatly reduces prediction power and limits the range of applications, because the current gold-standard PPI networks are usually very sparse and disconnected. Therefore, how to effectively predict PPIs from a training network that is sparse and disconnected remains a challenge.

Results: In this work, we developed a novel PPI prediction method based on a deep neural network and the regularized Laplacian kernel. We use a neural network with an autoencoder-like architecture to implicitly simulate the evolutionary processes of a PPI network. Neurons of the output layer correspond to proteins and are labeled with values (1 for interaction and 0 otherwise) from the adjacency matrix of a sparse, disconnected training PPI network. Unlike in an autoencoder, neurons at the input layer are given all-zero input, reflecting an assumption of no a priori knowledge about PPIs, and hidden layers of smaller sizes mimic the ancient interactome at different times during evolution. After the training step, an evolved PPI network whose rows are outputs of the neural network can be obtained. We then predict PPIs by applying the regularized Laplacian kernel to the transition matrix that is built upon the evolved PPI network. The results from cross-validation experiments show that the PPI prediction accuracies for yeast data and human data, measured as AUC, are increased by up to 8.4% and 14.9%, respectively, as compared to the baseline. Moreover, the evolved PPI network can also help us leverage complementary information from the disconnected training network and multiple heterogeneous data sources. Tested on the yeast data with six heterogeneous feature kernels, our method further improves the prediction performance by up to 2%, which is very close to an upper bound obtained by an Approximate Bayesian Computation based sampling method.

Conclusions: The proposed evolution deep neural network, coupled with the regularized Laplacian kernel, is an effective tool for completing sparse and disconnected PPI networks and for facilitating the integration of heterogeneous data sources.
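To make the role of the regularized Laplacian (RL) kernel more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used closed form RL = (I + alpha * L)^-1 with graph Laplacian L = D - A, applied directly to a toy symmetric adjacency matrix rather than to the transition matrix described above. The function name `regularized_laplacian`, the value `alpha=0.1`, and the five-node network are all illustrative assumptions. The sketch shows why a disconnected training network is limiting: kernel scores between proteins in different components are zero.

```python
import numpy as np

def regularized_laplacian(adj, alpha=0.1):
    """Return (I + alpha * L)^-1, where L = D - A is the graph Laplacian.

    Assumed closed form of the RL kernel; `alpha` is an illustrative value.
    """
    adj = np.asarray(adj, dtype=float)
    degree = np.diag(adj.sum(axis=1))      # diagonal degree matrix D
    laplacian = degree - adj               # L = D - A
    identity = np.eye(adj.shape[0])
    return np.linalg.inv(identity + alpha * laplacian)

# Toy training network with two components: {0, 1, 2} and {3, 4}.
adj = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (3, 4)]:
    adj[i, j] = adj[j, i] = 1.0

rl = regularized_laplacian(adj, alpha=0.1)

within = rl[0, 2]   # proteins 0 and 2: not adjacent, but in the same component
across = rl[0, 3]   # proteins 0 and 3: in different components

print(f"score within component: {within:.6f}")
print(f"score across components: {across:.6f}")
```

In this toy case the non-adjacent pair (0, 2) receives a positive score because a path connects the two proteins, while the pair (0, 3) receives a score of zero. A kernel computed on the disconnected network alone therefore cannot rank any cross-component pair, which is the gap the evolved network is meant to fill.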

Highlights

Protein-protein interaction (PPI) prediction remains a central task in systems biology for achieving a better and more holistic understanding of cellular and intracellular processes.
In this work we developed a novel method based on a deep neural network and the regularized Laplacian kernel to predict de novo interactions for sparse and disconnected PPI networks.
We built the neural network with a typical auto-encoder structure to implicitly simulate the evolutionary processes of PPI networks.

Summary

Introduction

Protein-protein interaction (PPI) prediction remains a central task in systems biology for achieving a better and more holistic understanding of cellular and intracellular processes. To circumvent the limitations of pair-wise biological similarity, network-structure-based methods are playing an increasing role in PPI prediction: they take the whole network structure into account, implicitly include topological similarities, and can use pair-wise biological similarities as weights for the edges in the network. Along this line, variants of random walk [12,13,14,15] have been developed. Huang et al. proposed a sampling method [21] and a linear programming method [22] to find optimal weights for multiple heterogeneous data sources, thereby building a weighted kernel fusion for all node pairs. These methods applied the regularized Laplacian (RL) kernel to the weighted kernel fusion to infer missing or new edges in the PPI network. Using only small training networks, they improved PPI prediction performance, especially for detecting interactions between nodes that are far apart in the training network.
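The idea of weighted kernel fusion followed by the RL kernel can be sketched as follows. This is a hedged illustration, not the sampling method [21] or the linear programming method [22]: the kernel weights are fixed by hand instead of being optimized, the two feature kernels are random toy matrices, and adding the fused kernel to the training adjacency matrix is an assumption made for this example. The names `fuse_kernels` and `regularized_laplacian` are hypothetical helpers.

```python
import numpy as np

def regularized_laplacian(weights, alpha=0.1):
    """Assumed RL form: (I + alpha * L)^-1 with L = D - W for symmetric W."""
    weights = np.asarray(weights, dtype=float)
    laplacian = np.diag(weights.sum(axis=1)) - weights
    return np.linalg.inv(np.eye(weights.shape[0]) + alpha * laplacian)

def fuse_kernels(kernels, kernel_weights):
    """Weighted sum of same-shape feature kernels, with self-similarity removed."""
    fused = sum(w * np.asarray(k, dtype=float)
                for w, k in zip(kernel_weights, kernels))
    np.fill_diagonal(fused, 0.0)           # drop diagonal entries (no self-loops)
    return fused

# Toy disconnected training network: components {0, 1, 2} and {3, 4}.
train_adj = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (3, 4)]:
    train_adj[i, j] = train_adj[j, i] = 1.0

# Two toy feature kernels (symmetric, non-negative), standing in for
# heterogeneous data sources such as sequence or expression similarity.
rng = np.random.default_rng(0)
feat_a = rng.random((5, 3))
feat_b = rng.random((5, 4))
kernels = [feat_a @ feat_a.T, feat_b @ feat_b.T]

# Hand-picked weights; the cited methods would search for these instead.
kernel_weights = [0.7, 0.3]

fused = fuse_kernels(kernels, kernel_weights)
combined = train_adj + fused               # assumption: fusion augments the training edges

rl_train = regularized_laplacian(train_adj)
rl_fused = regularized_laplacian(combined)

print("cross-component score, training network only:", rl_train[0, 3])
print("cross-component score, with kernel fusion:   ", rl_fused[0, 3])
```

With the training network alone, the score for the cross-component pair (0, 3) is zero. Once the fused feature kernels supply weighted edges between all node pairs, the same pair receives a positive score, so interactions between proteins that are far apart or unconnected in the training network become rankable.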



