Abstract

With the growth of biomedical big data, the prediction of protein-protein interactions (PPIs) with deep learning (DL) has attracted much attention in the study of intermolecular mechanisms, drug design, and human disease treatment. Because experiment-based methods can be difficult, reliable DL-based approaches are needed. In this paper, we develop EResCNN, an effective PPI predictor based on an ensemble residual convolutional neural network. First, a fused feature representation is obtained by concatenating the vectors produced by pseudo amino acid composition (PseAAC), auto covariance descriptor (AC), pseudo position-specific scoring matrix (PsePSSM), encoding based on grouped weight (EBGW), multivariate mutual information (MMI) and conjoint triad (CT). Then, high-level information is extracted through the convolution and pooling layers of the residual convolutional neural network (RCNN) via layer-by-layer learning. Finally, we ensemble RCNN, XGBoost, random forest, LightGBM and extremely randomized trees to build the EResCNN model. The predictive results indicate that EResCNN achieves strong performance, with accuracy (ACC) values of 95.34%, 87.89% and 98.61% on the S. cerevisiae, H. pylori and Human-Y. pestis datasets, respectively. We also apply EResCNN to the H. sapiens, M. musculus, C. elegans and E. coli datasets for cross-species prediction. Notably, we find that EResCNN can infer significant PPI networks on a one-core network, a Wnt-related signal pathway network, a cancer-specific network and a multi-core network, which could provide references for signal pathway research, disease-related gene mining, and the study of interaction network topology.
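The feature-fusion and ensembling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it encodes only the conjoint triad (CT) descriptor, the seven-class residue grouping is the one commonly used for CT and is assumed here, and the helper names `conjoint_triad`, `fuse_pair` and `soft_vote` are hypothetical. The abstract does not state how the base models are combined, so plain averaging of predicted probabilities stands in for the ensemble rule.

```python
from itertools import product

# Seven residue classes commonly used for conjoint triad encoding
# (assumed grouping; the abstract does not specify it).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CLASSES) for aa in group}
TRIADS = list(product(range(7), repeat=3))  # 7^3 = 343 possible class triads
TRIAD_INDEX = {t: i for i, t in enumerate(TRIADS)}


def conjoint_triad(seq):
    """Return a 343-dim, min-max normalized triad frequency vector."""
    counts = [0] * len(TRIADS)
    classes = [RESIDUE_TO_CLASS.get(aa) for aa in seq.upper()]
    # Slide a window of three consecutive residues over the sequence.
    for window in zip(classes, classes[1:], classes[2:]):
        if None in window:  # skip windows containing non-standard residues
            continue
        counts[TRIAD_INDEX[window]] += 1
    lo, hi = min(counts), max(counts)
    if hi == lo:  # no usable triads, or a perfectly flat profile
        return [0.0] * len(counts)
    return [(c - lo) / (hi - lo) for c in counts]


def fuse_pair(seq_a, seq_b, encoders):
    """Concatenate every encoder's vector for both proteins of a pair."""
    fused = []
    for seq in (seq_a, seq_b):
        for encode in encoders:
            fused.extend(encode(seq))
    return fused


def soft_vote(probabilities):
    """Average positive-class probabilities from several base models
    (an assumed stand-in for the unspecified ensemble rule)."""
    return sum(probabilities) / len(probabilities)
```

In a fuller pipeline, `encoders` would hold one function per descriptor (PseAAC, AC, PsePSSM, EBGW, MMI, CT), and the list passed to `soft_vote` would hold the probabilities predicted by RCNN, XGBoost, random forest, LightGBM and extremely randomized trees for the same protein pair.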
