Abstract
Protein–protein interactions are closely tied to protein function and drug discovery, so identifying them accurately helps to clarify the underlying molecular mechanisms and can significantly facilitate drug discovery. However, most existing computational methods for protein–protein interaction prediction focus on feature extraction and feature combination, and state-of-the-art models have delivered only limited gains. In this work, we design a new residue representation method, Res2vec, for representing protein sequences. The residue representations obtained by Res2vec describe residue-residue interactions from the raw sequence more precisely and supply more effective inputs to the downstream deep learning model. By combining effective feature embedding with powerful deep learning techniques, our method provides a general computational pipeline for inferring protein–protein interactions, even when no protein structure information is available. The proposed method, DeepFE-PPI, is evaluated on the S. cerevisiae and human datasets. On the S. cerevisiae dataset, DeepFE-PPI achieves 94.78% accuracy, 92.99% recall, 96.45% precision and 89.62% Matthews correlation coefficient (MCC); on the human dataset, it achieves 98.71% accuracy, 98.54% recall, 98.77% precision and 97.43% MCC. We also evaluate DeepFE-PPI on five independent species datasets, where all of its results are superior to those of existing methods. These comparisons show that DeepFE-PPI, which pairs a novel residue representation method with a deep learning classification framework, can predict protein–protein interactions with an acceptable level of accuracy. The code and instructions to reproduce this work are available from https://github.com/xal2019/DeepFE-PPI.
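As a reading aid for the metrics quoted above, the following minimal sketch shows how accuracy, recall, precision and MCC are conventionally computed from the four confusion-matrix counts of a binary interaction classifier. The function name `classification_metrics` and the example counts are hypothetical and chosen for illustration only; they are not taken from the DeepFE-PPI code or its results.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion counts.

    tp: interacting pairs predicted as interacting
    fp: non-interacting pairs predicted as interacting
    tn: non-interacting pairs predicted as non-interacting
    fn: interacting pairs predicted as non-interacting
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # Matthews correlation coefficient; by common convention it is
    # reported as 0 when any marginal sum in the denominator is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for a balanced test set of 200 protein pairs.
    metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four counts in both numerator and denominator, which is why it is often reported alongside accuracy when positive and negative pairs may be imbalanced.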
Highlights
Most biological processes within a cell are induced by a variety of interactions among proteins, and protein–protein interactions form the basis for understanding protein function, communication and regulation
Protein–protein interaction prediction is important for understanding the activity of complex cells from a molecular point of view
We propose a new protein sequence representation method combined with an effective deep learning framework to predict protein–protein interactions
Summary
Most biological processes within a cell are induced by a variety of interactions among proteins, and protein–protein interactions form the basis for understanding protein function, communication and regulation. Hosur et al. (2012) introduced a structure-based algorithm that computes a single confidence score to infer thousands of binary protein interactions. Zhang et al. (2012) proposed PrePPI, a method that combines three-dimensional structural information with other functional clues to detect protein–protein interactions with superior accuracy and coverage. Template-free protein–protein interaction prediction methods also exist. MEGADOCK, proposed by Ohue et al. (2014), is one such method. It is capable of exhaustive protein–protein interaction screening in less calculation time with an acceptable level of accuracy, and it obtained an F-measure of 0.231 when applied to predict 120 relevant interacting pairs from 14,400 combinations of proteins. It could be used to search and analyze protein–protein interactions when taking into account three-dimensional