Abstract
Identifying protein-protein interactions (PPIs) is essential for understanding protein functions and cellular biological activities. However, traditional experiment-based methods are time-consuming and laborious, so developing reliable computational approaches is of great practical significance for PPI identification. In this paper, we propose PPI-GE, a novel method that predicts PPIs using graph energy. For feature extraction, we designed two new methods: the physicochemical graph energy, based on the ionization equilibrium constant and isoelectric point, and the contact graph energy, based on the contact information of amino acids. The dipeptide composition method was used to capture the order information of amino acids. After multi-information fusion, principal component analysis (PCA) was applied to eliminate noise, and a robust weighted sparse representation-based classification (WSRC) classifier was used to classify samples. Under five-fold cross-validation, the prediction accuracies on the human, Helicobacter pylori (H. pylori), and yeast data sets were 99.49%, 97.15%, and 99.56%, respectively. In addition, comparative experiments on five independent data sets and two significant PPI networks demonstrate that PPI-GE performed better than the compared methods.
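As an illustration of the dipeptide composition feature mentioned in the abstract, the following is a minimal sketch of the standard 400-dimensional encoding, in which each entry is the relative frequency of one ordered pair of adjacent amino acids. The function name `dipeptide_composition`, the residue ordering, and the choice to skip pairs containing non-standard residues are assumptions made for this example; the abstract does not specify how PPI-GE handles these details.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes); this ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of adjacent-pair frequencies.

    Pairs containing a non-standard residue (e.g. 'X') are skipped,
    which is an assumption of this sketch.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no valid pairs): return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    # "ACAC" contains the pairs AC, CA, AC.
    print(features[DIPEPTIDES.index("AC")], features[DIPEPTIDES.index("CA")])
```

Because each entry depends on two neighbouring residues, the vector retains some local order information that a plain amino acid composition would lose.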
Highlights
Protein-protein interaction (PPI) plays a distinctly important role in understanding cellular biological activities [1]
To ensure the reliability of experimental results and avoid overfitting, we implemented five-fold cross-validation to evaluate the effectiveness of PPI-GE and other computational models
We introduce graph energy to encode protein sequences and present a novel prediction method called PPI-GE for predicting PPIs using amino acid sequences alone
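For readers unfamiliar with graph energy, the sketch below computes it under the usual definition: the sum of the absolute values of the eigenvalues of a graph's adjacency matrix. How PPI-GE builds its physicochemical and contact graphs from a protein sequence is not described here, so the toy graphs, the function names `symmetric_eigenvalues` and `graph_energy`, and the pure-Python Jacobi eigenvalue routine are all assumptions made for illustration. In practice a library routine such as `numpy.linalg.eigvalsh` would normally replace the hand-written solver.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    for _ in range(max_sweeps):
        # Stop once the off-diagonal part is numerically zero.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                # Rotation angle chosen so that the (p, q) entry becomes zero.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (
                    abs(theta) + math.sqrt(theta * theta + 1.0)
                )
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                # Apply the rotation to columns p and q ...
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # ... and then to rows p and q, keeping the matrix symmetric.
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def graph_energy(adjacency):
    """Sum of absolute eigenvalues of a symmetric adjacency matrix."""
    return sum(abs(ev) for ev in symmetric_eigenvalues(adjacency))


if __name__ == "__main__":
    # Toy example: the triangle graph K3 has eigenvalues 2, -1, -1.
    triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    print(graph_energy(triangle))
```

A weighted adjacency matrix can be passed in the same way, as long as it is symmetric; the Jacobi routine does not assume 0/1 entries.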
Summary
Protein-protein interaction (PPI) plays a distinctly important role in understanding cellular biological activities [1]. Research on PPIs contributes to the understanding of protein function and the mechanisms of biological activity, and supports disease diagnosis and prevention and new drug development [2,3,4]. Research methods for PPIs fall into two types: computational and experimental. Over the past few decades, many innovative experimental technologies have been designed to validate PPIs, such as glutathione S-transferase [5], protein chip [6], yeast two-hybrid [7], tandem affinity purification (TAP) tag [8], and other high-throughput technologies. Developing new reliable computational approaches has great practical significance for PPI identification at low cost and high efficiency [10].