Abstract

Protein–protein interactions (PPIs) are key to protein function and regulation in the cell cycle, DNA replication, and cellular signaling. Detecting whether a pair of proteins interacts is therefore of great importance in molecular biology. As researchers have recognized the value of computational methods for predicting PPIs, many such techniques have been developed. However, few of them fully meet the needs of their users. In this paper, we develop a novel and efficient sequence-based method for predicting PPIs. Evolutionary features are extracted from the position-specific scoring matrix (PSSM) of each protein. The features are then fed into a robust relevance vector machine (RVM) classifier to distinguish interacting from non-interacting protein pairs. To verify the performance of our method, five-fold cross-validation tests were performed on the Saccharomyces cerevisiae dataset, yielding an accuracy of 94.56%, with 94.79% sensitivity at 94.36% precision. The experimental results show that the proposed approach can extract the most significant features from each protein sequence and can be a promising and useful tool for proteomics research.
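The evaluation protocol named in the abstract (five-fold cross-validation, reported as accuracy, sensitivity, and precision) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the contiguous fold split, and the toy labels are assumptions made for illustration, and the classifier itself (the RVM) is left out.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # interacting pairs predicted as interacting
    fp: int  # non-interacting pairs predicted as interacting
    tn: int  # non-interacting pairs predicted as non-interacting
    fn: int  # interacting pairs predicted as non-interacting


def confusion(y_true, y_pred):
    """Count outcomes for binary labels (1 = interacting, 0 = non-interacting)."""
    c = Confusion(0, 0, 0, 0)
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            c.tp += 1
        elif t == 0 and p == 1:
            c.fp += 1
        elif t == 0 and p == 0:
            c.tn += 1
        else:
            c.fn += 1
    return c


def accuracy(c):
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c):
    # Fraction of truly interacting pairs that were recovered.
    return c.tp / (c.tp + c.fn)


def precision(c):
    # Fraction of predicted interactions that are truly interacting.
    return c.tp / (c.tp + c.fp)


def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size.

    Hypothetical helper: in practice the samples would be shuffled first.
    """
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Toy labels, not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
c = confusion(y_true, y_pred)
print(accuracy(c), sensitivity(c), precision(c))
```

In a five-fold run, each fold from `k_fold_indices` would serve once as the test set while the other four train the classifier, and the three metrics would be averaged over the five test folds.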

Highlights

  • Protein–protein interactions (PPIs) are a key step in realizing protein function during cell cycle progression, DNA replication, and signal transmission [1,2,3]

  • PPI datasets have been stored in a number of constructed databases, such as the Molecular Interaction database (MINT), the Database of Interacting Proteins (DIP), and the Biomolecular Interaction Network Database (BIND) [8,9,10]

  • Given these difficulties, how can the evolutionary information of proteins be used to predict PPIs efficiently? To address this problem, we propose a novel scheme that uses a position-specific scoring matrix (PSSM) to translate each protein sequence into a matrix that captures both evolutionary information and amino acid composition
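To make the PSSM step concrete: a PSSM has one row per residue and one column per amino acid type (20 columns in practice), so proteins of different lengths yield matrices of different heights, while a classifier needs a fixed-length input. This page does not say how the matrix is reduced to a feature vector, so the sketch below uses column-wise means purely as a stand-in assumption. The function names are hypothetical and the tiny two-column matrices are toy values, not real PSSMs.

```python
def pssm_column_means(pssm):
    """Reduce a PSSM (list of rows, one per residue) to one mean per column.

    Assumption for illustration only: the actual feature extraction used
    in the paper is not described in this summary.
    """
    if not pssm:
        raise ValueError("PSSM must contain at least one row")
    n_cols = len(pssm[0])
    if any(len(row) != n_cols for row in pssm):
        raise ValueError("all PSSM rows must have the same number of columns")
    return [sum(row[j] for row in pssm) / len(pssm) for j in range(n_cols)]


def pair_features(pssm_a, pssm_b):
    """Concatenate the two proteins' vectors into one sample for a protein pair."""
    return pssm_column_means(pssm_a) + pssm_column_means(pssm_b)


# Toy matrices with 2 columns instead of 20, and different lengths.
pssm_a = [[1, 2], [3, 4]]
pssm_b = [[0, 0], [2, 2], [4, 4]]
print(pair_features(pssm_a, pssm_b))
```

Whatever reduction is used, the point is the same: each protein pair becomes one fixed-length vector that can be labeled as interacting or non-interacting and passed to the classifier.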


Summary

Introduction

Protein–protein interactions (PPIs) are a key step in realizing protein function during cell cycle progression, DNA replication, and signal transmission [1,2,3]. PPI datasets are stored in a number of dedicated databases, such as the Molecular Interaction database (MINT), the Database of Interacting Proteins (DIP), and the Biomolecular Interaction Network Database (BIND) [8,9,10]. However, the PPIs validated by experimental methods represent only a small portion of the entire PPI network, and experimental methods are usually associated with high rates of both false negatives and false positives. These drawbacks encourage further research into computational approaches for identifying PPIs.

