Abstract

BackgroundProtein-protein interactions play essential roles in protein function determination and drug design. Numerous methods have been proposed to recognize their interaction sites, however, only a small proportion of protein complexes have been successfully resolved due to the high cost. Therefore, it is important to improve the performance for predicting protein interaction sites based on primary sequence alone.ResultsWe propose a new idea to construct an integrative profile for each residue in a protein by combining its hydrophobic and evolutionary information. A support vector machine (SVM) ensemble is then developed, where SVMs train on different pairs of positive (interface sites) and negative (non-interface sites) subsets. The subsets having roughly the same sizes are grouped in the order of accessible surface area change before and after complexation. A self-organizing map (SOM) technique is applied to group similar input vectors to make more accurate the identification of interface residues. An ensemble of ten-SVMs achieves an MCC improvement by around 8% and F1 improvement by around 9% over that of three-SVMs. As expected, SVM ensembles constantly perform better than individual SVMs. In addition, the model by the integrative profiles outperforms that based on the sequence profile or the hydropathy scale alone. As our method uses a small number of features to encode the input vectors, our model is simpler, faster and more accurate than the existing methods.ConclusionsThe integrative profile by combining hydrophobic and evolutionary information contributes most to the protein-protein interaction prediction. Results show that evolutionary context of residue with respect to hydrophobicity makes better the identification of protein interface residues. In addition, the ensemble of SVM classifiers improves the prediction performance.AvailabilityDatasets and software are available at http://mail.ustc.edu.cn/~bigeagle/BMCBioinfo2010/index.htm.

Highlights

  • Protein-protein interactions play essential roles in protein function determination and drug design

  • The pioneering work by Kini and Evans addressed the issue of protein interaction site prediction by a unique predictive method based on the observation that “proline” is the most common residue found in the flanking segments of interaction sites [6]

  • To compare with the current state of the art of protein-protein interaction prediction, we tested on the same dataset and adopted the same definition of interface residues as literature [31], where the dataset was from literature [15]

Read more

Summary

Introduction

Protein-protein interactions play essential roles in protein function determination and drug design. Proteins interact with other proteins in order to perform specific biological functions, such as signal transduction or immunological recognition, DNA replication and gene translation, as well as protein synthesis [1]. These interactions are localized to the so-called “interaction sites” or “interface residues”. Identification of these residues will allow us to understand how proteins recognize other molecules and to gain clues into their possible functions at the level of the cell and at the organism. Other works have addressed various aspects of protein structure and behavior, such as detecting patch analysis [9], solventaccessible surface area buried upon association [10], free energy changes upon alanine-scanning mutations [11], in silico two hybrid systems [12], sequence or structure conservation information [13,14,15,16,17], and sequence hydrophobicity distribution [18]

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.