Abstract

The availability of an increased number of fully sequenced genomes demands functional interpretation of the genomic information. Despite high throughput experimental techniques and in silico methods of predicting protein-protein interaction (PPI); the interactome of most organisms is far from completion. Thus, predicting the interactome of an organism is one of the major challenges in the post-genomic era. This manuscript describes Support Vector Machine (SVM) based models that have been developed for discriminating interacting and non-interacting pairs of proteins from their amino acid sequence. We have developed SVM models using various types of sequence compositions e.g. amino acid, dipeptide, biochemical property, split amino acid and pseudo amino acid composition. We also developed SVM models using evolutionary information in the form of Position Specific Scoring Matrix (PSSM) composition. We achieved maximum Matthews's correlation coefficient (MCC) of 1.00, 0.52 and 0.74 for Escherichia coli, Saccharomyces cerevisiae, and Helicobacter pylori, using dipeptide based SVM model at default threshold. It was observed that the performance of a prediction model depends on the dataset used for training and testing. In case of E. coli MCC decreased from 1.0 to 0.67 when evaluated on a new dataset. In order to understand PPI in different cellular environment, we developed species-specific and general models. It was observed that species-specific models are more accurate than general models. We conclude that the primary amino acid sequence based descriptors could be used to differentiate interacting from non-interacting protein pairs. Some amino acids tend to be favored in interacting pairs than non-interacting ones. Finally, a web server has been developed for predicting protein-protein interactions.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.