Abstract

BackgroundProtein-protein interactions (PPIs) are crucial in cellular processes. Since the current biological experimental techniques are time-consuming and expensive, and the results suffer from the problems of incompleteness and noise, developing computational methods and software tools to predict PPIs is necessary. Although several approaches have been proposed, the species supported are often limited and additional data like homologous interactions in other species, protein sequence and protein expression are often required. And predictive abilities of different features for different kinds of PPI data have not been studied.ResultsIn this paper, we propose ppiPre, an open-source framework for PPI analysis and prediction using a combination of heterogeneous features including three GO-based semantic similarities, one KEGG-based co-pathway similarity and three topology-based similarities. It supports up to twenty species. Only the original PPI data and gold-standard PPI data are required from users. The experiments on binary and co-complex gold-standard yeast PPI data sets show that there exist big differences among the predictive abilities of different features on different kinds of PPI data sets. And the prediction performance on the two data sets shows that ppiPre is capable of handling PPI data in different kinds and sizes. ppiPre is implemented in the R language and is freely available on the CRAN (http://cran.r-project.org/web/packages/ppiPre/).ConclusionsWe applied our framework to both binary and co-complex gold-standard PPI data sets. The detailed analysis on three GO aspects suggests that different GO aspects should be used on different kinds of data sets, and that combining all the three aspects of GO often gets the best result. The analysis also shows that using only features based solely on the topology of the PPI network can get a very good result when predicting the co-complex PPI data. ppiPre provides useful functions for analysing PPI data and can be used to predict PPIs for multiple species.

Highlights

  • Protein-protein interactions (PPIs) are crucial in cellular processes

  • Heterogeneous features are integrated in the prediction framework of ppiPre, including three Gene Ontology (GO)-based semantic similarities, one KEGG-based similarity indicating the proteins are involved in the same pathways and three topology-based similarities using only the network structure of the PPI network

  • An open-source framework ppiPre for PPI prediction is proposed in this paper

Read more

Summary

Introduction

Protein-protein interactions (PPIs) are crucial in cellular processes. Since the current biological experimental techniques are time-consuming and expensive, and the results suffer from the problems of incompleteness and noise, developing computational methods and software tools to predict PPIs is necessary. Several approaches have been proposed, the species supported are often limited and additional data like homologous interactions in other species, protein sequence and protein expression are often required. Existing studies are mainly differing in the selection of features used in the prediction framework. In these studies, different biological evidences are extracted and used as features training the classifier, including Gene Ontology (GO) functional annotations [8,9], protein sequences [10] and co-expressed proteins [11]. Most of the frameworks only support a few well studied model organisms These frameworks often need users to provide additional biological data along with the PPIs. different species often require different features, which make these existing frameworks not very convenient to use

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.