Abstract

BackgroundProtein-protein interaction has been used to complement traditional sequence homology to elucidate protein function. Most existing approaches only make use of direct interactions to infer function, and some have studied the application of indirect interactions for functional inference but are unable to improve prediction performance. We have previously proposed an approach, FS-Weighted Averaging, which uses topological weighting and level-2 indirect interactions (protein pairs connected via two interactions) for predicting protein function from protein interactions and have found that it yields predictions with superior precision on yeast proteins over existing approaches. Here we study the use of this technique to predict functional annotations from the Gene Ontology for seven genomes: Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Rattus norvegicus, Mus musculus, and Homo sapiens.ResultsOur analysis shows that protein-protein interactions provide supplementary coverage over sequence homology in the inference of protein function and is definitely a complement to sequence homology. We also find that FS-Weighted Averaging consistently outperforms two classical approaches, Neighbor Counting and Chi-Square, across the seven genomes for all three categories of the Gene Ontology. By randomly adding and removing interactions from the interactions, we find that Weighted Averaging is also rather robust against noisy interaction data.ConclusionWe have conducted a comprehensive study over seven genomes. We conclude that FS-Weighted Averaging can effectively make use of indirect interactions to make the inference of protein functions from protein interactions more effective. Furthermore, the technique is general enough to work over a variety of genomes.

Highlights

  • Protein-protein interaction has been used to complement traditional sequence homology to elucidate protein function

  • Coverage of protein-protein interactions To appreciate the feasibility of protein-protein interactions for function discovery, we want to find out whether protein-protein interactions provide any additional coverage over sequence homology

  • We look at two well-studied genomes, S. cerevisiae and D. melanogaster, and examine: 1) how many known functions can be inferred from other proteins with sequence similarity in the genome; 2) how many more functions can be suggested from interaction partners on top of (1); and 3) how many more functions can be suggested from indirect interaction partners on top of (1) and (2)

Read more

Summary

Introduction

Protein-protein interaction has been used to complement traditional sequence homology to elucidate protein function. A simple but effective approach is to assign a protein with the function that occurs most frequently in its interaction partners [10] This is further improved in [11], which predicts function based on chi-square statistics instead of frequency. Others apply global optimization techniques, such as Markov random fields [15,16] and simulated annealing [17], to propagate predictions so that the function of proteins without characterized neighbors can be predicted. Most of these approaches use the observation that a protein often shares functions with proteins that interact with it (i.e., its level neighbors). Less than 2% of the annotated proteins share functions exclusively with level-1 neighbors

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call