Abstract

Background: Present protein interaction network data sets include only interactions among subsets of the proteins in an organism. Previously this has been ignored, but in principle any global network analysis that only looks at partial data may be biased. Here we demonstrate the need to consider network sampling properties explicitly and from the outset in any analysis.

Results: Here we study how properties of the yeast protein interaction network are affected by random and non-random sampling schemes using a range of different network statistics. Effects are shown to be independent of the inherent noise in protein interaction data. The effects of the incomplete nature of network data become very noticeable, especially for so-called network motifs. We also consider the effect of incomplete network data on functional and evolutionary inferences.

Conclusion: Crucially, when only small, partial network data sets are considered, bias is virtually inevitable. Given the scope of effects considered here, previous analyses may have to be carefully reassessed: ignoring the fact that present network data are incomplete will severely affect our ability to understand biological systems.

Highlights

  • Present protein interaction network data sets include only interactions among subsets of the proteins in an organism

  • For important model organisms such as Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster there are extensive protein interaction data deposited in public-domain databases and serious attempts are being made at elucidating the human protein interaction network (PIN) [2,3]

  • Considerable effort has been invested in understanding, for example, the functional organization and evolutionary properties of PINs, and contradictory results have been reported in the literature which are probably affected by many factors in addition to incomplete data

Introduction

Present protein interaction network data sets include only interactions among subsets of the proteins in an organism. For important model organisms such as Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster there are extensive protein interaction data deposited in public-domain databases and serious attempts are being made at elucidating the human protein interaction network (PIN) [2,3]. These network data sets – extensive though they are thanks to experimental advances and in silico prediction – do not cover the entire network. Despite the noise in the present yeast PIN, the S. cerevisiae data will give us a more realistic representation of a true PIN than theoretical network models.
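To make the sampling issue concrete, the following is a minimal sketch of uniform random node sampling, one of the simplest random sampling schemes, applied to a toy network. It is not the authors' procedure and does not use yeast data: the random graph stands in for a PIN, triangles stand in for network motifs, and all function names (random_graph, induced_subgraph, sample_nodes, count_triangles) are hypothetical helpers introduced here for illustration only.

```python
import random
from itertools import combinations


def random_graph(n, p, rng):
    """Toy undirected random graph as an adjacency dict (assumption: a stand-in, not real PIN data)."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def induced_subgraph(adj, nodes):
    """Keep only the sampled nodes and the edges between pairs of sampled nodes."""
    keep = set(nodes)
    return {v: adj[v] & keep for v in keep}


def sample_nodes(adj, fraction, rng):
    """Uniform random node sample covering roughly `fraction` of the network."""
    k = max(1, round(fraction * len(adj)))
    return rng.sample(sorted(adj), k)


def mean_degree(adj):
    """Average number of neighbours per node."""
    return sum(len(nb) for nb in adj.values()) / len(adj) if adj else 0.0


def count_triangles(adj):
    """Count each triangle once by enforcing the ordering u < v < w."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if v > u:
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    full = random_graph(300, 0.05, rng)
    for fraction in (1.0, 0.5, 0.2):
        sub = induced_subgraph(full, sample_nodes(full, fraction, rng))
        print(fraction, len(sub), round(mean_degree(sub), 2), count_triangles(sub))
```

Under this uniform scheme an edge survives only if both of its endpoints are sampled and a triangle only if all three are, so statistics built on larger subgraphs are expected to shrink faster than the mean degree as the sampled fraction falls. Non-random sampling schemes would need a different sample_nodes and are not covered by this sketch.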
