Abstract

Background: Thousands of biological and biomedical investigators study the functional role of single genes and their protein products in normal physiology and in disease. The findings from these studies are reported in research articles that stimulate new research. It is now established that a complex regulatory network controls human cellular fate, and this community of researchers is continually unraveling the topology of this network. Attempts to integrate this accumulated knowledge have resulted in literature-based protein-protein interaction networks (PPINs) and pathway databases. These databases are widely used by the community to analyze new data collected from emerging genome-wide studies, with the assumption that the data within these literature-based databases are the ground truth and contain no biases. While suspicion of research focus biases is growing, concrete proof of them is still missing. Such biases are difficult to prove because the real PPINs are mostly unknown.

Results: Here we analyzed the longitudinal discovery process of literature-based mammalian and yeast PPINs and observed that these networks are discovered non-uniformly. The pattern of discovery is related to a theoretical concept proposed by Kauffman called "expanding the adjacent possible". We introduce a network discovery model that explicitly includes the space of possibilities in the form of a true underlying PPIN.

Conclusions: Our model strongly suggests that research focus biases exist in the observed discovery dynamics of these networks. In summary, more care should be taken when using PPIN databases to analyze newly acquired data, and when relying on prior knowledge to design new experiments.

Electronic supplementary material: The online version of this article (doi:10.1186/s12918-015-0173-z) contains supplementary material, which is available to authorized users.

Highlights

  • Thousands of biological and biomedical investigators study the functional role of single genes and their protein products in normal physiology and in disease

  • To eliminate extrinsic factors, such as the changing pace of scientific discovery, while retaining the intrinsic properties of the protein-protein interaction network (PPIN) discovery process, we converted the real-time discovery of each protein-protein interaction (PPI) to a time-ranked order

  • We begin with a random uniform exploration process, and by modulating the probability of discovering links based on the already discovered network, we study the effect research focus biases can have on the dynamics of the network discovery process
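
The two steps described in the highlights above, time-ranking the discoveries and then modulating a uniform exploration process by the already discovered network, can be illustrated with a minimal Python sketch. It is not the authors' model: the function names, the `bias` parameter, and the linear weighting by discovered degree are all assumptions chosen for illustration, and the "true" network is simply whatever edge list the caller supplies.

```python
import random


def time_ranked_order(dated_interactions):
    """Replace discovery dates with ranks (1 = earliest reported).

    `dated_interactions` is a list of ((protein_a, protein_b), date) pairs.
    Ties keep their input order because Python's sort is stable.
    """
    ordered = sorted(dated_interactions, key=lambda item: item[1])
    return [(rank, edge) for rank, (edge, _date) in enumerate(ordered, start=1)]


def simulate_discovery(true_edges, bias=0.0, seed=0):
    """Discover every edge of a hypothetical true network, one per step.

    bias == 0 is uniform random exploration. bias > 0 up-weights edges that
    touch proteins already present in the discovered network. The linear
    weighting used here is an assumption, not the published model.
    """
    rng = random.Random(seed)
    remaining = list(true_edges)
    discovered_degree = {}  # protein -> number of edges discovered so far
    order = []
    while remaining:
        weights = [
            1.0 + bias * (discovered_degree.get(a, 0) + discovered_degree.get(b, 0))
            for a, b in remaining
        ]
        index = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        a, b = remaining.pop(index)
        discovered_degree[a] = discovered_degree.get(a, 0) + 1
        discovered_degree[b] = discovered_degree.get(b, 0) + 1
        order.append((a, b))
    return order
```

Running `simulate_discovery` on the same edge list with `bias=0.0` and with a larger `bias` gives two discovery orders whose dynamics can be compared; `time_ranked_order` puts curated, dated interactions on the same rank scale.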


Summary

Introduction

Thousands of biological and biomedical investigators study the functional role of single genes and their protein products in normal physiology and in disease. The findings from these studies are reported in research articles that stimulate new research. Attempts to integrate this accumulated knowledge have resulted in literature-based protein-protein interaction networks (PPINs) and pathway databases. These databases are widely used by the community to analyze new data collected from emerging genome-wide studies, with the assumption that the data within these literature-based databases are the ground truth and contain no biases.

