Abstract

Background

Many aspects of biological functions can be modeled by biological networks, such as protein interaction networks, metabolic networks, and gene coexpression networks. Studying the statistical properties of these networks in turn allows us to infer biological function. Complex statistical network models can potentially more accurately describe the networks, but it is not clear whether such complex models are better suited to find biologically meaningful subnetworks.

Results

Recent studies have shown that the degree distribution of the nodes is not an adequate statistic in many molecular networks. We sought to extend this statistic with 2nd and 3rd order degree correlations and developed a pseudo-likelihood approach to estimate the parameters. The approach was used to analyze the MIPS and BIOGRID yeast protein interaction networks, and two yeast coexpression networks. We showed that 2nd order degree correlation information gave better predictions of gene interactions in both protein interaction and gene coexpression networks. However, in the biologically important task of predicting functionally homogeneous modules, degree correlation information performs marginally better in the case of the MIPS and BIOGRID protein interaction networks, but worse in the case of gene coexpression networks.

Conclusion

Our use of dK models showed that incorporation of degree correlations could increase predictive power in some contexts, albeit sometimes marginally, but, in all contexts, the use of third-order degree correlations decreased accuracy. However, it is possible that other parameter estimation methods, such as maximum likelihood, will show the usefulness of incorporating 2nd and 3rd degree correlations in predicting functionally homogeneous modules.
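
To make the degree statistics in the abstract more concrete, the following is a minimal sketch of how a degree distribution and a 2nd order degree correlation table (the number of edges joining nodes of degree k1 and k2, often called the 1K and 2K distributions in dK terminology) could be tabulated from an edge list. It is not the authors' pseudo-likelihood estimator; the function name `degree_statistics` and the toy edge list are hypothetical and chosen only for illustration.

```python
from collections import Counter


def degree_statistics(edges):
    """Tabulate degree statistics for an undirected simple graph.

    Returns (degree, one_k, two_k) where
      degree maps node -> degree,
      one_k  maps degree k -> number of nodes with that degree,
      two_k  maps (k1, k2) with k1 <= k2 -> number of edges whose
             endpoints have degrees k1 and k2.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # 1st order: how many nodes have each degree.
    one_k = Counter(degree.values())

    # 2nd order: how many edges connect each unordered pair of degrees.
    two_k = Counter()
    for u, v in edges:
        k1, k2 = sorted((degree[u], degree[v]))
        two_k[(k1, k2)] += 1

    return dict(degree), one_k, two_k


# Hypothetical toy network: four proteins, four interactions.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
degree, one_k, two_k = degree_statistics(edges)
print(one_k)
print(two_k)
```

A 3rd order table would extend the same idea to connected triples of nodes (paths of length two and triangles), which is where the number of parameters grows quickly.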

Highlights

  • Many aspects of biological functions can be modeled by biological networks, such as protein interaction networks, metabolic networks, and gene coexpression networks

  • We constructed the network by calculating the Pearson correlation coefficient (PCC) between the expression levels of every pair of genes

  • We studied the ability of dK distribution models to predict individual edges and functionally homogeneous modules in protein interaction and gene coexpression networks
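
As an illustration of the coexpression network construction mentioned in the highlights, the sketch below computes the Pearson correlation coefficient for every pair of genes and keeps the pairs whose correlation passes a cutoff. The source does not say how edges were selected from the PCC values, so the cutoff of 0.8, the use of the absolute value, the function names, and the toy expression profiles are all assumptions made for this example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("PCC is undefined for a constant profile")
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(ss_x * ss_y)


def coexpression_edges(expression, threshold=0.8):
    """Return (gene_a, gene_b, pcc) for gene pairs with |pcc| >= threshold.

    `expression` maps gene name -> list of expression levels, one per
    sample. The threshold is an assumed cutoff, not taken from the paper.
    """
    edges = []
    for a, b in combinations(sorted(expression), 2):
        r = pearson(expression[a], expression[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical expression levels for three genes across four samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(expression)
print(edges)
```

Looping over all pairs is quadratic in the number of genes; for a genome-scale data set a vectorized correlation matrix would be the more practical choice.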


Summary

Introduction

Many aspects of biological functions can be modeled by biological networks, such as protein interaction networks, metabolic networks, and gene coexpression networks. Complex statistical network models can potentially more accurately describe the networks, but it is not clear whether such complex models are better suited to find biologically meaningful subnetworks. High throughput technologies such as microarrays and yeast-two-hybrid assays have resulted in an explosion of biological data that can be represented as networks. [...] summarized as a network, with nodes representing proteins and edges representing physical interaction between two proteins. Genes and their products give rise to biological function through their interaction with each other and with other components of the cell. We need a thorough understanding of the statistical properties of biological networks if we aim to make inferences, such as inferring evolutionary relationships between various networks, or separating signal from noise in imperfect network data.

BMC Bioinformatics 2009, 10:277 http://www.biomedcentral.com/1471-2105/10/277

