Abstract

Large-scale molecular interaction data sets have the potential to provide a comprehensive, system-wide understanding of biological function. Although individual molecules can be promiscuous in terms of their contribution to function, molecular functions emerge from the specific interactions of molecules, giving rise to modular organisation. As functions often derive from a range of mechanisms, we demonstrate that they are best studied using networks derived from different sources. Using a graph partitioning algorithm, we identify subnetworks in yeast protein-protein interaction (PPI), genetic interaction and gene co-regulation networks. Among these subnetworks we identify cohesive subgraphs that we expect to represent functional modules in the different data types. We demonstrate significant overlap between the subgraphs generated from the different data types and show that these overlaps can correspond to related functions as defined in the Gene Ontology (GO). Next, we investigate the correspondence between our subgraphs and the Gene Ontology. This reveals varying degrees of coverage of the biological process, molecular function and cellular component ontologies, dependent on the data type. For example, subgraphs from the PPI network show enrichment for 84%, 58% and 93% of annotated GO terms in these three ontologies, respectively. Integrating the interaction data into a combined network increases the coverage of GO. Furthermore, the different annotation types of GO are not predominantly associated with one of the interaction data types. Collectively, our results demonstrate that successful capture of functional relationships by network data depends on both the specific biological function being characterised and the type of network data being used. We identify functions that require integrated information to be accurately represented, demonstrating the limitations of individual data types. Combining interaction subnetworks across data types is therefore essential for fully understanding the complex and emergent nature of biological function.
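The abstract does not say which statistical test underlies the reported subgraph overlap and GO enrichment. A common choice for this kind of question is a one-sided hypergeometric test, sketched below purely as an illustration. The function names `hypergeom_sf` and `enrichment_p` and the toy gene labels are hypothetical, not taken from the paper.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible draws contribute nothing to the sum.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment_p(subgraph, term_genes, background):
    """Assumed enrichment score: hypergeometric p-value of the overlap
    between a subgraph's genes and the genes annotated with one GO term."""
    background = set(background)
    subgraph = set(subgraph) & background
    term_genes = set(term_genes) & background
    k = len(subgraph & term_genes)
    return hypergeom_sf(k, len(background), len(term_genes), len(subgraph))


# Toy data (hypothetical labels): 20 background genes, 5 annotated, 4 in the subgraph.
background = [f"g{i}" for i in range(20)]
term = ["g0", "g1", "g2", "g3", "g4"]
module = ["g0", "g1", "g2", "g10"]
p_value = enrichment_p(module, term, background)
```

The same function can score the overlap between two subgraphs from different networks by passing one subgraph as `term_genes`. In practice the p-values would also need correcting for the number of subgraph and term pairs tested.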

Highlights

  • Computational analysis of large-scale data sets is undoubtedly revealing an increasingly complete functional map of the cell [1]

  • S. cerevisiae data: a protein-protein interaction (PPI) network consisting of 12,182 interactions between 3,339 genes, a genetic interaction network consisting of 42,546 interactions between 3,529 genes, a co-regulation network consisting of 3,006,725 weighted interactions between 4,358 genes, and a combined network consisting of 3,052,053 unique weighted edges between 5,489 genes (Files S1–S4)

  • We find that some subgraphs are the best hits of many other subgraphs and these appear to capture Gene Ontology (GO) functions very accurately (Figure 2)
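The edge counts quoted in the highlights can be cross-checked with simple arithmetic. If the combined network is the plain union of the three networks (an assumption; the construction is not described here), the difference between the summed and the unique edge counts is the number of duplicate edge occurrences removed by merging. The sketch below also shows one way to build such a union. The helper names `canonical` and `combine` are hypothetical, and keeping the largest weight for a repeated edge is an illustrative choice, not the paper's stated method.

```python
def canonical(u, v):
    """Order-independent key for an undirected edge."""
    return (u, v) if u <= v else (v, u)


def combine(*networks):
    """Union of undirected weighted edge lists.

    Assumption: when an edge occurs in several networks, keep the largest weight.
    """
    combined = {}
    for edges in networks:
        for u, v, w in edges:
            key = canonical(u, v)
            combined[key] = max(w, combined.get(key, w))
    return combined


# Edge counts as quoted in the text.
ppi_edges, genetic_edges, coreg_edges = 12_182, 42_546, 3_006_725
combined_edges = 3_052_053

# Duplicate edge occurrences implied by the counts, under the plain-union assumption.
duplicates = ppi_edges + genetic_edges + coreg_edges - combined_edges

# Toy usage with hypothetical gene labels: ("B", "A") repeats ("A", "B").
toy = combine([("A", "B", 1.0)], [("B", "A", 0.5), ("B", "C", 0.7)])
```

Under that assumption the three networks share only a small fraction of their edges, which is consistent with the abstract's point that the data types capture different aspects of function.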

Introduction

Computational analysis of large-scale data sets is undoubtedly revealing an increasingly complete functional map of the cell [1]. Functional modules and subnetworks are assumed to be one and the same; for example, a range of graph-property-based approaches have been developed that identify subnetworks in protein-protein interaction [4], metabolomic [5], gene expression [6] and genetic interaction data sets [7]. These analyses potentially lead to an incomplete picture of function, since function usually arises from the coordinated and highly specific operation of molecules of different types.

