Abstract

Background: A central problem in systems biology research is the identification and extension of biological modules, i.e. groups of genes or proteins participating in a common cellular process or physical complex. As a result, there is a persistent need for practical, principled methods to infer the modular organization of genes from genome-scale data.

Results: We introduce a novel approach for the identification of modules based on the persistence of isolated gene groups within an evolving graph process. First, the underlying genomic data is summarized in the form of ranked gene–gene relationships, thereby accommodating studies that quantify the relevant biological relationship directly or indirectly. Then, the observed gene–gene relationship ranks are viewed as the outcome of a random graph process, and candidate modules are given by the identifiable subgraphs that arise during this process. An isolation index is computed for each module, which quantifies the statistical significance of its survival time.

Conclusions: The Miso (module isolation) method predicts gene modules from genomic data, and the associated isolation index provides a module-specific measure of confidence. Improving on existing alternatives, such as graph clustering and the global pruning of dendrograms, this index offers two intuitively appealing features: (1) the score is module-specific; and (2) different choices of threshold correlate logically with the resulting performance, i.e. a stringent cutoff yields high-quality predictions but low sensitivity. Through the analysis of yeast phenotype data, the Miso method is shown to outperform existing alternatives in terms of the specificity and sensitivity of its predictions.
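The graph process described in the abstract can be illustrated with a minimal sketch. It assumes that candidate modules correspond to connected components that form as edges are added in rank order, and that a module "dies" when a between-group edge merges it into a larger group. The function name `module_lifetimes` and the step-count notion of birth and death are hypothetical choices for illustration; the sketch records only raw survival times and does not compute the isolation index, whose statistical definition is not given here.

```python
def module_lifetimes(ranked_pairs):
    """Add edges in rank order and track the lifetime of each gene group.

    ranked_pairs: iterable of (gene_a, gene_b), strongest relationship first.
    Returns a list of (members, birth_step, death_step). death_step is None
    for groups still alive when the edges run out. Singletons are skipped.
    """
    parent = {}    # union-find parent pointers
    members = {}   # root -> frozenset of genes in that group
    birth = {}     # root -> step at which the group was formed
    finished = []  # groups that have already been merged away

    def find(g):
        if g not in parent:          # first time we see this gene
            parent[g] = g
            members[g] = frozenset([g])
            birth[g] = 0
        while parent[g] != g:        # path halving
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for step, (a, b) in enumerate(ranked_pairs, start=1):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue                 # within-group edge: nothing dies
        # A between-group edge ends the life of both groups it joins.
        for r in (ra, rb):
            if len(members[r]) > 1:
                finished.append((members[r], birth[r], step))
        parent[rb] = ra
        members[ra] = members[ra] | members.pop(rb)
        birth[ra] = step
        birth.pop(rb)

    alive = [(m, birth[r], None) for r, m in members.items() if len(m) > 1]
    return finished + alive


if __name__ == "__main__":
    pairs = [("A", "B"), ("C", "D"), ("A", "C"), ("B", "D"), ("E", "A")]
    for group, born, died in module_lifetimes(pairs):
        print(sorted(group), born, died)
```

A group's survival time in this sketch is simply `death_step - birth_step`; assessing its significance would require a null model for the edge ordering, which is left out.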

Highlights

  • Much of systems biology research aims to identify biologically meaningful relationships between genes or their products, such as protein-protein interactions or co-membership in a biological pathway.

  • We assume that genomics data arrives in the form of ranked pairwise relationship scores.

  • To make our Miso procedure robust to this sort of error, we extend it by considering the impact of high-leverage edges, i.e. the ‘between’ edges whose placement causes the death of a candidate module.

Introduction

Much of systems biology research aims to identify biologically meaningful relationships between genes or their products, such as protein-protein interactions or co-membership in a biological pathway. This undertaking can be viewed as moving from the “parts lists” produced by genome sequencing projects to the assembly instructions for a complex system. A common assumption made in the analysis of networks is the existence of biologically defined subnetworks, commonly referred to as modules. Examples of such a module are a protein complex or a gene expression regulon. Correlation of expression levels or, more relevant to this study, of loss-of-function phenotypes across multiple conditions provides an indirect measure of the gene–gene relationship. Other assays, such as yeast two-hybrid or genetic interaction screens using double knockouts, provide direct measures of these relationships. There is a persistent need for practical, principled methods to infer the modular organization of genes from genome-scale data.
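As a rough illustration of an indirect measure, the sketch below turns per-gene phenotype profiles into a ranked list of gene pairs using Pearson correlation. The choice of Pearson correlation, the function names `pearson` and `rank_gene_pairs`, and the toy profiles are assumptions made for this example, not details taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_gene_pairs(profiles):
    """Rank all gene pairs from most to least correlated.

    profiles: dict mapping gene name -> list of phenotype scores, one score
    per condition (hypothetical input layout).
    Returns a list of (gene_a, gene_b) tuples, strongest relationship first.
    """
    scored = [
        (pearson(profiles[a], profiles[b]), a, b)
        for a, b in combinations(sorted(profiles), 2)
    ]
    scored.sort(key=lambda t: -t[0])  # highest correlation first
    return [(a, b) for _, a, b in scored]


if __name__ == "__main__":
    toy = {
        "g1": [1.0, 2.0, 3.0, 4.0],
        "g2": [2.0, 4.0, 6.0, 8.5],
        "g3": [4.0, 3.0, 2.0, 1.0],
    }
    for pair in rank_gene_pairs(toy):
        print(pair)
```

Only the resulting order of pairs is kept, which is what allows direct and indirect assays to be treated in the same way downstream.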

