Abstract

Graph centralities are commonly used to identify and prioritize disease genes in transcriptional regulatory networks. Studies on small networks of experimentally validated protein-protein interactions underpin the general validity of this approach, and extensions of such findings have recently been proposed for networks inferred from gene expression data. However, it is largely unknown how well gene centralities are preserved between the underlying biological interactions and the networks inferred from gene expression data. Specifically, while previous studies have evaluated the performance of inference methods on synthetic gene expression data, it has not been established how the choice of inference method affects individual centralities in the network. Here, we compare two gene centrality measures between reference networks and networks inferred from corresponding simulated gene expression data, using a number of commonly used network inference methods. The results indicate that the centrality of genes is only moderately conserved for all of the inference methods used. In conclusion, caution should be exercised when inspecting centralities in reverse-engineered networks, and further work will be required to establish the use of such networks for prioritizing disease genes.

Highlights

  • With the increasing amount of ‘omics’ data made available to researchers in recent decades, biological network analysis has rapidly grown in importance as one of the predominant methods for studying the underlying interactions and relationships between biological entities (Zhu et al. 2007).

  • Initial investigations of centralities in protein-protein interaction (PPI) networks of Saccharomyces cerevisiae, Drosophila melanogaster and Caenorhabditis elegans have suggested that developmentally and functionally essential proteins, i.e. proteins whose disruption leads to embryonal lethality, might be associated with high degree, closeness or betweenness centralities (Jeong et al. 2001; Joy et al. 2005; Hahn and Kern 2005; Estrada 2006a, b).

  • Genes with somatic mutations, as compared to non-essential disease genes, might still exhibit more central positions in such networks (Goh et al. 2007), especially considering that many cancer genes are characterized by a gain rather than a loss of function and drive abnormal proliferation and growth programs that are essential for embryonal development.

Summary

Introduction

With the increasing amount of ‘omics’ data made available to researchers in recent decades, biological network analysis has rapidly grown in importance as one of the predominant methods for studying the underlying interactions and relationships between biological entities (Zhu et al. 2007). Genes with somatic mutations, as compared to non-essential disease genes, might still exhibit more central positions in such networks (Goh et al. 2007), especially considering that many cancer genes are characterized by a gain rather than a loss of function and drive abnormal proliferation and growth programs that are essential for embryonal development. Another concern with such early studies is that they have mainly been performed on protein-protein interaction (PPI) networks built from databases of validated biological interactions. Benchmark datasets, consisting of reference networks and corresponding simulated gene expression data, are employed to estimate the agreement of degree and betweenness centralities between the reference and inferred networks for the different inference methods.
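To make the idea of centrality agreement concrete, the following is a minimal sketch in plain Python. It computes normalized degree centrality for a reference and an inferred network and compares the two with a Spearman rank correlation. The choice of Spearman correlation as the agreement measure is an assumption made for illustration and is not stated in the text above; the function names and the five-gene toy edge lists are likewise hypothetical. Only degree centrality is shown; betweenness centrality could be compared in the same way by swapping the centrality function.

```python
from collections import defaultdict


def degree_centrality(edges, nodes):
    """Normalized degree centrality of an undirected graph given as an edge list."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    scale = max(len(nodes) - 1, 1)  # maximum possible degree
    return {n: len(neighbours[n]) / scale for n in nodes}


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when all values on one side are tied
    return cov / (var_x * var_y) ** 0.5


def centrality_agreement(reference_edges, inferred_edges, genes):
    """Assumed agreement measure: rank correlation of per-gene degree centrality."""
    ref = degree_centrality(reference_edges, genes)
    inf = degree_centrality(inferred_edges, genes)
    return spearman([ref[g] for g in genes], [inf[g] for g in genes])


# Hypothetical toy data: the inferred network misplaces one edge.
genes = ["g1", "g2", "g3", "g4", "g5"]
reference = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g4", "g5")]
inferred = [("g1", "g2"), ("g1", "g3"), ("g3", "g4"), ("g4", "g5")]

self_agreement = centrality_agreement(reference, reference, genes)
cross_agreement = centrality_agreement(reference, inferred, genes)
print(self_agreement, cross_agreement)
```

In this toy case a single misplaced edge is enough to lower the rank agreement noticeably, even though three of the four edges are recovered correctly. A rank-based measure is used here because prioritization depends on the ordering of genes rather than on the absolute centrality values.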

Generation of Benchmark Datasets
Inference of Transcriptional Networks
Estimating the Accuracy of Inferred Networks
Conservation of Centralities in Inferred Networks
Conclusion and Future Prospectives