Identification of functional modules using network topology and high-throughput data

Igor Ulitsky, Ron Shamir

doi:10.1186/1752-0509-1-8

Open Access

Journal: BMC Systems Biology
Publication Date: Jan 26, 2007
Citations: 339
License type: CC BY 2.0

Affiliation: Tel Aviv University

Abstract

Background: With the advent of systems biology, biological knowledge is often represented today by networks. These include regulatory and metabolic networks, protein-protein interaction networks, and many others. At the same time, high-throughput genomics and proteomics techniques generate very large data sets, which require sophisticated computational analysis. Usually, separate and different analysis methodologies are applied to each of the two data types. An integrated investigation of network and high-throughput information together can improve the quality of the analysis by accounting simultaneously for topological network properties alongside intrinsic features of the high-throughput data.

Results: We describe a novel algorithmic framework for this challenge. We first transform the high-throughput data into similarity values (e.g., by computing pairwise similarity of gene expression patterns from microarray data). Then, given a network of genes or proteins and similarity values between some of them, we seek connected sub-networks (or modules) that manifest high similarity. We develop algorithms for this problem and evaluate their performance on the osmotic shock response network in S. cerevisiae and on the human cell cycle network. We demonstrate that focused, biologically meaningful and relevant functional modules are obtained. In comparison with extant algorithms, our approach has higher sensitivity and higher specificity.

Conclusion: We have demonstrated that our method can accurately identify functional modules. Hence, it carries the promise to be highly useful in analysis of high-throughput data.

Highlights

With the advent of systems biology, biological knowledge is often represented today by networks.
We develop a novel computational method for efficient detection and analysis of Jointly Active Connected Subnetworks (JACSs), implemented in a program called MATISSE (Module Analysis via Topology of Interactions and Similarity SEts).
A JACS aims to capture a set of genes that behave highly similarly, are topologically connected, and may share a common function, e.g., by belonging to a single complex or pathway.
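
The two properties that define a JACS, connectivity in the network and high pairwise similarity, can be illustrated with a minimal sketch. This is not the MATISSE algorithm or its scoring scheme, neither of which is specified here. It only checks, for a candidate gene set, whether the set induces a connected subgraph and what its mean pairwise Pearson correlation is. The function names, the toy profiles, and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def is_connected(genes, edges):
    """True if `genes` induce a connected subgraph of the undirected network."""
    genes = set(genes)
    if not genes:
        return False
    # Keep only edges whose two endpoints both lie inside the candidate set.
    adj = {g: set() for g in genes}
    for u, v in edges:
        if u in genes and v in genes:
            adj[u].add(v)
            adj[v].add(u)
    # Depth-first search from an arbitrary member of the set.
    start = next(iter(genes))
    seen, stack = {start}, [start]
    while stack:
        for nb in adj[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == genes


def mean_similarity(genes, profiles):
    """Average Pearson correlation over all gene pairs in the set."""
    pairs = list(combinations(sorted(genes), 2))
    if not pairs:
        raise ValueError("need at least two genes")
    return sum(pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


# Hypothetical toy data: four genes on a path A-B-C-D.
profiles = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [1, 3, 5, 7],
    "D": [4, 3, 2, 1],
}
edges = [("A", "B"), ("B", "C"), ("C", "D")]

candidate = {"A", "B", "C"}
print(is_connected(candidate, edges), mean_similarity(candidate, profiles))
```

In this toy network, the set {A, B, C} is connected and its profiles are perfectly correlated, so it satisfies both properties. The set {A, C} is highly similar but not connected, and {C, D} is connected but anti-correlated. A real method would additionally have to search the space of candidate sets, which this sketch does not attempt.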

Summary

Introduction

With the advent of systems biology, biological knowledge is often represented today by networks. These include regulatory and metabolic networks, protein-protein interaction networks, and many others. The accumulation of large-scale interaction data on multiple organisms, such as protein-protein and protein-DNA interactions, requires novel computational techniques that will be able to analyze these data together with information collected through other means. Such methods should enable thorough dissection of the data, whose dimensions have already extended far beyond the scope that is amenable to traditional analysis and manual interpretation. An important class of such biological information can be represented in the form of similarity relations. Quantitative molecular data, such as mRNA expression profiles, are often analyzed in this context through clustering algorithms. Initial works [12,13] proposed measures for scoring expression activity in metabolic pathways (e.g., those in the KEGG database [14]) and complexes [15].

Objectives

Methods

Results

Conclusion