Abstract

Determining the structure of the gene regulatory network from genome-wide profiles of mRNA abundance, such as microarray data, poses several challenges. Typically, the starting point is a set of "static" rather than dynamical profile measurements, such as those taken from steady-state tissues in various conditions. This makes the inference of causal relationships between genes difficult. Moreover, the paucity of samples relative to the number of genes leads to problems such as overfitting and underconstrained regression analysis. Here we present a novel method for the sparse approximation of gene regulatory networks that addresses these issues. It is formulated as a sparse combinatorial optimization problem which has a globally optimal solution in terms of l(0) norm error. To seek an approximate solution of the l(0) optimization problem, we consider a heuristic approach based on iterative greedy algorithms. We apply our method to a set of gene expression profiles comprising 24,102 genes measured over 79 human tissues. The inferred network is a signed directed graph and hence predicts causal relationships. It exhibits typical characteristics of regulatory networks in organisms with partially known network topology, such as the average number of inputs per gene as well as the in-degree and out-degree distributions.
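To make the idea of a greedy approximation to an l(0)-constrained regression concrete, the following is a minimal sketch, not the authors' implementation. It assumes orthogonal matching pursuit as the greedy scheme (the abstract says only "iterative greedy algorithms"), and the function names `omp` and `infer_signed_edges`, the parameter `max_inputs`, and the tolerance `tol` are hypothetical. Each gene is regressed on all other genes with at most `max_inputs` nonzero coefficients, and the sign of each selected coefficient gives a signed, directed edge.

```python
import numpy as np

def omp(X, y, max_inputs, tol=1e-8):
    """Greedy sparse regression (orthogonal matching pursuit, an assumed choice).

    Selects at most `max_inputs` columns of X to approximate y and
    returns a coefficient vector that is zero outside the selection.
    """
    coef = np.zeros(X.shape[1])
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero columns
    residual = y.astype(float)
    support = []
    for _ in range(max_inputs):
        if np.linalg.norm(residual) <= tol:
            break  # y is already explained by the current selection
        # Score each candidate by its normalized correlation with the residual.
        scores = np.abs(X.T @ residual) / norms
        if support:
            scores[support] = -np.inf  # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Refit all selected columns jointly by least squares.
        w, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ w
        coef[:] = 0.0
        coef[support] = w
    return coef

def infer_signed_edges(expr, max_inputs):
    """Build a signed directed graph from a (samples x genes) matrix.

    Returns a dict mapping (regulator, target) to +1 or -1.
    """
    n_genes = expr.shape[1]
    edges = {}
    for target in range(n_genes):
        others = [g for g in range(n_genes) if g != target]
        coef = omp(expr[:, others], expr[:, target], max_inputs)
        for idx in np.flatnonzero(coef):
            edges[(others[idx], target)] = int(np.sign(coef[idx]))
    return edges
```

Capping the number of selected inputs per gene is what keeps the regression constrained when there are far fewer samples (here, tissues) than candidate regulators. The greedy search does not guarantee the globally optimal l(0) solution; it only approximates it.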
