Process-Driven Inference of Biological Network Structure: Feasibility, Minimality, and Multiplicity

Guanyu Wang, Carl Pearson, Rahul Simha, Chenghang Du, Hao Chen, Chen Zeng, Yongwu Rong, Frank Emmert-Streib

doi:10.1371/journal.pone.0040330

Open Access

https://doi.org/10.1371/journal.pone.0040330

Abstract

A common problem in molecular biology is to use experimental data, such as microarray data, to infer knowledge about the structure of interactions between important molecules in subsystems of the cell. By approximating the state of each molecule as “on” or “off”, it becomes possible to simplify the problem and exploit the tools of Boolean analysis for such inference. Amongst Boolean techniques, the process-driven approach has shown promise in being able to identify putative network structures, as well as stability and modularity properties. This paper examines the process-driven approach more formally, and makes four contributions about the computational complexity of the inference problem, under the “dominant inhibition” assumption of molecular interactions. The first is a proof that the feasibility problem (does there exist a network that explains the data?) can be solved in polynomial time. Second, the minimality problem (what is the smallest network that explains the data?) is shown to be NP-hard, and therefore unlikely to admit a polynomial-time algorithm. Third, a simple polynomial-time heuristic is shown to produce near-minimal solutions, as demonstrated by simulation. Fourth, the theoretical framework explains how multiplicity (the number of network solutions that realize a given biological process), which can take exponential time to compute, can instead be accurately estimated by a fast, polynomial-time heuristic.

Highlights

A central theme in molecular biology is to understand the complex network of interactions between biomolecules and how those interactions contribute to higher-level biological function [1,2,3,4,5,6]
Boolean network model: the starting point for our model is a collection of $N$ interacting molecules, each of which at any given time is modeled as either "on" or "off." Let $s_i(t) \in \{0,1\}$ denote the state of molecule $i$ and $S(t) = (s_1(t), \ldots, s_N(t))$ the state of the system at time $t$
We show how to convert the particular Conjunctive Normal Form (CNF) formula for our Boolean network into a Horn formula
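
The last highlight matters because satisfiability of a Horn formula (a CNF formula in which every clause has at most one positive literal) can be decided in polynomial time, which is consistent with the polynomial-time feasibility result stated in the abstract. The paper's actual conversion is not reproduced on this page, so the following is only a minimal sketch of the standard forward-chaining test for Horn satisfiability. The clause encoding, the function name `solve_horn`, and the variable names in the usage example are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Set, Tuple

# Assumed encoding (not from the paper): a Horn clause is a pair (body, head),
# read as "if every variable in body is true, then head is true".
# head=None encodes a purely negative clause: the body must not be all true.
HornClause = Tuple[Set[str], Optional[str]]


def solve_horn(clauses: List[HornClause]) -> Optional[Set[str]]:
    """Return the minimal set of variables forced to be true,
    or None if the Horn formula is unsatisfiable."""
    true_vars: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            # A clause fires once all of its body variables are forced true.
            if body <= true_vars:
                if head is None:
                    # A negative clause is violated by forced assignments.
                    return None
                if head not in true_vars:
                    true_vars.add(head)
                    changed = True
    # No clause can fire any more; all remaining variables are set to false.
    return true_vars


if __name__ == "__main__":
    # Hypothetical toy formula: x, (x -> y), not (y and z).
    formula = [(set(), "x"), ({"x"}, "y"), ({"y", "z"}, None)]
    print(solve_horn(formula))
```

Each pass over the clauses either adds a new true variable or terminates the loop, so the number of passes is bounded by the number of variables plus one, and the total work is polynomial in the size of the formula. Linear-time variants exist, but this quadratic version is easier to read.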

Summary

Introduction

A central theme in molecular biology is to understand the complex network of interactions between biomolecules and how those interactions contribute to higher-level biological function [1,2,3,4,5,6]. Many important measures and concepts have been defined to characterize network structures, such as degree distribution [12], clustering coefficient [7], the Estrada index [8], entropy-based molecular descriptors [13,14,15,16], and network motifs [17]. These structural approaches have been used in studying biological networks. The implications of scale-freeness for the robustness and evolvability of genetic regulatory networks have been studied in [19].

Objectives

Results

Conclusion