Abstract

Various types of biological knowledge describe networks of interactions among elementary entities. For example, transcriptional regulatory networks consist of interactions among proteins and genes. Current knowledge about the exact structure of such networks is highly incomplete, and laboratory experiments that manipulate the entities involved are conducted to test hypotheses about these networks. In recent years, various automated approaches to experiment selection have been proposed. Many of these approaches can be characterized as active machine learning algorithms. Active learning is an iterative process in which a model is learned from data, hypotheses are generated from the model to propose informative experiments, and the experiments yield new data that is used to update the model. This review describes the various models, experiment selection strategies, validation techniques, and successful applications described in the literature; highlights common themes and notable distinctions among methods; and identifies likely directions of future research and open problems in the area.

Highlights

  • Biological processes are complex, involving many interactions among elementary units

  • The mathematical notation for entities that appear in most methods is kept as consistent as possible throughout this review: m ∈ M represents a model; observed states are represented by x or its bold and uppercase variants, depending on whether states are treated as atomic set elements, vectors, or random variables within a particular method; and e or its bold and uppercase variants represent experiments

  • The general problem setup considered by Atias et al. [6] is similar to that considered by Ideker et al. [1]: the given data consist of Boolean vectors representing gene expression profiles under a variety of conditions, the underlying model to be learned is a Boolean network, and the aim of the experiment selection strategy is to arrive at a single model after as few experiments as possible

Summary

OPEN ACCESS

Various types of biological knowledge describe networks of interactions among elementary entities. Transcriptional regulatory networks consist of interactions among proteins and genes. Current knowledge about the exact structure of such networks is highly incomplete, and laboratory experiments that manipulate the entities involved are conducted to test hypotheses about these networks. Various automated approaches to experiment selection have been proposed. Many of these approaches can be characterized as active machine learning algorithms. Active learning is an iterative process in which a model is learned from data, hypotheses are generated from the model to propose informative experiments, and the experiments yield new data that is used to update the model.
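
The iterative loop described above can be illustrated with a minimal sketch. It is not taken from any of the reviewed methods: the two-gene setting, the four candidate update rules, and the names `CANDIDATE_MODELS`, `outcome_entropy`, `select_experiment`, and `active_learning` are all hypothetical, and entropy over predicted outcomes stands in for whatever selection criterion a given method actually uses. The sketch assumes noise-free Boolean observations and equally likely candidate models.

```python
from collections import Counter
from itertools import product
from math import log2

# Hypothetical toy setting: two genes. Each candidate model maps a perturbed
# input state (a Boolean vector) to the state it predicts will be observed.
CANDIDATE_MODELS = {
    "and": lambda x: (x[0], x[0] and x[1]),
    "or": lambda x: (x[0], x[0] or x[1]),
    "copy": lambda x: (x[0], x[0]),
    "xor": lambda x: (x[0], x[0] != x[1]),
}
# Every possible Boolean input vector is treated as an available experiment.
EXPERIMENTS = list(product([False, True], repeat=2))


def outcome_entropy(models, experiment):
    """Entropy (bits) of predicted outcomes, with models weighted equally."""
    counts = Counter(m(experiment) for m in models.values())
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def select_experiment(models, experiments):
    """Pick the experiment on whose outcome the candidate models disagree most."""
    return max(experiments, key=lambda e: outcome_entropy(models, e))


def active_learning(models, experiments, run_experiment):
    """Alternate experiment selection and model pruning until one model is left."""
    models = dict(models)
    remaining = list(experiments)
    performed = []
    while len(models) > 1 and remaining:
        e = select_experiment(models, remaining)
        if outcome_entropy(models, e) == 0:
            break  # no remaining experiment can distinguish the models
        remaining.remove(e)
        observed = run_experiment(e)  # stands in for the laboratory experiment
        performed.append(e)
        # Keep only the models consistent with the new observation.
        models = {name: m for name, m in models.items() if m(e) == observed}
    return models, performed


if __name__ == "__main__":
    # Simulate the laboratory with a hidden "true" model.
    truth = CANDIDATE_MODELS["or"]
    final, performed = active_learning(CANDIDATE_MODELS, EXPERIMENTS, truth)
    print(list(final), performed)
```

In this toy case two of the four possible experiments are enough to isolate a single model, which mirrors the stated aim of reaching one model after as few experiments as possible.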

Introduction
Mathematical notation
Active learning de novo
Boolean networks
Causal Bayesian networks
Probabilistic temporal Boolean network
Differential equation models
Purely structural model
Active learning with prior knowledge
Validation and refinement of gene regulatory models in yeast
Adam the Robot Scientist
Regulatory mechanisms downstream of a signaling pathway
Conclusions
