Abstract
Various types of biological knowledge describe networks of interactions among elementary entities. For example, transcriptional regulatory networks consist of interactions among proteins and genes. Current knowledge about the exact structure of such networks is highly incomplete, and laboratory experiments that manipulate the entities involved are conducted to test hypotheses about these networks. In recent years, various automated approaches to experiment selection have been proposed. Many of these approaches can be characterized as active machine learning algorithms. Active learning is an iterative process in which a model is learned from data, hypotheses are generated from the model to propose informative experiments, and the experiments yield new data that is used to update the model. This review describes the various models, experiment selection strategies, validation techniques, and successful applications described in the literature; highlights common themes and notable distinctions among methods; and identifies likely directions of future research and open problems in the area.
Highlights
Biological processes are complex, involving many interactions among elementary units
The mathematical notation for entities that appear in most methods is kept as consistent as possible throughout this review: m ∈ M represents models; observed states are represented by x or its bold and uppercase variants, depending on whether states are seen as atomic set elements, vectors, or random variables within the context of a particular method; and e or its bold and uppercase variants represent experiments
The general problem setup considered by Atias et al. [6] is similar to that considered by Ideker et al. [1]: the given data consist of Boolean vectors representing gene expression profiles under a variety of conditions, the underlying model to be learned is a Boolean network, and the aim of the experiment selection strategy is to arrive at a single model after as few experiments as possible
Summary
Various types of biological knowledge describe networks of interactions among elementary entities. Transcriptional regulatory networks consist of interactions among proteins and genes. Current knowledge about the exact structure of such networks is highly incomplete, and laboratory experiments that manipulate the entities involved are conducted to test hypotheses about these networks. Various automated approaches to experiment selection have been proposed. Many of these approaches can be characterized as active machine learning algorithms. Active learning is an iterative process in which a model is learned from data, hypotheses are generated from the model to propose informative experiments, and the experiments yield new data that is used to update the model.
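The iterative loop described above can be illustrated with a minimal sketch. It is not the method of any particular paper covered by the review. It assumes a toy model space M of four hypothetical Boolean rules for a single target gene, experiments e that set two regulator states, and a simulated laboratory standing in for real measurements. The selection rule used here (choose the experiment whose worst-case outcome leaves the fewest consistent models) is one simple strategy chosen for illustration; the names CANDIDATES, split_score, active_learning, and simulated_lab are all invented for this example.

```python
from itertools import product

# Hypothetical model space M: each candidate maps the Boolean states of two
# regulators (a, b) to the predicted state of one target gene.
CANDIDATES = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
    "A_ONLY": lambda a, b: a,
}


def split_score(experiment, models):
    """Size of the largest group of models that predict the same outcome.

    Smaller is better: whatever the observed outcome, at most this many
    models remain consistent with the data.
    """
    outcomes = [bool(f(*experiment)) for f in models.values()]
    return max(outcomes.count(True), outcomes.count(False))


def active_learning(run_experiment, models, experiments):
    """Select and run experiments until one model remains or none can help."""
    models = dict(models)
    remaining = list(experiments)
    history = []
    while len(models) > 1 and remaining:
        # Propose the experiment with the best worst-case split.
        e = min(remaining, key=lambda exp: split_score(exp, models))
        if split_score(e, models) == len(models):
            break  # all remaining models agree on every remaining experiment
        remaining.remove(e)
        x = run_experiment(e)  # observe the new state x
        history.append((e, x))
        # Update the model space: keep only models consistent with (e, x).
        models = {n: f for n, f in models.items() if bool(f(*e)) == x}
    return models, history


def simulated_lab(e):
    """Stand-in for a wet-lab experiment; the assumed true system is XOR."""
    a, b = e
    return a != b


experiments = list(product([False, True], repeat=2))
surviving, history = active_learning(simulated_lab, CANDIDATES, experiments)
print(sorted(surviving), history)
```

In this toy setting two experiments suffice to isolate a single model, whereas running all four possible experiments would add no further information. Realistic model spaces are far too large to enumerate, which is why the selection strategy has to work without listing every candidate as this sketch does.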