Abstract

Bayesian networks have become popular for modeling probabilistic relationships between entities. As their structure can also be given a causal interpretation about the studied system, they can be used to learn, for example, regulatory relationships of genes or proteins in biological networks and pathways. Inference of the Bayesian network structure is complicated by the size of the model structure space, necessitating the use of optimization methods or sampling techniques, such as Markov Chain Monte Carlo (MCMC) methods. However, convergence of MCMC chains is in many cases slow and can become even more problematic as the dataset size grows. We show here how to improve convergence in the Bayesian network structure space by using an adjustable proposal distribution capable of proposing a wide range of steps in the structure space, and demonstrate improved network structure inference by analyzing phosphoprotein data from the human primary T cell signaling network.

Highlights

  • Probabilistic graphical models are a model class widely used across application fields

  • Given a set of random variables X = {X1, ..., Xn}, a Bayesian network is defined as a pair (G, θ), where G is a directed acyclic graph (DAG) whose n nodes represent the variables in X and whose edges graphically represent the conditional independencies between these variables, so that each node Xi is conditionally independent of its nondescendants given its parents in G

  • Bayesian networks (BNs) can be used to model probability distributions that respect the directed factorization property, i.e., the joint distribution factorizes according to the DAG (see the sketch below)
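
As a concrete illustration of the factorization property, here is a minimal sketch (not code from the paper); the three-node network, its conditional probability tables, and the `joint_probability` helper are all hypothetical:

```python
# Hypothetical example: a joint probability that factorizes according
# to a DAG, P(X) = prod_i P(Xi | parents(Xi)).

# DAG over binary variables: A -> B, A -> C
parents = {"A": [], "B": ["A"], "C": ["A"]}

# Conditional probability tables: P(node = 1 | parent values)
cpt = {
    "A": {(): 0.3},
    "B": {(0,): 0.2, (1,): 0.8},
    "C": {(0,): 0.5, (1,): 0.1},
}

def joint_probability(assignment):
    """P(X = assignment) as the product of local conditionals."""
    p = 1.0
    for node, pa in parents.items():
        pa_vals = tuple(assignment[q] for q in pa)
        p1 = cpt[node][pa_vals]
        p *= p1 if assignment[node] == 1 else 1.0 - p1
    return p

# Example: P(A=1, B=1, C=0) = 0.3 * 0.8 * (1 - 0.1) = 0.216
print(joint_probability({"A": 1, "B": 1, "C": 0}))
```

Because each factor conditions only on a node's parents, the full joint over all variables never has to be enumerated; each local table stays small.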


Introduction

Probabilistic graphical models are a model class widely used across application fields. Factors complicating BN structure learning include the superexponential growth of the structure space as the number of nodes increases. This prohibits exhaustive evaluation for most practical applications and instead forces one to use heuristic search techniques, such as hill climbing, which tend to find only local maxima, or, preferably, more sophisticated sampling methods such as Markov Chain Monte Carlo (MCMC). Other notable improvements in BN structure learning include the work in [15], where dynamic programming is used to compute the posterior probabilities of all BNs in exponential time, and variations of this approach [16]. MCMC in the space of DAGs is more challenging than in continuous spaces because of, for example, the exceedingly large discrete search space and the acyclicity constraint, which make search-space exploration computationally demanding and all but the simplest proposal distributions difficult to define. The proposed method often decreases the computational load by enabling the chains to escape local maxima much more efficiently.
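
To make the baseline concrete, the sketch below shows a standard single-edge structure MCMC with the acyclicity check discussed above. This is an assumption-laden illustration, not the paper's method: `score` is a hypothetical stand-in for a log structure score (e.g., BDeu), and the paper's adjustable, wider-range proposal is not reproduced here.

```python
# A minimal structure-MCMC sketch over DAG adjacency matrices.
# NOT the paper's method: the classic add/delete/reverse single-edge
# proposal shown here is the baseline that wider-step proposals improve on,
# and `score` is a hypothetical user-supplied log structure score.
import math
import random

def is_acyclic(adj, n):
    """Kahn's algorithm: the graph is a DAG iff every node can be
    removed in topological order."""
    indeg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    queue = [j for j in range(n) if indeg[j] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for v in range(n):
            if adj[u][v]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    queue.append(v)
    return seen == n

def propose(adj, n):
    """Single-edge move: add, delete, or reverse one edge (returns a copy)."""
    new = [row[:] for row in adj]
    i, j = random.sample(range(n), 2)
    if new[i][j]:
        if random.random() < 0.5:
            new[i][j] = 0                    # delete edge i -> j
        else:
            new[i][j], new[j][i] = 0, 1      # reverse edge to j -> i
    else:
        new[i][j] = 1                        # add edge i -> j
    return new

def structure_mcmc(score, n, iters=10000):
    adj = [[0] * n for _ in range(n)]        # start from the empty DAG
    cur = score(adj)
    samples = []
    for _ in range(iters):
        cand = propose(adj, n)
        if is_acyclic(cand, n):              # enforce the DAG constraint
            s = score(cand)
            # Metropolis acceptance; for brevity this omits the Hastings
            # correction for unequal neighborhood sizes.
            if random.random() < math.exp(min(0.0, s - cur)):
                adj, cur = cand, s
        samples.append(adj)
    return samples

# Toy usage with a hypothetical sparsity-favoring score:
# samples = structure_mcmc(lambda g: -float(sum(map(sum, g))), n=5)
```

With single-edge moves like these, consecutive states differ by at most one edge, which is one reason chains mix slowly on multimodal posteriors; a proposal that can take larger steps in the structure space, as described in the abstract, lets a chain move between high-probability regions more efficiently.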

