Abstract

It is known that directed acyclic graphs (DAGs) may hide several local features of the joint probability distribution that can be essential for some applications. To remedy this, more expressive model classes have been introduced. In addition to the restrictions implied by conditional independence, these model classes typically include some form of local structure that implies equality constraints on the node-wise conditional distribution. In particular, the concept of context-specific independence (CSI) was introduced to increase the flexibility of traditional Bayesian networks. Furthermore, in the most expressive class of generalized Bayesian networks, decision graphs were used to model arbitrary parameter restrictions. Here we formulate an alternative representation of such models called a partition DAG (PDAG), which defines the parameter restrictions using a partition-based representation of the parent outcome spaces. We establish a criterion that can identify whether an arbitrary PDAG has a CSI-consistent representation using an efficient graph-theoretic algorithm. Based on a recursive inference algorithm for partition posteriors, an exact Bayesian learning method is introduced. We demonstrate on real data that exact learning of PDAGs can identify important relationships between variables that have not been discovered by previous graphical model learning methods.
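
To make the partition-based representation concrete, here is a minimal sketch (in Python, with hypothetical variable names, partition, and probabilities not taken from the paper) of a PDAG-style local structure: the outcome space of a node's parents is partitioned into classes, and all parent configurations in the same class share a single conditional distribution for the child.

    # A PDAG-style local structure for one binary child with two binary
    # parents.  Instead of one conditional distribution per parent
    # configuration (four in total), configurations are grouped into
    # partition classes that share parameters (three here).
    outcome_space = [(0, 0), (0, 1), (1, 0), (1, 1)]
    partition = [
        [(0, 0), (0, 1)],  # class 0: the CPD ignores the second parent
        [(1, 0)],          # class 1
        [(1, 1)],          # class 2
    ]
    class_cpd = [
        [0.9, 0.1],  # P(child = 0), P(child = 1) for class 0
        [0.4, 0.6],  # ... for class 1
        [0.2, 0.8],  # ... for class 2
    ]

    def cpd(parent_config):
        """Return the child's CPD for a parent configuration via its class."""
        for k, cls in enumerate(partition):
            if parent_config in cls:
                return class_cpd[k]
        raise KeyError(parent_config)

    print(cpd((0, 0)) == cpd((0, 1)))  # True: an equality constraint

Merging (0, 0) and (0, 1) into one class is an equality constraint that here happens to coincide with a CSI (the child is independent of the second parent when the first equals 0); a general PDAG may also contain classes that admit no such interpretation.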

Highlights

  • Bayesian networks have represented a standard workhorse in artificial intelligence and machine learning for more than two decades [15]

  • We demonstrate on real data that exact learning of partition DAGs (PDAGs) can identify important relationships between variables that have not been discovered by previous graphical model learning methods

  • Several of the introduced model classes [4, 9, 17, 20] have been built around the concept of context-specific independence (CSI), which is a natural generalization of conditional independence

Introduction

Bayesian networks have represented a standard workhorse in artificial intelligence and machine learning for more than two decades [15]. The set of restrictions imposed by a collection of CSIs defines a partition of the parent outcome space into classes, where all elements of the same class induce an identical conditional probability distribution (CPD). As the number of classes in a partition decreases, an increasing set of additional local constraints is imposed on the conditional distribution of the corresponding node; some of these will typically correspond to CSIs, but in general not all of them will be interpretable as such. Bayesian networks with parent outcome partitions enjoy flexibility in terms of parameter restrictions similar to that of probabilistic decision graphs, but are based on a different representation that enables efficient learning algorithms to be developed. The CSI-consistency criterion provides a straightforward way of making an arbitrary partition CSI-consistent by removing as few equality constraints as possible.
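
As a rough illustration of the connection between CSIs and partitions, the sketch below (a union-find construction with a hypothetical encoding of CSI statements; not the paper's algorithm) merges all parent configurations that a given collection of CSIs forces to induce the same CPD, yielding the corresponding partition of the parent outcome space.

    from itertools import product

    def csi_partition(cardinalities, csis):
        """Partition of the parent outcome space induced by CSI statements.

        cardinalities: number of outcomes for each parent variable.
        csis: list of (j, context) pairs, each stating that the child is
              independent of parent j whenever the context (a dict mapping
              some other parent indices to fixed values) holds, given the
              remaining parents.  This encoding is a hypothetical choice
              made for this sketch.
        """
        configs = list(product(*[range(r) for r in cardinalities]))
        parent_of = {c: c for c in configs}  # union-find forest

        def find(c):
            while parent_of[c] != c:
                parent_of[c] = parent_of[parent_of[c]]  # path halving
                c = parent_of[c]
            return c

        for j, context in csis:
            # Group configurations that agree with the context and agree on
            # every coordinate except j; each group must share one CPD.
            groups = {}
            for c in configs:
                if all(c[i] == v for i, v in context.items()):
                    key = tuple(v for i, v in enumerate(c) if i != j)
                    groups.setdefault(key, []).append(c)
            for group in groups.values():
                for c in group[1:]:
                    parent_of[find(group[0])] = find(c)  # union

        classes = {}
        for c in configs:
            classes.setdefault(find(c), []).append(c)
        return list(classes.values())

    # Two binary parents; the CSI "child independent of parent 1 when
    # parent 0 = 0" merges (0, 0) and (0, 1) into a single class:
    print(csi_partition([2, 2], [(1, {0: 0})]))
    # [[(0, 0), (0, 1)], [(1, 0)], [(1, 1)]]

The paper's consistency criterion addresses the reverse direction: deciding whether a given partition could have been produced by some collection of CSIs in this way, i.e., whether it is CSI-consistent.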

Bayesian score for PDAGs
Exact PDAG learning algorithm
Discussion