Abstract

Score-based algorithms that learn Bayesian Network (BN) structures provide solutions ranging from different levels of approximate learning to exact learning. Approximate solutions exist because exact learning is generally not applicable to networks of moderate or higher complexity. In general, approximate solutions tend to sacrifice accuracy for speed, where the aim is to minimise the loss in accuracy and maximise the gain in speed. While some approximate algorithms are optimised to handle thousands of variables, these algorithms may still be unable to learn such high-dimensional structures. Some of the most efficient score-based algorithms cast the structure learning problem as a combinatorial optimisation of candidate parent sets. This paper explores a strategy for pruning the size of candidate parent sets, one that could form part of existing score-based algorithms as an additional pruning phase aimed at high-dimensionality problems. The results illustrate how different levels of pruning affect the learning speed relative to the loss in accuracy in terms of model fitting, and show that aggressive pruning may be required to produce approximate solutions for high-complexity problems.

Highlights

  • A Bayesian Network (BN) [1] is a probabilistic graphical model represented by a Directed Acyclic Graph (DAG)

  • We investigate the effect of different levels of pruning on legal Candidate Parent Sets (CPSs)

  • The experiments presented are based on BNs and relevant data that are available on the GOBNILP website (see the networks used in Section 3.1)


Summary

Introduction

A Bayesian Network (BN) [1] is a probabilistic graphical model represented by a Directed Acyclic Graph (DAG). The algorithms that fall in the former category are generally based on efficient heuristics such as hill-climbing, but tend to get stuck in local optima, thereby offering an approximate solution to the problem of BN Structure Learning (BNSL). While algorithms of the latter category are also generally approximate, they can be adjusted to offer exact learning solutions that are guaranteed to return a graph whose score is not lower than the global maximum score. This paper focuses on this latter subcategory of score-based learning. Algorithms such as Integer Linear Programming (ILP) [17,18] explore local networks in the form of Candidate Parent Sets (CPSs), usually up to a bounded maximum in-degree, and offer an exact solution.
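To make the CPS formulation concrete, the sketch below enumerates candidate parent sets for one variable up to a bounded maximum in-degree and then prunes them to the top-scoring few. The scoring function here is a placeholder (real algorithms use decomposable scores such as BDeu or BIC), and the function names and the keep-top-k pruning rule are illustrative assumptions rather than the paper's exact method.

```python
from itertools import combinations

def candidate_parent_sets(variables, child, max_in_degree):
    """Enumerate all candidate parent sets (CPSs) for `child`,
    from the empty set up to a bounded maximum in-degree."""
    others = [v for v in variables if v != child]
    for k in range(max_in_degree + 1):
        for parents in combinations(others, k):
            yield frozenset(parents)

def prune_cps(cps_scores, keep):
    """Illustrative pruning rule: retain only the `keep`
    highest-scoring candidate parent sets for a variable."""
    ranked = sorted(cps_scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:keep])

# Toy run: 4 variables, max in-degree 2, and a placeholder
# score that simply penalises larger parent sets.
variables = ["A", "B", "C", "D"]
scores = {ps: -len(ps) for ps in candidate_parent_sets(variables, "A", 2)}
pruned = prune_cps(scores, keep=3)
```

With 4 variables and a maximum in-degree of 2, variable `A` has 1 + 3 + 3 = 7 candidate parent sets; pruning then hands the optimiser (e.g. an ILP solver) a much smaller combinatorial space, which is the trade-off between speed and accuracy the paper studies.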

Problem Statement and Methodology
Results
Pruning Legal CPSs of Moderate Complexity
Pruning Legal CPSs of High Complexity
Pruning Legal CPSs of Very High Complexity
Conclusions