Abstract

This paper builds on recent developments in Bayesian network (BN) structure learning under the controversial assumption that the input variables are dependent. This assumption can be viewed as a learning constraint geared towards cases where the input variables are known or assumed to be dependent. It addresses the problem of learning multiple disjoint subgraphs that do not enable full propagation of evidence. This problem is highly prevalent in cases where the sample size of the input data is low with respect to the dimensionality of the model, which is often the case when working with real data. The paper presents a novel hybrid structure learning algorithm, called SaiyanH, that addresses this issue. The results show that this constraint helps the algorithm to estimate the number of true edges with higher accuracy compared to the state-of-the-art. Out of the 13 algorithms investigated, the results rank SaiyanH 4th in reconstructing the true DAG, with accuracy scores lower by 8.1% (F1), 10.2% (BSF), and 19.5% (SHD) compared to the top ranked algorithm, and higher by 75.5% (F1), 118% (BSF), and 4.3% (SHD) compared to the bottom ranked algorithm. Overall, the results suggest that the proposed algorithm discovers satisfactorily accurate connected DAGs in cases where other algorithms produce multiple disjoint subgraphs that often underfit the true graph.

Highlights

  • A Bayesian Network (BN) is a type of probabilistic graphical model introduced by Pearl [1] [2]

  • If we assume that the arcs between nodes represent causation, the BN is viewed as a Causal Bayesian Network (CBN)

  • A CBN can only be represented by a unique Directed Acyclic Graph (DAG), whereas a BN that is not viewed as a causal model can be represented by a Completed Partial Directed Acyclic Graph (CPDAG)


Summary

INTRODUCTION

A Bayesian Network (BN) is a type of probabilistic graphical model introduced by Pearl [1] [2]. If we assume that the arcs between nodes represent causation, the BN is viewed as a Causal Bayesian Network (CBN). BNs have emerged as one of the most successful approaches for reasoning under uncertainty. This is partly because they enable decision makers to reason with transparent causal assumptions that offer solutions which go beyond prediction. A CBN enables decision makers to reason about interventions and counterfactuals. On this basis, the focus of this paper is on the reconstruction of the true causal DAG, as opposed to the reconstruction of a graph that forms part of the equivalence class of the true DAG (i.e., a CPDAG). The problem of structure learning is considerably more challenging than that of parameter learning. This is because searching for the optimal graph is an NP-hard problem where some instances are much harder than others [3].
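The disjoint-subgraph problem described in the abstract can be diagnosed mechanically: evidence cannot propagate between nodes that lie in different weakly connected components of the learned structure. The sketch below (not the SaiyanH implementation; the graph and node names are hypothetical) counts those components for a learned DAG given as an edge list.

```python
def weakly_connected_components(edges, nodes):
    """Count weakly connected components of a directed graph.

    Evidence propagation only requires the undirected skeleton to be
    connected, so we ignore edge direction here.
    """
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v] - seen)
    return components

# Hypothetical learned structure: two disjoint subgraphs.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("D", "E")]
print(weakly_connected_components(edges, nodes))  # 2
```

A result greater than 1 indicates that the learned graph cannot support full propagation of evidence, which is the situation the paper's dependency constraint is designed to avoid.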

Constantinou
  • THE ALGORITHM
  • Phase 1
  • Phase 2
  • Phase 3
  • Computational complexity
  • Scoring metrics
  • Case studies
  • Structure learning algorithms considered
  • Accuracy of the learned graphs
  • Analysis of the edges and independent subgraphs
  • Pathfinder
  • Time complexity
  • CONCLUDING REMARKS AND FUTURE WORK
