Abstract
Learning the structure of Bayesian networks (BNs) from high-dimensional discrete data is now common but remains challenging, due to the large parameter space, the acyclicity constraint placed on the graphical structure, and the difficulty of searching for a sparse structure. In this article, we propose a sparse structure learning algorithm (SSLA) to solve this problem. The algorithm uses the negative log-likelihood of multi-logit regression as the loss function, adds the adaptive group lasso as a penalty term to induce sparsity, and adds a further penalty term to ensure that the learned graph is a directed acyclic graph (DAG). A block coordinate descent (BCD) algorithm combined with the alternating direction method of multipliers (ADMM) is developed to solve the proposed model. The learned graph is proved theoretically to be a Bayesian network. To evaluate the proposed SSLA and compare it with its competitors, we conducted intensive simulation studies and applied the algorithms to benchmark Bayesian networks. The results indicate that the SSLA is superior to the hill climbing (HC) algorithm, the CD algorithm and the BFO-B algorithm, and is competitive with the K2 algorithm when the order of the nodes is given.
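The penalized objective described above can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows, for a single node's regression, the negative log-likelihood of a multi-logit model plus a weighted (adaptive) group lasso penalty, with hypothetical helper names and groups formed by candidate parent variables; the paper's acyclicity penalty and the BCD/ADMM solver are omitted.

```python
import numpy as np

def multilogit_nll(B, X_dummy, y_onehot):
    """Negative log-likelihood of a multi-logit regression.
    B: (p, K) coefficient matrix; X_dummy: (n, p) dummy-coded
    predictors; y_onehot: (n, K) one-hot response."""
    logits = X_dummy @ B                         # (n, K) linear scores
    logZ = np.log(np.exp(logits).sum(axis=1))    # per-sample log-partition
    return -(np.sum(y_onehot * logits) - logZ.sum())

def group_lasso_penalty(B, groups, weights):
    """Adaptive group lasso: weighted sum of group-wise L2 norms.
    groups: list of row-index arrays, one per candidate parent;
    weights: adaptive weights, one per group."""
    return sum(w * np.linalg.norm(B[g, :]) for g, w in zip(groups, weights))

def objective(B, X_dummy, y_onehot, groups, weights, lam):
    """Loss for one node: NLL plus lam times the sparsity penalty
    (the DAG penalty from the paper is not sketched here)."""
    return (multilogit_nll(B, X_dummy, y_onehot)
            + lam * group_lasso_penalty(B, groups, weights))
```

Setting a group's coefficient rows exactly to zero removes the corresponding candidate parent from the node's regression, which is how the group lasso induces a sparse graph.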
Highlights
A Bayesian network, also known as a belief network, is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG)
We prove theoretically the existence of a tuning parameter that ensures the learned Bayesian network is a DAG
Our simulation studies and experiments on moderate to large networks with different sample sizes show that the sparse structure learning algorithm (SSLA) outperforms the hill climbing (HC) algorithm, the CD algorithm and the BFO-B algorithm
Summary
A Bayesian network, also known as a belief network, is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). The hybrid Bayesian network structure learning approach [23], [24] attempts to combine the advantages of constraint-based and score-based algorithms: it employs conditional independence tests to narrow the search space, then uses a score-based method to learn the DAG structure. Score-based learning is generally considered to yield the best sparse Bayesian network structure for high-dimensional data. The CD algorithm [34] is a relatively new method for learning discrete Bayesian networks, which estimates the structure of graphical models by appending the group lasso [35] penalty term to the likelihood. For a large number of categorical variables, dummy coding produces very high dimensionality and increases computational complexity. It is non-trivial to address these issues with multivariate logistic regression for discrete high-dimensional data; in particular, new algorithms have to be developed to accommodate these difficulties.
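The dimensionality blow-up from dummy coding can be made concrete with a small sketch. This is an illustration, not the paper's code: `dummy_code` is a hypothetical helper using reference-cell coding (level 0 as baseline), under which a variable with m levels expands to m - 1 columns.

```python
import numpy as np

def dummy_code(x, n_levels):
    """Reference-cell dummy coding: level 0 is the baseline,
    so a variable with n_levels categories yields n_levels - 1 columns."""
    D = np.zeros((len(x), n_levels - 1))
    for k in range(1, n_levels):
        D[:, k - 1] = (x == k)  # indicator column for level k
    return D

# One node with 30 candidate parents, each with 5 categories,
# already requires 30 * (5 - 1) = 120 dummy predictors per regression.
n_parents, n_levels = 30, 5
print(n_parents * (n_levels - 1))  # 120
```

Since every one of the p nodes needs such a regression on the other p - 1 variables, the total number of coefficients grows quickly with both p and the number of categories, which is the computational burden the text refers to.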