Abstract
Learning the structure of Bayesian networks (BNs) from high-dimensional discrete data is now common but remains challenging, due to the large parameter space, the acyclicity constraint placed on the graphical structure, and the difficulty of searching for a sparse structure. In this article, we propose a sparse structure learning algorithm (SSLA) to solve this problem. The algorithm uses the negative log-likelihood of multi-logit regression as the loss function, adds the adaptive group lasso as a penalty term to induce sparsity, and adds a further penalty term to ensure that the learned graph is a directed acyclic graph (DAG). A block coordinate descent (BCD) algorithm combined with the alternating direction method of multipliers (ADMM) is developed to solve the proposed model. The learned graph is proved theoretically to be a Bayesian network. To evaluate the proposed SSLA and compare it with its competitors, we conducted intensive simulation studies and applied the algorithms to benchmark Bayesian networks. The results indicate that the SSLA is superior to the hill climbing (HC) algorithm, the CD algorithm and the BFO-B algorithm, and is competitive with the K2 algorithm when the order of the nodes is given.
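The penalized objective described above can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows, for a single node's regression, the negative log-likelihood of a multi-logit model plus a weighted (adaptive) group lasso penalty, with hypothetical helper names and groups formed by candidate parent variables; the paper's acyclicity penalty and the BCD/ADMM solver are omitted.

```python
import numpy as np

def multilogit_nll(B, X_dummy, y_onehot):
    """Negative log-likelihood of a multi-logit regression.
    B: (p, K) coefficient matrix; X_dummy: (n, p) dummy-coded
    predictors; y_onehot: (n, K) one-hot response."""
    logits = X_dummy @ B                         # (n, K) linear scores
    logZ = np.log(np.exp(logits).sum(axis=1))    # per-sample log-partition
    return -(np.sum(y_onehot * logits) - logZ.sum())

def group_lasso_penalty(B, groups, weights):
    """Adaptive group lasso: weighted sum of group-wise L2 norms.
    groups: list of row-index arrays, one per candidate parent;
    weights: adaptive weights, one per group."""
    return sum(w * np.linalg.norm(B[g, :]) for g, w in zip(groups, weights))

def objective(B, X_dummy, y_onehot, groups, weights, lam):
    """Loss for one node: NLL plus lam times the sparsity penalty
    (the DAG penalty from the paper is not sketched here)."""
    return (multilogit_nll(B, X_dummy, y_onehot)
            + lam * group_lasso_penalty(B, groups, weights))
```

Setting a group's coefficient rows exactly to zero removes the corresponding candidate parent from the node's regression, which is how the group lasso induces a sparse graph.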
Highlights
A Bayesian network, also known as a belief network, is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG)
We prove theoretically the existence of a tuning parameter that ensures the learned Bayesian network is a DAG
Our simulation studies and experiments on moderate to large networks with different sample sizes show that the sparse structure learning algorithm (SSLA) outperforms the hill climbing (HC) algorithm, the CD algorithm and the BFO-B algorithm
Summary
A Bayesian network, also known as a belief network, is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). The hybrid Bayesian network structure learning approach [23], [24] attempts to combine the advantages of constraint-based and score-based algorithms: it employs conditional independence tests to narrow the search space, then uses a score-based method to learn the DAG structure. Score-based learning is generally considered to yield the best sparse Bayesian network structure for high-dimensional data. The CD algorithm [34] is a relatively new method for learning discrete Bayesian networks, which estimates the structure of graphical models by appending the group lasso [35] penalty term to the likelihood. For a large number of categorical variables, dummy coding produces very high dimensionality and increases computational complexity. It is non-trivial to address these issues with multivariate logistic regression for discrete high-dimensional data; in particular, new algorithms have to be developed to accommodate these difficulties.
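The dimensionality blow-up from dummy coding can be made concrete with a small sketch. This is an illustration, not the paper's code: `dummy_code` is a hypothetical helper using reference-cell coding (level 0 as baseline), under which a variable with m levels expands to m - 1 columns.

```python
import numpy as np

def dummy_code(x, n_levels):
    """Reference-cell dummy coding: level 0 is the baseline,
    so a variable with n_levels categories yields n_levels - 1 columns."""
    D = np.zeros((len(x), n_levels - 1))
    for k in range(1, n_levels):
        D[:, k - 1] = (x == k)  # indicator column for level k
    return D

# One node with 30 candidate parents, each with 5 categories,
# already requires 30 * (5 - 1) = 120 dummy predictors per regression.
n_parents, n_levels = 30, 5
print(n_parents * (n_levels - 1))  # 120
```

Since every one of the p nodes needs such a regression on the other p - 1 variables, the total number of coefficients grows quickly with both p and the number of categories, which is the computational burden the text refers to.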