Abstract

Most real-world problems require dealing with incomplete data. The structural expectation-maximization algorithm is the most common approach to learning Bayesian networks from incomplete datasets. However, its main limitation is its demanding computational cost, caused mainly by the need to perform inference at each iteration of the algorithm. In this paper, we propose a new method that guarantees the efficiency of the learning process while improving the performance of the structural expectation-maximization algorithm. We address the first objective by bounding the treewidth of the models to limit the complexity of the inference, using an efficient heuristic to search the space of elimination orders. For the second objective, we study the advantages of computing the score directly with respect to the observed data rather than an expectation of the score, and provide a strategy to perform these computations efficiently in the proposed method. We perform exhaustive experiments on synthetic and real-world datasets of varied dimensionality, including datasets with thousands of variables and hundreds of thousands of instances. The experimental results support our claims empirically.
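
The treewidth bound mentioned in the abstract is typically obtained by searching over elimination orders of the moralized network, since the width of any elimination order upper-bounds the treewidth. The sketch below shows one common elimination-order heuristic (greedy min-fill) for computing such an upper bound; it is an illustrative assumption, not necessarily the heuristic used in the paper, and the graph representation (a dict of neighbour sets) and the name `min_fill_order` are chosen here for exposition only.

```python
from itertools import combinations

def min_fill_order(adjacency):
    """Greedy min-fill elimination order on an undirected (moral) graph.

    adjacency: dict mapping each node to the set of its neighbours.
    Returns (order, width), where width is the largest number of
    neighbours a node has at the moment it is eliminated -- an upper
    bound on the treewidth of the graph.
    """
    adj = {v: set(ns) for v, ns in adjacency.items()}
    order, width = [], 0
    while adj:
        # Pick the node whose elimination adds the fewest fill-in edges.
        def fill_cost(v):
            nbrs = adj[v]
            return sum(1 for a, b in combinations(nbrs, 2) if b not in adj[a])
        v = min(adj, key=fill_cost)
        nbrs = adj[v]
        width = max(width, len(nbrs))
        # Connect all neighbours of v (fill-in edges), then remove v.
        for a, b in combinations(nbrs, 2):
            adj[a].add(b)
            adj[b].add(a)
        for n in nbrs:
            adj[n].discard(v)
        del adj[v]
        order.append(v)
    return order, width

# Example (hypothetical): a 4-cycle has treewidth 2, and min-fill recovers width 2.
graph = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
print(min_fill_order(graph))
```

In a bounded-treewidth learning scheme, a candidate structure whose best elimination order found by such a heuristic exceeds the chosen bound would be rejected, keeping inference at each iteration tractable.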
