Parallel structural learning of Bayesian networks: Iterative divide and conquer algorithm based on structural fusion

Jorge D Laborda, José A Gámez, Pablo Torrijos, José M Puerta

doi:10.1016/j.knosys.2024.111840

Abstract

Learning Bayesian Networks (BNs) from high-dimensional data is a complex and time-consuming task. Although the literature includes approaches based on horizontal (instances) or vertical (variables) partitioning, none can guarantee the same theoretical properties as the Greedy Equivalence Search (GES) algorithm, except those based on the GES algorithm itself. This paper proposes a distributed BN learning algorithm that uses GES as the local learning algorithm, ensuring the same theoretical properties as GES but requiring less CPU time. The two main novelties of the proposed method are (1) the distribution of the set of possible edges among the local learning processes, each of which is constrained to use only its own local edge set; and (2) the use of BN fusion to aggregate the networks learned under these local edge-set constraints. The algorithm is iterative: at each step, every local learning process uses the last aggregated network as its starting point. A comprehensive experimental evaluation shows that the proposed algorithm (pGES) obtains networks of equal or better quality than GES in less computational time. This improvement is especially noticeable in high-dimensional BNs.
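The iterative scheme described in the abstract (partition the candidate edges, learn locally under each edge subset, fuse, and restart from the fused network) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`partition_edges`, `local_learn`, `fuse`, `iterate`) are hypothetical, the local learner is a toy greedy edge-adder rather than GES, the fusion step is a plain acyclic union of edges rather than the BN fusion method the authors use, and `score` is any user-supplied function on edge sets.

```python
from itertools import permutations

def partition_edges(nodes, k):
    """Split all candidate directed edges into k disjoint subsets (round-robin)."""
    edges = list(permutations(nodes, 2))
    return [set(edges[i::k]) for i in range(k)]

def is_acyclic(nodes, edges):
    """Kahn-style check: True if the directed graph has no cycle."""
    indeg = {n: 0 for n in nodes}
    for _, v in edges:
        indeg[v] += 1
    queue = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for a, b in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    queue.append(b)
    return seen == len(nodes)

def local_learn(nodes, start, allowed, score):
    """Toy stand-in for a constrained local learner (NOT GES): starting from
    `start`, greedily add edges from `allowed` that strictly raise the score."""
    graph = set(start)
    improved = True
    while improved:
        improved = False
        for e in sorted(allowed - graph):
            cand = graph | {e}
            if is_acyclic(nodes, cand) and score(cand) > score(graph):
                graph, improved = cand, True
    return graph

def fuse(nodes, graphs):
    """Simplified fusion (an assumption): union of edges, skipping any edge
    that would introduce a cycle."""
    fused = set()
    for g in graphs:
        for e in sorted(g):
            if is_acyclic(nodes, fused | {e}):
                fused.add(e)
    return fused

def iterate(nodes, k, score, max_rounds=10):
    """Repeat local learning and fusion until the fused score stops improving."""
    parts = partition_edges(nodes, k)
    current = set()
    for _ in range(max_rounds):
        local_graphs = [local_learn(nodes, current, p, score) for p in parts]
        fused = fuse(nodes, local_graphs)
        if score(fused) <= score(current):
            break
        current = fused
    return current
```

In this sketch the local calls run sequentially; in a distributed setting each `local_learn` call would run as its own process, which is possible because each call touches only its own edge subset and the shared starting network.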

Full Text