<p>Local discovery plays an important role in Bayesian networks (BNs), mainly addressing PC (parents and children) discovery and MB (Markov boundary) discovery. In this paper, we considered the problem of large local discovery. First, we focused on the assumption about conditional independence (CI) tests: we explained why it was unreasonable to assume that all CI tests were reliable in large local discovery, studied how the power and reliability of CI tests changed with the data size and the number of degrees of freedom, and then modified the assumption to make it more reasonable. Second, we concentrated on improving local discovery algorithms: we posed the problem of premature termination of the forward search, analyzed why it arose frequently in large local discovery under existing local discovery algorithms, proposed an idea called information connection (IC) for preventing premature termination of the forward search, and used IC to build a novel algorithm called ICPC, whose theoretical basis was presented in detail. In addition, a more stable incremental algorithm was proposed as the subroutine of ICPC. Third, the way of breaking ties among equal associations was considered and optimized. Finally, we conducted a benchmarking study on six synthetic BNs from various domains. The experimental results demonstrated the applicability and superiority of ICPC in solving the problem of premature termination of the forward search that arose frequently in large local discovery.</p>