Abstract

The Markov blanket (MB) is a key concept in Bayesian networks (BNs) and is theoretically the optimal solution to the feature selection problem. Methods based on conditional independence (CI) tests are prevalent for MB discovery, and the main challenge at present is to improve both their efficiency and their effectiveness. In this paper, we propose a novel divide-and-conquer discovery algorithm, loose-to-strict MB (LSMB), to discover MBs faster while maintaining high accuracy. LSMB proceeds in three steps. First, it discovers the approximate parent–child (PC) and spouse sets of a target variable via the loose CI test strategy, which places a constraint on the condition set of the CI tests. Second, using strict CI tests, i.e., tests with no constraint on the size of the condition set, it removes non-MB nodes from the approximate PC set and then categorizes and removes non-MB nodes from the approximate spouse set. Finally, it combines the resulting sets to obtain the MB of the target. Experiments on benchmark BN datasets show that LSMB improves the efficiency of MB discovery while achieving higher accuracy than state-of-the-art MB discovery algorithms, and experiments on real-world datasets demonstrate the excellent performance of LSMB in feature selection.
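
To make the loose versus strict distinction concrete, the following is a minimal Python sketch, not the LSMB algorithm itself. It assumes continuous, roughly Gaussian data and uses a Fisher z-test on partial correlations as the CI test; the abstract does not say which test LSMB uses. All names here (`is_independent`, `find_separating_set`, `make_chain_data`, the `max_size` parameter) are hypothetical and introduced only for illustration. A loose search is modelled as capping the size of the condition set, while a strict search leaves the size unconstrained.

```python
import itertools
import math
import random
from statistics import NormalDist


def corr(data, i, j):
    """Pearson correlation between columns i and j of a list-of-rows dataset."""
    n = len(data)
    mi = sum(row[i] for row in data) / n
    mj = sum(row[j] for row in data) / n
    sij = sum((row[i] - mi) * (row[j] - mj) for row in data)
    sii = sum((row[i] - mi) ** 2 for row in data)
    sjj = sum((row[j] - mj) ** 2 for row in data)
    return sij / math.sqrt(sii * sjj)


def partial_corr(data, i, j, cond):
    """Partial correlation of i and j given cond, via the standard recursion."""
    if not cond:
        return corr(data, i, j)
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(data, i, j, rest)
    r_ik = partial_corr(data, i, k, rest)
    r_jk = partial_corr(data, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def is_independent(data, i, j, cond, alpha=0.01):
    """Fisher z-test (an assumed CI test): True if i and j look independent given cond."""
    r = max(min(partial_corr(data, i, j, tuple(cond)), 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(len(data) - len(cond) - 3) * abs(z)
    p_value = 2 * (1 - NormalDist().cdf(stat))
    return p_value > alpha


def find_separating_set(data, x, y, candidates, max_size=None, alpha=0.01):
    """Return a subset of candidates that renders x and y independent, or None.

    max_size=k caps the condition set size (a "loose" search in this sketch);
    max_size=None allows condition sets of any size (a "strict" search).
    """
    pool = [c for c in candidates if c not in (x, y)]
    limit = len(pool) if max_size is None else min(max_size, len(pool))
    for size in range(limit + 1):
        for cond in itertools.combinations(pool, size):
            if is_independent(data, x, y, cond, alpha):
                return cond
    return None


def make_chain_data(n, seed=0):
    """Toy data: X -> A, X -> B, A -> T, B -> T. Columns are [T, A, B, X]."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        a = x + rng.gauss(0, 1)
        b = x + rng.gauss(0, 1)
        t = a + b + rng.gauss(0, 1)
        rows.append([t, a, b, x])
    return rows


if __name__ == "__main__":
    data = make_chain_data(2000)
    # Loose: condition sets of size <= 1 drawn from {A, B}.
    print("loose :", find_separating_set(data, 3, 0, [1, 2], max_size=1))
    # Strict: condition sets of any size drawn from {A, B}.
    print("strict:", find_separating_set(data, 3, 0, [1, 2], max_size=None))
```

In this toy network, X is not in the MB of T, but it is separated from T only by conditioning on both A and B. A size-capped search would therefore typically keep X as a candidate, and an unconstrained search would be expected to discard it. This mirrors the idea of building an approximate set cheaply and pruning it afterwards, under the assumptions stated above.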

