Abstract

Local causal structure learning aims to discover and distinguish the direct causes and direct effects of a target variable. However, state-of-the-art algorithms for local causal structure learning perform poorly on data with missing values. The usual approach is to fill in the missing values with an imputation technique before learning the local causal structure, but this approach suffers from low accuracy, low efficiency, and instability. To address these issues, we propose misLCS, a novel method for local causal structure learning with missing data. First, we design an iterative data imputation method to obtain complete and correct data from the incomplete data. Second, misLCS adopts a data subset strategy to extract a data subset whose variables are closely related to the target variable. Third, within this data subset, misLCS constructs the local causal skeleton of the target variable using a mutual information-based feature selection method and orients the edges using conditional independence tests and Meek rules. Finally, misLCS updates the missing values in preparation for the next iteration. This procedure repeats until the direct causes and direct effects of the target variable have been identified. Experiments on seven benchmark Bayesian networks and a real-world bioinformatics dataset, ranging from 11 to 801 variables, demonstrate that misLCS achieves better accuracy than existing local causal structure learning algorithms.
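To make the data subset step concrete, the following minimal sketch ranks variables by their empirical mutual information with the target and keeps the top k. It is an illustration under stated assumptions, not the misLCS implementation: the function names `mutual_information` and `select_subset`, the toy data, and the choice to skip missing entries pairwise (rather than impute them iteratively) are all hypothetical simplifications.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences.

    Pairs in which either value is None (missing) are skipped. This is plain
    available-case analysis, an assumption of this sketch, and not the
    iterative imputation described in the abstract.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def select_subset(data, target, k):
    """Return the names of the k variables with the highest MI with the target."""
    scores = {
        name: mutual_information(column, data[target])
        for name, column in data.items()
        if name != target
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: T is the target, None marks a missing entry.
data = {
    "T": [0, 0, 1, 1, 0, 1, 0, 1],
    "A": [0, 0, 1, 1, 0, 1, None, 1],  # matches T wherever it is observed
    "B": [0, 1, 0, 1, 0, 1, 0, 1],     # weakly related to T
    "C": [1, 1, 1, 1, 1, 1, 1, 1],     # constant, carries no information
}

print(select_subset(data, "T", 2))
```

Restricting later conditional independence tests to such a subset reduces the number of variables that must be tested against the target; how misLCS itself scores and thresholds variables is not specified in this abstract.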
