Abstract

Causal discovery from observational data is a fundamental problem. Many algorithms have been proposed for this purpose over the years, but they usually handle data of a single type: either continuous variables only or discrete variables only. Recently, a few causal structure discovery algorithms have been developed for mixed data types and have found many applications. In this paper, we propose a structural equation model for mixed data types that allows the causal mechanisms to be nonlinear and can consequently model many real-world situations. We prove that, under certain conditions, the causal structure is identifiable from the data distribution generated by the model. Moreover, we propose a maximum likelihood estimator and develop an efficient order search algorithm that benefits from a novel order-space-cutting method and can handle several hundred variables. After order learning, we apply kernel-based variable selection with automatic relevance determination to recover the causal structure. Experiments on synthetic datasets demonstrate the accuracy and scalability of our approach. In particular, we apply our method to publicly available cause-effect pairs and show its superiority in identifying the causal direction of mixed causal pairs. In addition, we show that our method can sensibly recover causal relationships on a publicly available real dataset and a private real-world dataset.
