Abstract
The Hilbert-Schmidt Independence Criterion (HSIC), based on kernel functions, is capable of detecting nonlinear dependencies between variables, making it a common tool for mining association relationships. However, with small samples, high dimensions, or noisy data, it may produce spurious associations, assigning nonzero scores to pairs of unrelated variables. To address this issue, we propose a novel criterion, named the Pure Hilbert-Schmidt Independence Criterion (PHSIC). PHSIC is obtained by subtracting the mean HSIC computed under random conditions from the original HSIC value. We demonstrate three significant advantages of PHSIC through theoretical analysis and simulation experiments: (1) PHSIC has a baseline of zero, which enhances the interpretability of HSIC. (2) Compared to HSIC, PHSIC exhibits lower bias. (3) PHSIC enables fairer comparisons across different sample sizes and dimensions. To validate the effectiveness of PHSIC, we apply it to multiple causal inference tasks to measure the independence between the cause and the residual. Experimental results show that the causal model based on PHSIC performs well compared to other methods in scenarios with small sample sizes and noisy data, on both real and simulated datasets.
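The abstract does not specify how the "mean HSIC under random conditions" is estimated, so the following is only a minimal sketch of the idea it describes, assuming a permutation-based baseline: compute the observed HSIC, average HSIC over randomly permuted pairings of the samples, and subtract. The function names (`rbf_kernel`, `hsic`, `phsic`) and the RBF kernel choice are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    # Gaussian (RBF) kernel matrix from pairwise squared Euclidean distances.
    sq = np.sum(X**2, axis=1)
    dists = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-dists / (2.0 * sigma**2))

def hsic(X, Y, sigma=1.0):
    # Biased empirical HSIC estimator: (1/n^2) * trace(K H L H),
    # where H = I - (1/n) * 11^T centers the kernel matrices.
    n = X.shape[0]
    K, L = rbf_kernel(X, sigma), rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n**2

def phsic(X, Y, sigma=1.0, n_perm=100, seed=0):
    # Sketch of the PHSIC idea: observed HSIC minus the mean HSIC obtained
    # when the pairing between X and Y is randomized (permutation baseline).
    rng = np.random.default_rng(seed)
    baseline = np.mean([
        hsic(X, Y[rng.permutation(len(Y))], sigma) for _ in range(n_perm)
    ])
    return hsic(X, Y, sigma) - baseline
```

Under this reading, independent variables yield a PHSIC near zero (the stated zero baseline), while dependent variables retain a positive score after the baseline is removed.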