Abstract

While tests for pairwise conditional independence of random variables have been devised, testing joint conditional independence of several random variables remains a challenge in general. Restriction to categorical random variables implies in particular that their joint distribution may initially be thought of as a contingency table, and then in terms of a log-linear model. Thus, the Hammersley–Clifford theorem applies and provides insight into the factorization of the log-linear model corresponding to assumptions of independence or conditional independence. Such assumptions simplify the full joint log-linear model and, in turn, any conditional distribution. If the joint log-linear model corresponding to the assumption of joint conditional independence given the conditioning variable is not sufficiently large to explain some data according to a standard log-likelihood test, its null hypothesis of joint conditional independence may be rejected at some significance level. Enlarging the log-linear model by product terms of variables and running the log-likelihood test on different models may provide insight into which variables lack conditional independence. Since the joint distribution determines any conditional distribution, the series of tests eventually provides insight into which variables and product terms a proper logistic regression model should comprise.
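
A minimal sketch of this testing strategy is given below, assuming Python with pandas, statsmodels, and scipy; the three binary variables Z0, Z1, Z2 and the toy cell counts are invented for illustration and this is not the paper's own code. The sketch fits a hierarchical log-linear model that omits the Z1:Z2 product term (which, by the Hammersley–Clifford factorization, encodes conditional independence of Z1 and Z2 given Z0) as a Poisson regression on contingency-table counts, compares it with the saturated model via the log-likelihood ratio statistic, and evaluates the asymptotic chi-square p-value.

    # Minimal sketch (not the paper's code): log-likelihood ratio test of
    # conditional independence of Z1 and Z2 given Z0, via hierarchical
    # log-linear models fit as Poisson regressions on cell counts.
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf
    from scipy.stats import chi2

    # Toy 2x2x2 contingency table; the counts are made up for illustration.
    cells = [(z0, z1, z2) for z0 in (0, 1) for z1 in (0, 1) for z2 in (0, 1)]
    df = pd.DataFrame(cells, columns=["Z0", "Z1", "Z2"])
    df["n"] = [40, 12, 35, 10, 8, 25, 9, 30]

    # Null model: main effects plus only the two-way product terms with Z0;
    # omitting the Z1:Z2 term encodes Z1, Z2 conditionally independent given Z0.
    null_fit = smf.glm(
        "n ~ C(Z0) + C(Z1) + C(Z2) + C(Z0):C(Z1) + C(Z0):C(Z2)",
        data=df, family=sm.families.Poisson()).fit()

    # Alternative: the saturated model reproduces every cell count exactly.
    sat_fit = smf.glm("n ~ C(Z0) * C(Z1) * C(Z2)",
                      data=df, family=sm.families.Poisson()).fit()

    # Log-likelihood ratio statistic and its asymptotic chi-square p-value.
    g2 = 2.0 * (sat_fit.llf - null_fit.llf)
    dof = int(null_fit.df_resid - sat_fit.df_resid)
    p_value = chi2.sf(g2, dof)
    print(f"G^2 = {g2:.3f}, df = {dof}, p-value = {p_value:.4f}")
    # A small p-value rejects the null hypothesis of conditional independence
    # of Z1 and Z2 given Z0 at the chosen significance level.

Enlarging the null formula by further product terms (for example, adding C(Z1):C(Z2)) and repeating the comparison mirrors the model-enlargement step described in the abstract for locating which variables lack conditional independence.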

Highlights

  • Let Z be a random vector of categorical random variables Zl, l = 0, ... , m, i.e., Z = (Z0, Z1, ... , Zm)^T

  • It is completely characterized by its distribution pκ = PZ(sκ) = P(Z = sκ) = P((Z0, ... , Zm) = (sk0, ... , skm)) with the multi-index κ = (k0, ... , km), where skl with kl = 1, ... , Kl denotes all possible categories of the categorical random variable Zl, l = 0, ... , m

  • If the random variables Zl, l = 1, ... , m, are conditionally independent given Z0, the joint conditional probability of any subset of the random variables Zl given Z0 can be factorized into the product of the individual conditional probabilities, i.e., P(Zl = skl, l ∈ L | Z0 = sk0) = ∏l∈L P(Zl = skl | Z0 = sk0) for any L ⊆ {1, ... , m}, and in particular P(Z1 = sk1, ... , Zm = skm | Z0 = sk0) = ∏l=1,...,m P(Zl = skl | Z0 = sk0); see the numerical sketch after this list
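
As a complement to the last highlight, the following sketch, assuming Python with NumPy (the category counts and Dirichlet-sampled probabilities are arbitrary illustrative choices, not data from the paper), constructs a joint distribution in which Z1 and Z2 are conditionally independent given Z0 and verifies numerically that the conditional joint probability equals the product of the individual conditional probabilities.

    # Illustrative sketch (not from the paper): verify the factorization of the
    # conditional joint distribution under conditional independence given Z0.
    import numpy as np

    rng = np.random.default_rng(0)
    K0, K1, K2 = 2, 3, 2                                 # numbers of categories
    p_z0 = rng.dirichlet(np.ones(K0))                    # P(Z0 = sk0)
    p_z1_given_z0 = rng.dirichlet(np.ones(K1), size=K0)  # rows: P(Z1 | Z0 = sk0)
    p_z2_given_z0 = rng.dirichlet(np.ones(K2), size=K0)  # rows: P(Z2 | Z0 = sk0)

    # Joint pmf p[k0, k1, k2] built from the conditional-independence factorization.
    p = (p_z0[:, None, None]
         * p_z1_given_z0[:, :, None]
         * p_z2_given_z0[:, None, :])

    # Conditional joint P(Z1, Z2 | Z0) versus the product of individual conditionals.
    cond_joint = p / p.sum(axis=(1, 2), keepdims=True)
    cond_product = p_z1_given_z0[:, :, None] * p_z2_given_z0[:, None, :]
    assert np.allclose(cond_joint, cond_product)         # factorization holds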


Summary

Introduction

Conditional independence is a probabilistic approach to causality (Suppes 1970; Dawid 1979, 2004, 2007; Spohn 1980, 1994; Pearl 2009; Chalak and White 2012), whereas correlation, for instance, is not, since it is a symmetric relationship.

The full paper proceeds through the following sections:

  • From Contingency Tables to Log-Linear Models
  • Independence, Conditional Independence of Random Variables
  • Logistic Regression, and Its Special Case of Weights-of-Evidence
  • Hammersley–Clifford Theorem
  • Testing Joint Conditional Independence of Categorical Random Variables
  • Conditional Distribution, Logistic Regression
  • Practical Applications
  • The Data Set BRY
  • The Data Set SCCI
  • Discussion and Conclusions