Abstract
Discovering a correlation from one variable to another variable is of fundamental scientific and practical interest. While existing correlation measures are suitable for discovering average correlation, they fail to discover hidden or potential correlations. To bridge this gap, (i) we postulate a set of natural axioms that we expect a measure of potential correlation to satisfy; (ii) we show that the rate of information bottleneck, i.e., the hypercontractivity coefficient, satisfies all the proposed axioms; (iii) we provide a novel estimator to estimate the hypercontractivity coefficient from samples; and (iv) we provide numerical experiments demonstrating that this proposed estimator discovers potential correlations among various indicators of WHO datasets, is robust in discovering gene interactions from gene expression time series data, and is statistically more powerful than the estimators for other correlation measures in binary hypothesis testing of canonical examples of potential correlations.
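For orientation, the "rate of information bottleneck" named above is usually written as a ratio of mutual informations, maximized over an auxiliary variable. The display below is a sketch of that standard definition. The symbol U and the Markov-chain notation U - X - Y are assumptions introduced here for illustration and do not appear in the text above.

```latex
s(X;Y) \;=\; \sup_{U \,:\, U - X - Y} \frac{I(U;Y)}{I(U;X)}
```

Here the supremum ranges over auxiliary variables U that depend on Y only through X and satisfy I(U;X) > 0. The quantity lies between zero and one by the data processing inequality, and it is not symmetric in general: s(X;Y) need not equal s(Y;X), which matches the directional phrasing "from one variable to another".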
Highlights
Measuring the strength of an association between two random variables is a fundamental topic of broad scientific interest.
We provide a novel interpretation of the hypercontractivity coefficient as a measure of potential correlation by demonstrating that it satisfies a natural set of axioms that such a measure is expected to obey.
We show applications of our estimator of the hypercontractivity coefficient to two important datasets. In Section 4.2, we demonstrate that it discovers hidden potential correlations among various national indicators in World Health Organization (WHO) datasets, including how aid is potentially correlated with income growth.
Summary
Measuring the strength of an association between two random variables is a fundamental topic of broad scientific interest. The intuition behind potential correlation is made precise: we formally define a natural notion of potential correlation (Axiom 6) and show that the rate of information bottleneck, s(X; Y), captures this potential correlation (Theorem 1), while other standard measures of correlation fail (Theorem 2). This ratio has only recently been identified as the hypercontractivity coefficient [11]. We prove that existing standard measures of correlation fail to satisfy the proposed axioms and fail to capture canonical examples of potential correlations that s(X; Y) captures (Section 2.3). Another natural candidate is mutual information, but it is not clear how to interpret its value because it is unnormalized, unlike all other measures of correlation, which lie between zero and one. We show empirically that the estimator of the hypercontractivity coefficient recovers this order accurately from a vastly smaller number of samples than other state-of-the-art causal influence estimators.
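To make s(X; Y) concrete, the following minimal sketch evaluates it numerically for a toy discrete case. It relies on a known equivalent characterization of the hypercontractivity coefficient: the supremum, over input distributions r_x different from p_x, of D(r_y || p_y) / D(r_x || p_x), where r_y is the output distribution obtained by passing r_x through the same channel. This is not the paper's sample-based estimator. It assumes a binary X with a fully known distribution, and the function names `kl` and `hypercontractivity_binary_x` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def kl(p, q):
    """KL divergence D(p || q) in nats; assumes q > 0 wherever p > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def hypercontractivity_binary_x(p_x, channel, grid_size=2001):
    """Grid-search lower bound on s(X;Y) for binary X (illustrative only).

    p_x     : length-2 array, the true distribution of X (full support).
    channel : 2 x m array with channel[i, j] = P(Y = j | X = i).
    """
    p_x = np.asarray(p_x, dtype=float)
    channel = np.asarray(channel, dtype=float)
    p_y = p_x @ channel  # true marginal of Y

    best = 0.0
    for a in np.linspace(0.0, 1.0, grid_size):
        r_x = np.array([a, 1.0 - a])  # candidate alternative input law
        d_x = kl(r_x, p_x)
        if d_x < 1e-12:  # skip r_x (numerically) equal to p_x
            continue
        r_y = r_x @ channel  # output law induced by r_x
        best = max(best, kl(r_y, p_y) / d_x)
    return best


if __name__ == "__main__":
    # Binary symmetric channel with crossover 0.1 and uniform input.
    bsc = [[0.9, 0.1], [0.1, 0.9]]
    print(hypercontractivity_binary_x([0.5, 0.5], bsc))
```

For a uniform binary input passed through a binary symmetric channel with crossover probability δ, the coefficient is known to equal (1 - 2δ)^2, so the grid search above should return a value very close to 0.64 for δ = 0.1. Because every candidate r_x yields a valid lower bound, the grid search can only underestimate the supremum, never exceed it (up to floating-point error).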