Discovering Potential Correlations via Hypercontractivity

Hyeji Kim, Weihao Gao, Sewoong Oh, Sreeram Kannan, Pramod Viswanath

doi:10.3390/e19110586

Open Access

https://doi.org/10.3390/e19110586

Abstract

Discovering a correlation from one variable to another variable is of fundamental scientific and practical interest. While existing correlation measures are suitable for discovering average correlation, they fail to discover hidden or potential correlations. To bridge this gap, (i) we postulate a set of natural axioms that we expect a measure of potential correlation to satisfy; (ii) we show that the rate of information bottleneck, i.e., the hypercontractivity coefficient, satisfies all the proposed axioms; (iii) we provide a novel estimator to estimate the hypercontractivity coefficient from samples; and (iv) we provide numerical experiments demonstrating that this proposed estimator discovers potential correlations among various indicators of WHO datasets, is robust in discovering gene interactions from gene expression time series data, and is statistically more powerful than the estimators for other correlation measures in binary hypothesis testing of canonical examples of potential correlations.

Highlights

Measuring the strength of an association between two random variables is a fundamental topic of broad scientific interest.
We provide a novel interpretation of the hypercontractivity coefficient as a measure of potential correlation by demonstrating that it satisfies a natural set of axioms that such a measure is expected to obey.
We show applications of our estimator of the hypercontractivity coefficient in two important datasets. In Section 4.2, we demonstrate that it discovers hidden potential correlations among various national indicators in World Health Organization (WHO) datasets, including how aid is potentially correlated with income growth.

Summary

Introduction

Measuring the strength of an association between two random variables is a fundamental topic of broad scientific interest. This intuition is made precise: we formally define a natural notion of potential correlation (Axiom 6) and show that the rate of information bottleneck s(X; Y) captures this potential correlation (Theorem 1), while other standard measures of correlation fail to do so (Theorem 2). This ratio has only recently been identified as the hypercontractivity coefficient [11]. We prove that existing standard measures of correlation fail to satisfy the proposed axioms and fail to capture canonical examples of potential correlations that s(X; Y) captures (Section 2.3). Another natural candidate is mutual information, but it is not clear how to interpret its value because it is unnormalized, unlike all other measures of correlation, which lie between zero and one. We show empirically that the estimator of the hypercontractivity coefficient recovers this order accurately from a vastly smaller number of samples than other state-of-the-art causal influence estimators.
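To make the quantity s(X; Y) concrete, the sketch below computes a lower bound on it for a small discrete joint distribution. It relies on a standard characterization from the hypercontractivity literature that this summary does not quote: s(X; Y) is the supremum, over input distributions r_x different from p_x, of D(r_y || p_y) / D(r_x || p_x), where r_y is the output distribution obtained by passing r_x through the channel p(y|x). The function name `hc_lower_bound`, the random search over r_x, and the numerical threshold are illustrative assumptions. This is not the sample-based estimator proposed in the paper, and a random search can only approach the supremum from below.

```python
import numpy as np


def kl(r, p):
    """KL divergence D(r || p) in nats for discrete distributions."""
    mask = r > 0
    return float(np.sum(r[mask] * np.log(r[mask] / p[mask])))


def hc_lower_bound(p_xy, n_trials=5000, seed=0):
    """Lower-bound s(X; Y) for a joint pmf matrix p_xy (rows: x, columns: y).

    Assumes every x has positive probability. Hypothetical helper for
    illustration only; it is not the estimator described in the paper.
    """
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)
    channel = p_xy / p_x[:, None]          # conditional pmf p(y | x)
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_trials):
        r_x = rng.dirichlet(np.ones(len(p_x)))   # candidate input distribution
        d_x = kl(r_x, p_x)
        if d_x < 1e-8:                     # skip r_x numerically equal to p_x
            continue
        r_y = r_x @ channel                # output distribution induced by r_x
        best = max(best, kl(r_y, p_y) / d_x)
    return best


if __name__ == "__main__":
    identity = np.diag([0.3, 0.7])                       # Y = X
    dsbs = np.array([[0.45, 0.05], [0.05, 0.45]])        # uniform X, 10% flips
    independent = np.outer([0.3, 0.7], [0.4, 0.6])       # X and Y independent
    for name, p in [("identity", identity), ("dsbs", dsbs),
                    ("independent", independent)]:
        print(name, hc_lower_bound(p))
```

For Y = X the ratio equals 1 for every r_x, and for independent X and Y it is 0. For a uniform binary input passed through a symmetric channel with flip probability 0.1, the known value of the supremum is (1 - 2 * 0.1)^2 = 0.64, so the search result stays at or below that value and approaches it as sampled r_x get close to p_x.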

Axiomatic Approach to Measure Potential Correlations

Axioms for Potential Correlation

The Hypercontractivity Coefficient Satisfies All Axioms

Standard Correlation Coefficients Violate the Axioms

Mutual Information Violates the Axioms

Hypercontractivity Ribbon

Multidimensional X and Y

Estimator of the Hypercontractivity Coefficient from Samples

Experimental Results

Synthetic Data

Real Data

How Hypercontractivity Changes as We Remove Outliers

Hypercontractivity Detecting an Outlier

Gene Pathway Recovery From Single Cell Data

Proof of Proposition 1

Proof of Theorem 1

Proof of Theorem 2

Proof of Proposition 2

Noisy Discrete Rare Correlation in Example 3

Proof of Proposition 4

Proof of Theorem 3

Proof of Lemma 2

Journal: Entropy	Publication Date: Nov 2, 2017
Citations: 3	License type: CC BY 4.0
