Information gain and a general measure of correlation

John T Kent

doi:10.1093/biomet/70.1.163

Abstract

SUMMARY Given a parametric model of dependence between two random quantities, X and Y, the notion of information gain can be used to define a measure of correlation. This definition of correlation generalizes both the usual product-moment correlation coefficient for the bivariate normal model and the multiple correlation coefficient in the standard linear regression model. The use of this information-based correlation in a descriptive statistical analysis is examined and several examples are given.

If the dependence between two random quantities, X and Y, is modelled parametrically, then the concept of information gain can be used to define a measure of correlation. This correlation coefficient can appear in two possible contexts, depending on whether one models the joint distribution of X and Y, or just the conditional distribution of Y given X.

An important motivating feature for this information-based correlation coefficient is the fact that it generalizes both the usual product-moment correlation coefficient for the bivariate normal model and the usual multiple correlation coefficient for the standard multiple regression model with normal errors. Our intuition is well developed for these usual correlation coefficients, and hopefully our intuition will still be applicable for the information-based correlation in more general modelling situations of parametric dependence. Further, since our correlation coefficient is based on information gain, we might hope to extend our intuition to interpret the information gain in any statistical modelling situation where we want to assess how much better a more complicated model is than a simpler model.

The concept of information gain for general statistical models is described in § 2. This concept is then used to define an information-based measure of correlation; the joint case is covered in § 3 and the conditional case in § 4. Estimation of the correlation coefficient is carried out by estimating the corresponding information gain; see §§ 5-7. The use of information gain for the purpose of model choice in a descriptive statistical analysis is discussed in § 8 and a comparison between our approach and Akaike's information criterion is given in § 9. Some examples of the use of this information-based correlation are given in §§ 10 and 11.
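
The claim that the information-based correlation reduces to the usual multiple correlation coefficient under normal errors can be checked numerically. The sketch below is illustrative only and makes an assumption that this abstract does not state: that the squared correlation is obtained from the information gain Γ as 1 − exp(−Γ), with Γ estimated as 2/n times the difference in maximized log-likelihood between the regression model and the intercept-only model. The function names (`gaussian_max_loglik`, `info_gain_r2`, `pearson_r2`) and the simulated data are hypothetical and chosen for this example.

```python
import math
import random


def gaussian_max_loglik(residuals):
    """Maximized normal log-likelihood, using the ML variance estimate RSS/n."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def info_gain_r2(x, y):
    """Squared correlation from estimated information gain (assumed form 1 - exp(-gain))."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Simpler model: Y does not depend on X (intercept only).
    res_null = [yi - mean_y for yi in y]
    # More complicated model: simple linear regression of Y on X.
    res_full = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    gain = (2.0 / n) * (gaussian_max_loglik(res_full) - gaussian_max_loglik(res_null))
    return 1.0 - math.exp(-gain)


def pearson_r2(x, y):
    """Squared product-moment correlation, computed directly for comparison."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy * sxy / (sxx * syy)


# Simulated data (hypothetical): Y = 0.7 X + standard normal noise.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(200)]
ys = [0.7 * xi + random.gauss(0.0, 1.0) for xi in xs]

print(info_gain_r2(xs, ys), pearson_r2(xs, ys))
```

Under these assumptions the two printed values agree up to floating-point error, because the estimated gain equals log(RSS_null / RSS_full), so 1 − exp(−gain) is 1 − RSS_full / RSS_null.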


Journal: Biometrika
Publication Date: Jan 1, 1983
Citations: 370

Similar Papers

Editor's evaluation: Robust and Efficient Assessment of Potency (REAP) as a quantitative tool for dose-response curve estimation
Philip Boonstra
09 May 2022

On the use of a modified AIC in model selection (모형 선택에서의 수정된 AIC 사용에 대하여)
Eunjung Song ... Sungho Won
Korean Journal of Applied Statistics | VOL. 30
28 Feb 2017

Model Selection for Linear Mixed Models Using Predictive Criteria
Jun Wang ... G Bruce Schaalje
Communications in Statistics - Simulation and Computation | VOL. 38
24 Feb 2009

Bayesian Hierarchical Models for Cost-Effectiveness Analyses that Use Data from Cluster Randomized Trials
Richard Grieve ... Simon G Thompson
Medical Decision Making | VOL. 30
12 Aug 2009
