Abstract

The Minimum Mutual Information (MinMI) Principle provides the least committed, maximum-joint-entropy (ME) inferential law that is compatible with prescribed marginal distributions and empirical cross constraints. Here, we estimate the MI bounds (the MinMI values) generated by constraining sets Tcr comprising mcr linear and/or nonlinear joint expectations, computed from samples of N iid outcomes. Marginals (and their entropies) are imposed through single morphisms of the original random variables. N-asymptotic formulas are given for the distribution of the cross-expectation estimation errors and for the bias, variance and distribution of the MinMI estimator. A growing Tcr leads to an increasing MinMI, which eventually converges to the total MI. For N-sized samples, the MinMI increment between two nested sets Tcr1 ⊂ Tcr2 (with numbers of constraints mcr1 < mcr2) is the test difference δH = Hmax,1,N − Hmax,2,N ≥ 0 between the two respective estimated MEs. Asymptotically, δH follows a scaled Chi-Squared distribution, (1/(2N)) χ²(mcr2 − mcr1), whose upper quantiles determine whether the constraints in Tcr2 \ Tcr1 explain significant extra MI. As an example, we set the marginals to be normally distributed (Gaussian) and build a sequence of MI bounds associated with successive nonlinear correlations due to joint non-Gaussianity. Since available sample sizes can be rather low in real-world situations, the relationship between MinMI bias, probability-density over-fitting and outliers is made evident for under-sampled data.
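
To make the asymptotic test above concrete, here is a minimal sketch (an illustration, not code from the paper; the function name, the example numbers and the significance level alpha are assumptions) that compares the ME difference between two nested constraint sets against the upper quantile of the scaled Chi-Squared law (1/(2N)) χ²(mcr2 − mcr1).

```python
# Minimal sketch of the asymptotic MinMI significance test described above.
# Assumptions (not from the paper): the function name, the example inputs
# and the significance level alpha.
from scipy.stats import chi2

def minmi_increment_test(h_max_1, h_max_2, m1, m2, n, alpha=0.05):
    """Test whether the extra constraints in Tcr2 beyond Tcr1 carry significant MI.

    h_max_1, h_max_2 : estimated maximum entropies under Tcr1 and Tcr2 (Tcr1 a subset of Tcr2)
    m1, m2           : numbers of constraints (m1 < m2)
    n                : sample size of iid outcomes
    """
    dH = h_max_1 - h_max_2  # MinMI increment; non-negative in theory
    # Under the null hypothesis of no extra MI, dH ~ (1/(2n)) * chi2(m2 - m1) asymptotically.
    threshold = chi2.ppf(1.0 - alpha, df=m2 - m1) / (2.0 * n)
    return dH, threshold, dH > threshold

# Hypothetical numbers, for illustration only:
dH, thr, significant = minmi_increment_test(h_max_1=1.42, h_max_2=1.40, m1=2, m2=5, n=500)
print(f"dH = {dH:.4f}, threshold = {thr:.4f}, significant extra MI: {significant}")
```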

Highlights

  • This paper addresses the problem of estimating the MI conveyed by the least committed inferential law (i.e., the conditional probability density function pdf(Y | X) between random variables Y and X) that is compatible with prescribed marginal distributions and a set Tcr of mcr empirical, non-redundant cross constraints

  • This paper presents theoretical formulas for the statistics of estimation errors of information-theoretic measures

  • This is quite relevant because finite samples can spuriously exhibit artificial statistical structure, leading to negatively biased estimates of Entropy or positively biased estimates of Mutual Information (see the sketch after this list)
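
As a quick illustration of that finite-sample bias (a sketch under assumptions, not an experiment from the paper: the histogram plug-in estimator, the bin count and the sample sizes are all illustrative), the following Monte Carlo loop estimates the MI of two independent Gaussian variables, whose true MI is zero, and shows the positive bias shrinking as N grows.

```python
# Minimal sketch: positive bias of a naive plug-in MI estimator on independent data.
# The estimator, bin count and sample sizes are illustrative assumptions.
import numpy as np

def plugin_mi(x, y, bins=10):
    """Histogram plug-in estimate of MI (in nats) between two 1-D samples."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                          # joint relative frequencies
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

rng = np.random.default_rng(0)
for n in (50, 200, 1000, 10000):
    # X and Y are independent, so the true MI is 0; any positive mean is bias.
    estimates = [plugin_mi(rng.standard_normal(n), rng.standard_normal(n))
                 for _ in range(200)]
    print(f"N = {n:6d}: mean plug-in MI = {np.mean(estimates):.4f} nats (true value: 0)")
```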

Summary

The State of the Art

The seminal work of Shannon on Information Theory [1] gave rise to the concept of Mutual Information (MI). Here, Hmax,N denotes the ME estimate obtained from N-sized samples of iid outcomes; its errors are roughly similar to those of generic MI and entropy estimators (see [13,14] for a thorough review and performance comparison of MI estimators). Their mean (bias), variance and higher-order moments can be written in terms of powers of N⁻¹, covering intermediate and asymptotic ranges of N [15], with specific applications in neurophysiology [16,17,18]. A variety of estimators exist (e.g., kernel density estimators, adaptive or non-adaptive grids, nearest neighbors), as well as others specially designed for small samples [21,22].
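
For context, generic MI estimators of the kind surveyed in [13,14] are readily available off the shelf. The sketch below is an assumption for illustration, not one of the estimators used in the paper: it applies scikit-learn's nearest-neighbour based estimator (mutual_info_regression) to a nonlinearly dependent, jointly non-Gaussian pair.

```python
# Minimal sketch of an off-the-shelf nearest-neighbour MI estimator.
# This is scikit-learn's mutual_info_regression, used here purely for
# illustration; it is not the estimator developed in the paper.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(1)
n = 2000
x = rng.standard_normal(n)
y = x**2 + 0.5 * rng.standard_normal(n)   # nonlinear (non-Gaussian) dependence

mi = mutual_info_regression(x.reshape(-1, 1), y, n_neighbors=5, random_state=0)[0]
print(f"kNN-based MI estimate: {mi:.3f} nats")
```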

The Rationale of the Paper
Imposing Marginal PDFs
The Formalism
A Theorem about the MinMI Covariance Matrix
Gaussian and Non-Gaussian MI
Estimators of the Minimum MI from Data and Their Errors
Generic Properties
The Effects of Morphisms and Bivariate Sampling
Errors of the Estimators of Polynomial Moments under Gaussian Distributions
Statistical Modeling of Moment Estimation Errors
Significance Tests of MinMI Thresholds
Significance Tests of the Gaussian and Non-Gaussian MI
Error and Significance Tests of the Gaussian MI
Error and Significance Tests of the Non-Gaussian MI
Validation of Significance Tests by Monte-Carlo Experiments
MI Estimation from Under-Sampled Data
Findings
Discussion and Conclusions