Abstract

In the Minimum Description Length (MDL) principle, learning from data is equivalent to an optimal coding problem. We show that the codes that achieve optimal compression in MDL are critical in a very precise sense. First, when they are taken as generative models of samples, they generate samples with broad empirical distributions and with a high value of the relevance, defined as the entropy of the empirical frequencies. These results are derived for different statistical models (the Dirichlet model, independent and pairwise dependent spin models, and restricted Boltzmann machines). Second, MDL codes sit precisely at a second order phase transition point where the symmetry between the sampled outcomes is spontaneously broken. The order parameter controlling the phase transition is the coding cost of the samples. The phase transition is a manifestation of the optimality of MDL codes, and it arises because codes that achieve a higher compression do not exist. These results suggest a clear interpretation of the widespread occurrence of statistical criticality as a characterization of samples which are maximally informative on the underlying generative process.
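The relevance mentioned above is the entropy of the empirical frequencies: if m_k denotes the number of outcomes observed exactly k times in a sample of size N, the relevance is H = −Σ_k (k·m_k/N) log(k·m_k/N). A minimal Python sketch of this formula (not code from the paper; natural logarithm assumed):

```python
from collections import Counter
from math import log

def relevance(sample):
    """Entropy of the empirical frequency distribution: a sketch of the
    'relevance' defined in the abstract, using the natural logarithm."""
    N = len(sample)
    # m_k: number of distinct outcomes observed exactly k times
    mk = Counter(Counter(sample).values())
    return -sum((k * m / N) * log(k * m / N) for k, m in mk.items())

# "a" seen 3 times, "b" twice, "c" once, so m_1 = m_2 = m_3 = 1 and N = 6
print(round(relevance(["a", "a", "a", "b", "b", "c"]), 4))  # -> 1.0114
```

The relevance depends on the sample only through the frequency-of-frequencies m_k, which is why it quantifies how broad the empirical distribution is rather than which particular outcomes occurred.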

Highlights

  • Empirical data exhibiting broad frequency distributions is frequently encountered in the most disparate domains

  • We find that Normalized Maximum Likelihood (NML) codes are critical in a very precise sense, because they sit at a second order phase transition that separates typical from atypical behavior

  • The aim of this paper is to elucidate the properties of efficient representations of data corresponding to the universal codes that arise in the Minimum Description Length (MDL) principle



Introduction

Empirical data exhibiting broad frequency distributions is frequently encountered in the most disparate domains. When outcomes are ranked by their empirical frequency, a straight line in the (log-log) rank plot corresponds to a power law frequency distribution, in which the number of outcomes that are observed k times behaves as mk ∼ k−μ−1. It has recently been suggested that broad distributions arise from efficient representations, i.e., when the data samples relevant variables, which are those carrying the maximal amount of information on the generative process [7,8,9].
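The relation between a power-law rank plot and the frequency-of-frequencies m_k can be checked numerically. The sketch below (an illustration, not code from the paper) draws a sample whose outcome probabilities follow a Zipf-like law p_i ∝ i^(−1/μ) and tabulates m_k, which for μ = 1 should fall off roughly as k^(−2), consistent with m_k ∼ k^(−μ−1); the choice of S, N, and μ here is arbitrary:

```python
import random
from collections import Counter

random.seed(0)

# Draw N samples over S outcomes with Zipf-like weights p_i ~ i^(-1/mu);
# mu = 1 corresponds to Zipf's law.
S, N, mu = 1000, 10000, 1.0
weights = [i ** (-1.0 / mu) for i in range(1, S + 1)]
sample = random.choices(range(S), weights=weights, k=N)

# m_k: number of distinct outcomes observed exactly k times
mk = Counter(Counter(sample).values())
for k in sorted(mk)[:5]:
    print(k, mk[k])  # expected to decay roughly like k**(-mu - 1)
```

Since Σ_k k·m_k = N by construction, the m_k tabulated this way always account for the full sample.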

