Abstract
This is an up-to-date introduction to and overview of the Minimum Description Length (MDL) Principle, a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition. While MDL was originally based on data compression ideas, this introduction can be read without any knowledge thereof. It takes into account all major developments since 2007, the last time an extensive overview was written. These include new methods for model selection and averaging and hypothesis testing, as well as the first completely general definition of MDL estimators. Incorporating these developments, MDL can be seen as a powerful extension of both penalized likelihood and Bayesian approaches, in which penalization functions and prior distributions are replaced by more general luckiness functions, average-case methodology is replaced by a more robust worst-case approach, and in which methods classically viewed as highly distinct, such as AIC versus BIC and cross-validation versus Bayes can, to a large extent, be viewed from a unified perspective.
Highlights
The Minimum Description Length (MDL) Principle [Rissanen, 1978, 1989, Barron et al., 1998, Grünwald, 2007] is a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition.
Here we present, for the first time, the MDL Principle without resorting to information theory: all the material can be understood without any knowledge of data compression, which should make it a much easier read for statisticians and machine learning researchers new to MDL.
Over the last 10 years, there have been exciting developments, some of them very recent, which mostly resolve long-standing issues with MDL. Incorporating these developments, MDL can be seen as a powerful extension of both penalized likelihood and Bayesian approaches, in which penalization functions and prior distributions are replaced by more general luckiness functions, average-case methodology is replaced by a more robust worst-case approach, and in which methods classically viewed as highly distinct, such as AIC versus BIC and cross-validation versus Bayes, can, to some extent, be viewed from a unified perspective; as such, this paper should be of interest to researchers working on the foundations of statistics and machine learning.
Summary
The Minimum Description Length (MDL) Principle [Rissanen, 1978, 1989, Barron et al., 1998, Grünwald, 2007] is a theory of inductive inference that can be applied to general problems in statistics, machine learning and pattern recognition. Over the last 10 years, there have been exciting developments, some of them very recent, which mostly resolve long-standing issues with MDL. Incorporating these developments, MDL can be seen as a powerful extension of both penalized likelihood and Bayesian approaches, in which penalization functions and prior distributions are replaced by more general luckiness functions, average-case methodology is replaced by a more robust worst-case approach, and in which methods classically viewed as highly distinct, such as AIC versus BIC and cross-validation versus Bayes, can, to some extent, be viewed from a unified perspective; as such, this paper should be of interest to researchers working on the foundations of statistics and machine learning. We introduce the basic building blocks of MDL in a concise yet self-contained way, including substantial underlying motivation, in Section 2, incorporating the extensions to and new insights into these building blocks that have been gathered over the last 10 years. These include more general formulations of arguably the most fundamental universal code, the Normalized Maximum Likelihood (NML) distribution, as well as faster ways to calculate it. We use θ̂ to denote more general estimators, and θ̂_v to denote what we call the MDL estimator with luckiness function v, see (5).
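To make the NML distribution mentioned above concrete, the following is a minimal sketch for the simplest case, the Bernoulli model: the NML probability of a binary sequence is its maximum-likelihood probability divided by the sum of maximum-likelihood probabilities over all sequences of the same length, and the corresponding code length is the negative base-2 logarithm of that ratio. The function names here are illustrative, not taken from the paper.

```python
from math import comb, log2

def ml_prob(k, n):
    # Maximum-likelihood probability of a binary sequence of length n
    # containing k ones: p̂ = k/n plugged back into the Bernoulli likelihood.
    if k == 0 or k == n:
        return 1.0
    p = k / n
    return p**k * (1 - p)**(n - k)

def nml_normalizer(n):
    # Sum of ML probabilities over all 2^n binary sequences,
    # grouped by their number of ones (the model's "complexity" term).
    return sum(comb(n, k) * ml_prob(k, n) for k in range(n + 1))

def nml_codelength(seq):
    # NML code length in bits: -log2 of the normalized maximum likelihood.
    n = len(seq)
    k = sum(seq)
    return -log2(ml_prob(k, n) / nml_normalizer(n))
```

For example, `nml_codelength([1, 1, 0, 0])` exceeds the plain ML code length `-log2(0.0625) = 4` bits by `log2` of the normalizer, reflecting the extra cost of the model's richness. This brute-force normalizer is exponential in spirit but computable in O(n) terms here; the faster calculation methods the summary refers to matter for larger model classes.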