Abstract

In principle, information theory could provide useful metrics for statistical inference. In practice this is impeded by divergent assumptions: Information theory assumes the joint distribution of variables of interest is known, whereas in statistical inference it is hidden and is the goal of inference. To integrate these approaches we note a common theme they share, namely the measurement of prediction power. We generalize this concept as an information metric, subject to several requirements: Calculation of the metric must be objective or model-free; unbiased; convergent; probabilistically bounded; and low in computational complexity. Unfortunately, widely used model selection metrics such as Maximum Likelihood, the Akaike Information Criterion and the Bayesian Information Criterion do not necessarily meet all these requirements. We define four distinct empirical information metrics measured via sampling, with explicit Law of Large Numbers convergence guarantees, which meet these requirements: Ie, the empirical information, a measure of average prediction power; Ib, the overfitting bias information, which measures selection bias in the modeling procedure; Ip, the potential information, which measures the total remaining information in the observations not yet discovered by the model; and Im, the model information, which measures the model’s extrapolation prediction power. Finally, we show that Ip + Ie, Ip + Im, and Ie − Im are fixed constants for a given observed dataset (i.e., prediction target), independent of the model, and thus represent a fundamental subdivision of the total information contained in the observations. We discuss the application of these metrics to modeling and experiment planning.
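One way to see how such model-independent totals can arise is sketched below. The particular estimator forms for Ie and Ip are an illustrative assumption on our part, not definitions quoted from this paper: suppose Ie compares the model Ψ to the uninformative distribution p(X), and Ip compares an empirical estimate p̂(X) of the observations to the model, both as averages over the same sample. The model term then cancels in their sum:

\[ I_e \approx \frac{1}{n}\sum_{i=1}^{n} \log \frac{p(X_i \mid \Psi)}{p(X_i)}, \qquad I_p \approx \frac{1}{n}\sum_{i=1}^{n} \log \frac{\hat{p}(X_i)}{p(X_i \mid \Psi)} \]

\[ I_p + I_e \approx \frac{1}{n}\sum_{i=1}^{n} \log \frac{\hat{p}(X_i)}{p(X_i)} \]

Under these assumed forms, the right-hand side depends only on the observed sample and the uninformative distribution, not on Ψ, consistent with the claim that Ip + Ie is fixed for a given dataset.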

Highlights

  • The Need for Information Metrics for Statistical and Scientific Inference: Information theory as formulated by Shannon [1], Kolmogorov and others provides an elegant and general measure of information that connects variables.

  • In this paper we define a set of statistical inference metrics that serve as proxies for the fundamental metrics of information theory. We show that they are directly useful for statistical inference, and highlight how they differ from standard model selection metrics such as Maximum Likelihood, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC).

  • We present a series of metrics that address distinct aspects of statistical inference: the empirical information Ie, the overfitting bias information Ib, the potential information Ip, and the model information Im.


Summary

Introduction

The Need for Information Metrics for Statistical and Scientific Inference

Information theory as formulated by Shannon [1], Kolmogorov and others provides an elegant and general measure of information (or coupling) that connects variables. Fisher defined the prediction power of a model Ψ for an observable variable X in terms of the total likelihood of a sample of independent and identically distributed (I.I.D.) draws X1, X2, ..., Xn:

\[ p(X_1, X_2, \ldots, X_n \mid \Psi) = \prod_{i=1}^{n} p(X_i \mid \Psi) \]

We use the unbiased estimator Le, the average log-likelihood per observation, to define the empirical information Ie, a signed measure of prediction power relative to the uninformative distribution p(X):

\[ L_e = \frac{1}{n} \sum_{i=1}^{n} \log p(X_i \mid \Psi), \qquad I_e = L_e - \frac{1}{n} \sum_{i=1}^{n} \log p(X_i) = \frac{1}{n} \sum_{i=1}^{n} \log \frac{p(X_i \mid \Psi)}{p(X_i)} \]
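A minimal sampling sketch of this estimator follows; the Gaussian data model, the broad baseline density standing in for p(X), and all function names are illustrative assumptions rather than constructs from the paper. The example averages the per-observation log-likelihood ratio and shows the estimate stabilizing as the sample grows, in line with the Law of Large Numbers convergence guarantees discussed above.

import numpy as np
from scipy.stats import norm

def empirical_information(samples, model_logpdf, baseline_logpdf):
    """Average log-likelihood ratio of model vs. uninformative baseline (nats).

    Sketch of an Ie-style estimator: a signed measure of how much better the
    model predicts the observed samples than the uninformative baseline p(X).
    """
    return np.mean(model_logpdf(samples) - baseline_logpdf(samples))

rng = np.random.default_rng(0)

# Hypothetical setup: observations truly drawn from N(1, 1); the model Psi
# guesses N(0.8, 1); the "uninformative" baseline is a broad N(0, 3) density.
true_dist = norm(loc=1.0, scale=1.0)
model = norm(loc=0.8, scale=1.0)
baseline = norm(loc=0.0, scale=3.0)

for n in (100, 10_000, 1_000_000):
    x = true_dist.rvs(size=n, random_state=rng)
    ie = empirical_information(x, model.logpdf, baseline.logpdf)
    print(f"n={n:>9,}  Ie estimate = {ie:.4f} nats")

# The printed estimates converge toward E[log p(X|Psi) - log p(X)], the model's
# average prediction power relative to the uninformative distribution.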

Results
Conclusion
