Abstract

The principle of extreme physical information (EPI) can be used to derive many known laws and distributions in theoretical physics by extremizing the physical information loss K, i.e., the difference between the observed Fisher information I and the intrinsic information bound J of the physical phenomenon being measured. However, for complex cognitive systems of high dimensionality (e.g., human language processing and image recognition), the information bound J can greatly exceed I (J ≫ I) due to insufficient observations, which leads to serious over-fitting problems in the derivation of cognitive models. Moreover, there is no established exact invariance principle that gives rise to the bound information in general cognitive systems, which limits the direct application of EPI. To narrow the gap between I and J, in this paper we propose a confident-information-first (CIF) principle that lowers the information bound J by preserving confident parameters and ruling out unreliable or noisy parameters in the probability density function being measured. The confidence of each parameter is assessed by its contribution to the expected Fisher information distance between the physical phenomenon and its observations. In addition, given a specific parametric representation, this contribution can often be assessed directly by the Fisher information, which is connected to the inverse variance of any unbiased estimate of the parameter via the Cramér–Rao bound. We then consider dimensionality reduction in the parameter spaces of binary multivariate distributions and show that the single-layer Boltzmann machine without hidden units (SBM) can be derived using the CIF principle. An illustrative experiment shows how the CIF principle improves density estimation performance.
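The Cramér–Rao connection invoked above can be checked numerically. The following is a minimal sketch, not the paper's experiment: the Bernoulli model, the parameter value θ = 0.3, and the sample sizes are arbitrary illustrative choices. It compares the Cramér–Rao bound 1/(n·I(θ)) with the empirical variance of the maximum-likelihood estimate, which for this family attains the bound exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Bernoulli model p(x; theta) = theta^x (1 - theta)^(1 - x).
# Its per-sample Fisher information is I(theta) = 1 / (theta (1 - theta)).
theta, n_samples = 0.3, 200
fisher_info = 1.0 / (theta * (1.0 - theta))

# Cramér–Rao bound for any unbiased estimator from n i.i.d. samples:
# Var(theta_hat) >= 1 / (n * I(theta)).
crb = 1.0 / (n_samples * fisher_info)

# Empirical variance of the MLE (the sample mean) over repeated experiments.
estimates = rng.binomial(n_samples, theta, size=100_000) / n_samples
print(f"Cramér–Rao bound:          {crb:.6f}")   # 0.001050
print(f"empirical variance of MLE: {estimates.var():.6f}")
```

High Fisher information thus means low estimator variance, which is what makes it a natural confidence score for a parameter.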

Highlights

  • Information has been found to play an increasingly important role in physics

  • Based on information geometry (IG) [7], we introduce several choices of parameterization for binary multivariate distributions with a given number of variables n, i.e., for the open simplex of all probability distributions over the binary vector x ∈ {0, 1}^n

  • The confidence of parameters should be assessed according to their contributions to the expected information distance between the ideal distribution and its fluctuated observations (see the sketch after this list)
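To make the last two points concrete, here is a minimal sketch assuming the standard log-linear (exponential-family) parameterization of distributions over {0, 1}^n; the feature construction, the uniform-distribution evaluation point, and all names below are illustrative assumptions rather than the paper's construction. It computes the Fisher information matrix as the covariance of the sufficient statistics and ranks each natural parameter by its diagonal entry, i.e., its CIF-style confidence:

```python
import itertools
import numpy as np

n = 3  # number of binary variables; small enough to enumerate {0, 1}^n

# All 2^n states, and all non-empty index subsets of {0, ..., n-1}; each
# subset S indexes one natural parameter theta_S of the log-linear model
# p(x; theta) ∝ exp(sum_S theta_S * prod_{i in S} x_i).
states = np.array(list(itertools.product([0, 1], repeat=n)))
subsets = [s for r in range(1, n + 1) for s in itertools.combinations(range(n), r)]

# Feature matrix: F[k, j] = prod_{i in subsets[j]} states[k, i].
F = np.array([[x[list(s)].prod() for s in subsets] for x in states])

def fisher_information(theta):
    """Fisher information matrix = covariance of the sufficient statistics."""
    logits = F @ theta
    p = np.exp(logits - logits.max())
    p /= p.sum()
    mean = F.T @ p        # expected features E[F]
    centered = F - mean   # F - E[F] for every state
    return centered.T @ (centered * p[:, None])

G = fisher_information(np.zeros(len(subsets)))  # evaluated at the uniform distribution

# CIF-style ranking: the diagonal entry G[j, j] measures the confidence of
# theta_j; low-confidence coordinates are the candidates for removal.
for j in np.argsort(-np.diag(G)):
    print(subsets[j], f"Fisher information = {G[j, j]:.4f}")
```

At the uniform distribution this ranking places the single-variable coordinates above the pairwise ones, and those above the third-order interaction, matching the intuition that CIF preserves low-order, high-confidence structure; the SBM, whose energy contains only first- and second-order terms, is the kind of model such a reduction yields.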


Introduction

As stated by Wheeler [1]: "All things physical are information-theoretic in origin and this is a participatory universe... Observer participancy gives rise to information; and information gives rise to physics." Following this viewpoint, Frieden [2] unifies the derivation of physical laws in major fields of physics, from the Dirac equation to the Maxwell–Boltzmann velocity dispersion law, using the extreme physical information (EPI) principle, which extremizes the information loss K = I − J between two information quantities. The first quantity, I, measures the amount of information implied by the data as a finite scalar under a suitable measure [2]; it is formally defined as the trace of the Fisher information matrix [3].
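In symbols, using the standard definition of the Fisher information matrix (the notation here is ours, not quoted from the paper):

```latex
I(\boldsymbol{\theta})
  = \operatorname{tr}\,\mathbf{F}(\boldsymbol{\theta})
  = \sum_{i} \mathbb{E}\!\left[
      \left( \frac{\partial \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \theta_i} \right)^{2}
    \right],
  \qquad
  \mathbf{F}_{ij}(\boldsymbol{\theta})
  = \mathbb{E}\!\left[
      \frac{\partial \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \theta_i}\,
      \frac{\partial \ln p(\mathbf{x};\boldsymbol{\theta})}{\partial \theta_j}
    \right].
```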
