Abstract
We discuss a special class of generalized divergence measures defined through generator functions. Any divergence measure in the class decomposes into the difference between a cross entropy and a diagonal entropy. The diagonal entropy is associated with a model of maximum entropy distributions, while the divergence measure leads to statistical estimation by minimization for an arbitrarily given statistical model. The dualistic relationship between the maximum entropy model and minimum divergence estimation is explored in the framework of information geometry. The model of maximum entropy distributions is characterized as totally geodesic with respect to the linear connection associated with the divergence. We give a natural extension of the classical theory of the maximum likelihood method under the maximum entropy model in terms of the Boltzmann-Gibbs-Shannon entropy. We discuss the duality in detail for the Tsallis entropy as a typical example.
Highlights
Information divergence plays a central role in an integrated understanding of statistics, information science, statistical physics and machine learning
The U-loss function is given by an empirical approximation for U-divergence based on a given dataset under a statistical model, in which the U-estimator is defined by minimization of the U-loss function on the parameter space
We focus on the geometry generated by the
Summary
Information divergence plays a central role in an integrated understanding of statistics, information science, statistical physics and machine learning. We discuss generalized entropy and divergence measures with applications in statistical models and estimation. We observe a dualistic property associated with U-divergence between the statistical model and estimation. The U-loss function is given by an empirical approximation for U-divergence based on a given dataset under a statistical model, in which the U-estimator is defined by minimization of the U-loss function on the parameter space. D0(f, g) generates a pair consisting of an exponential family M(e) and the minus log-likelihood function. This aspect is characterized as a minimax game between a decision maker and Nature.
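As a rough illustration of the U-loss and U-estimator described above, the following Python sketch minimizes an empirical loss over a one-parameter model. The text here does not state the formula, so the sketch assumes the form commonly used in the U-divergence literature: for a convex generator U with derivative u and xi the inverse of u, the loss is minus the sample mean of xi(f_theta(x_i)) plus the integral of U(xi(f_theta(x))). The model (a unit-variance normal location family), the power-type generator with beta = 0.5, the grid search, and all function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def normal_pdf(x, mu):
    """Density of N(mu, 1); an illustrative location model f_theta."""
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

def u_loss(data, mu, U, xi, grid):
    """Assumed U-loss: -mean(xi(f(x_i))) + integral of U(xi(f(x))) dx."""
    empirical = -np.mean(xi(normal_pdf(data, mu)))
    dx = grid[1] - grid[0]
    # Riemann sum on a uniform grid approximates the integral term.
    integral = np.sum(U(xi(normal_pdf(grid, mu)))) * dx
    return empirical + integral

# Power-type generator (assumed example); U(xi(s)) = s**(1 + beta) / (1 + beta).
beta = 0.5

def U_pow(t):
    return (1.0 + beta * t) ** ((1.0 + beta) / beta) / (1.0 + beta)

def xi_pow(s):
    return (s ** beta - 1.0) / beta

rng = np.random.default_rng(0)
data = rng.normal(loc=1.5, scale=1.0, size=200)   # synthetic sample
grid = np.linspace(-10.0, 12.0, 2201)             # integration grid
candidates = np.linspace(0.0, 3.0, 301)           # parameter grid for mu

def u_estimate(U, xi):
    """Grid-search minimizer of the U-loss over the candidate means."""
    losses = [u_loss(data, mu, U, xi, grid) for mu in candidates]
    return candidates[int(np.argmin(losses))]

# With U = exp and xi = log the integral term is 1, so the loss is the
# minus log-likelihood plus a constant and the minimizer is the MLE.
mu_hat_exp = u_estimate(np.exp, np.log)
mu_hat_pow = u_estimate(U_pow, xi_pow)

print("sample mean:", data.mean())
print("U = exp estimate:", mu_hat_exp)
print("power-type estimate:", mu_hat_pow)
```

With U = exp the estimate coincides, up to the grid resolution, with the sample mean, which is the maximum likelihood estimate for this model. The power-type generator gives a different loss on the same data. A grid search is used only to keep the sketch dependency-free; any numerical optimizer could replace it.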