Abstract

We consider three different approaches to define natural Riemannian metrics on polytopes of stochastic matrices. First, we define a natural class of stochastic maps between these polytopes and give a metric characterization of Chentsov type in terms of invariance with respect to these maps. Second, we consider the Fisher metric defined on arbitrary polytopes through their embeddings as exponential families in the probability simplex. We show that these metrics can also be characterized by an invariance principle with respect to morphisms of exponential families. Third, we consider the Fisher metric resulting from embedding the polytope of stochastic matrices in a simplex of joint distributions by specifying a marginal distribution. All three approaches result in slight variations of products of Fisher metrics. This is consistent with the nature of polytopes of stochastic matrices, which are Cartesian products of probability simplices. The first approach yields a scaled product of Fisher metrics; the second, a product of Fisher metrics; and the third, a product of Fisher metrics scaled by the marginal distribution.
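The third construction can be checked numerically. The sketch below is illustrative only: the function names and the sample numbers are assumptions, not taken from the paper. It uses the standard Fisher metric on the simplex, g_p(u, v) = sum_i u_i v_i / p_i, and the embedding K -> p(x) K(y|x) for a fixed marginal p. Because this embedding is linear in K, a tangent vector U maps to p(x) U(x, y), and the pulled-back metric should agree with the product of row-wise Fisher metrics weighted by p.

```python
def fisher(p, u, v):
    """Fisher metric on the open simplex at p, applied to tangent vectors u, v."""
    return sum(ui * vi / pi for pi, ui, vi in zip(p, u, v))


def weighted_product_metric(marginal, K, U, V):
    """Product of row-wise Fisher metrics, each row x weighted by marginal[x]."""
    return sum(
        px * fisher(Kx, Ux, Vx)
        for px, Kx, Ux, Vx in zip(marginal, K, U, V)
    )


def pullback_metric(marginal, K, U, V):
    """Fisher metric of the joint p(x) K(y|x), pulled back to stochastic matrices."""
    # Flatten the joint distribution and the pushed-forward tangent vectors.
    joint = [px * k for px, Kx in zip(marginal, K) for k in Kx]
    dU = [px * u for px, Ux in zip(marginal, U) for u in Ux]
    dV = [px * v for px, Vx in zip(marginal, V) for v in Vx]
    return fisher(joint, dU, dV)


# Hypothetical example data: a 2 x 3 stochastic matrix with a fixed marginal.
marginal = [0.25, 0.75]
K = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]]
# Tangent vectors: each row sums to zero so that rows of K stay normalized.
U = [[0.1, -0.1, 0.0], [0.2, -0.3, 0.1]]
V = [[-0.2, 0.1, 0.1], [0.0, 0.05, -0.05]]

a = weighted_product_metric(marginal, K, U, V)
b = pullback_metric(marginal, K, U, V)
print(a, b)  # the two values should agree up to floating-point rounding
```

Setting every entry of `marginal` to 1 in `weighted_product_metric` gives the plain product of Fisher metrics. The scaling factor that arises in the first approach is not stated in the abstract and is not modeled here.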

Highlights

  • The Riemannian structure of a function’s domain has a crucial impact on the performance of gradient optimization methods, especially in the presence of plateaus and local maxima

  • It has been observed that following the natural gradient with respect to the Fisher information metric, instead of the Euclidean metric, can significantly alleviate the plateau problem [1,2]

  • An important argument was given by Chentsov [4], who showed that the Fisher information metric is the only metric on probability spaces for which certain natural statistical embeddings, called Markov morphisms, are isometries


Summary

Introduction

The Riemannian structure of a function’s domain has a crucial impact on the performance of gradient optimization methods, especially in the presence of plateaus and local maxima. An embedding of the polytope in the probability simplex can be used to pull back geometric structures from the simplex to the polytope, including Riemannian metrics, affine connections, divergences, etc. This approach has been considered in [9] as a way to define low-dimensional families of conditional probability distributions. In a simple example, the weighted product metric gives the best asymptotic rate of convergence, provided that the weights are optimally chosen. Appendix B contains the proofs of the results from Section 4.

Preliminaries
The Results of Campbell and Lebanon
Invariance Metric Characterizations for Conditional Polytopes
Stochastic Embeddings of Conditional Polytopes
Invariance Characterization
The Fisher Metric on Polytopes and Point Configurations
Exponential Families and Polytopes
Invariance Fisher Metric Characterizations for Polytopes
Independence Models and Conditional Polytopes
Weighted Product Metrics for Conditional Models
Replicator Equations
Extension of the Replicator Equations to Stochastic Matrices
The Example of Mean Fitness
Conclusions
Conditions for Positive Definiteness
Proofs of the Invariance Characterization
