From a smooth, strictly convex function $\Phi: \mathbb{R}^n \to \mathbb{R}$, a parametric family of divergence functions $D_\Phi^{(\alpha)}$ may be introduced: [equation: see text] for $x, y \in \operatorname{int}\operatorname{dom}(\Phi) \subseteq \mathbb{R}^n$ and for $\alpha \in \mathbb{R}$, with $D_\Phi^{(\pm 1)}$ defined by taking the limit $\alpha \to \pm 1$. Each member is shown to induce an $\alpha$-independent Riemannian metric, as well as a pair of dual $\alpha$-connections, which are generally nonflat except for $\alpha = \pm 1$. In the latter case, $D_\Phi^{(\pm 1)}$ reduces to the (nonparametric) Bregman divergence, which is representable using $\Phi$ and its convex conjugate $\Phi^*$ and becomes the canonical divergence for dually flat spaces (Amari, 1982, 1985; Amari & Nagaoka, 2000). This formulation based on convex analysis naturally extends the information-geometric interpretation of divergence functions (Eguchi, 1983) to allow the distinction between two different kinds of duality: referential duality ($\alpha \leftrightarrow -\alpha$) and representational duality ($\Phi \leftrightarrow \Phi^*$). When the formulation is applied to (not necessarily normalized) probability densities, the concept of conjugated representations of densities is introduced, so that the $\pm\alpha$-connections defined on probability densities embody both referential and representational duality and are hence themselves bidual. When restricted to a finite-dimensional affine submanifold, the natural parameters of a certain representation of densities and the expectation parameters under its conjugate representation form biorthogonal coordinates. The alpha representation (now indexed by $\beta$, with $\beta \in [-1, 1]$) is shown to be the only measure-invariant representation. The resulting two-parameter family of divergence functionals $D^{(\alpha,\beta)}$, $(\alpha, \beta) \in [-1, 1] \times [-1, 1]$, induces identical Fisher information but bidual $\alpha$-connection pairs; it reduces in form to Amari's alpha-divergence family when $\alpha = \pm 1$ or when $\beta = 1$, but to the family of Jensen differences (Rao, 1987) when $\beta = -1$.
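
The two claims above that lend themselves to a quick numerical check are referential duality (swapping alpha with -alpha while swapping the two arguments) and the reduction to the Bregman divergence as alpha approaches +/-1. The abstract elides the defining equation, so the sketch below assumes the form in which this family is commonly quoted, D(x, y) = 4/(1 - alpha^2) * [ (1 - alpha)/2 * Phi(x) + (1 + alpha)/2 * Phi(y) - Phi((1 - alpha)/2 * x + (1 + alpha)/2 * y) ]. That form, the choice of Phi (negative entropy on the positive orthant), and every function name here are illustrative assumptions, not taken from the text.

```python
import numpy as np


def phi(x):
    """Example convex function (an assumption): Phi(x) = sum(x log x - x), x > 0."""
    return np.sum(x * np.log(x) - x)


def grad_phi(x):
    """Gradient of the example Phi above: log x, elementwise."""
    return np.log(x)


def alpha_divergence(phi_fn, x, y, alpha):
    """Assumed form of the alpha-indexed divergence, valid for |alpha| < 1.

    It is a rescaled Jensen gap of phi_fn with weights (1 - alpha)/2 on x
    and (1 + alpha)/2 on y.
    """
    if abs(alpha) >= 1:
        raise ValueError("alpha = +/-1 is defined only as a limit; use bregman()")
    w_x = (1 - alpha) / 2
    w_y = (1 + alpha) / 2
    gap = w_x * phi_fn(x) + w_y * phi_fn(y) - phi_fn(w_x * x + w_y * y)
    return 4.0 / (1 - alpha ** 2) * gap


def bregman(phi_fn, grad_fn, x, y):
    """Bregman divergence B(x, y) = Phi(x) - Phi(y) - <x - y, grad Phi(y)>."""
    return phi_fn(x) - phi_fn(y) - np.dot(x - y, grad_fn(y))


# Two arbitrary points in the interior of the domain (positive vectors).
x = np.array([0.5, 1.5, 2.0])
y = np.array([1.0, 0.7, 3.0])

# Referential duality: D^(alpha)(x, y) should equal D^(-alpha)(y, x).
d_forward = alpha_divergence(phi, x, y, 0.3)
d_dual = alpha_divergence(phi, y, x, -0.3)

# Near alpha = 1 the value should approach the Bregman divergence B(x, y).
d_near_one = alpha_divergence(phi, x, y, 1 - 1e-6)
b_xy = bregman(phi, grad_phi, x, y)

print(d_forward, d_dual)
print(d_near_one, b_xy)
```

With this choice of Phi the Bregman divergence is the generalized Kullback-Leibler divergence for unnormalized positive vectors, which is why the example uses positive inputs only. Under the assumed form, the limit alpha -> -1 gives the same quantity with its arguments swapped.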