Abstract

We describe a framework to build distances by measuring the tightness of inequalities and introduce the notion of proper statistical divergences and improper pseudo-divergences. We then consider the Hölder ordinary and reverse inequalities and present two novel classes of Hölder divergences and pseudo-divergences that both encapsulate the special case of the Cauchy–Schwarz divergence. We report closed-form formulas for those statistical dissimilarities when considering distributions belonging to the same exponential family provided that the natural parameter space is a cone (e.g., multivariate Gaussians) or affine (e.g., categorical distributions). Those new classes of Hölder distances are invariant to rescaling and thus do not require distributions to be normalized. Finally, we show how to compute statistical Hölder centroids with respect to those divergences and carry out center-based clustering toy experiments on a set of Gaussian distributions, demonstrating empirically that symmetrized Hölder divergences outperform the symmetric Cauchy–Schwarz divergence.

Highlights

  • We introduce two novel families of log-ratio projective gap divergences based on the Hölder ordinary and reverse inequalities that extend the Cauchy–Schwarz divergence, study their properties, and consider clustering Gaussian distributions as an application: we experimentally obtain better clustering results with symmetrized Hölder divergences than with the Cauchy–Schwarz divergence

  • Since the Hölder pseudo-divergences (HPDs) are projective divergences, we further define, with respect to the conjugate exponents α and β, the Hölder escort divergences (HEDs) obtained by applying the HPD to escort distributions

  • We report closed-form formulas for the HPD and Hölder proper divergences (HDs) between two distributions belonging to the same exponential family provided that the natural parameter space is a cone or affine (see the numerical sketch below)
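As a concrete illustration of the closed-form highlight above, here is a minimal Python sketch (our own illustration, not code from the paper; the function names and the specific two-exponent form are assumptions) that evaluates the basic Hölder-inequality-gap divergence between two univariate Gaussians directly from the log-normalizer F of their exponential family. For conjugate exponents α = β = 2 it reduces to the Cauchy–Schwarz divergence; the paper's HPD and HD definitions may carry additional parameters.

```python
import numpy as np

def gaussian_natural_params(mu, sigma2):
    """Natural parameters (theta1, theta2) of a univariate Gaussian N(mu, sigma2)."""
    return np.array([mu / sigma2, -1.0 / (2.0 * sigma2)])

def log_normalizer(theta):
    """Log-normalizer F(theta) of the univariate Gaussian exponential family."""
    t1, t2 = theta
    return -t1 * t1 / (4.0 * t2) + 0.5 * np.log(-np.pi / t2)

def holder_gap_divergence_ef(theta_p, theta_q, alpha):
    """Hölder-inequality-gap divergence between two members of the same exponential
    family, in closed form via the log-normalizer F.  Valid when the natural parameter
    space is a cone or affine, so that alpha*theta_p, beta*theta_q and theta_p + theta_q
    all stay inside it (true for Gaussians).  beta is the conjugate exponent of alpha."""
    beta = alpha / (alpha - 1.0)          # 1/alpha + 1/beta = 1, with alpha > 1
    return (log_normalizer(alpha * theta_p) / alpha
            + log_normalizer(beta * theta_q) / beta
            - log_normalizer(theta_p + theta_q))

# alpha = beta = 2 recovers the Cauchy–Schwarz divergence between the two Gaussians.
theta_p = gaussian_natural_params(mu=0.0, sigma2=1.0)
theta_q = gaussian_natural_params(mu=1.0, sigma2=2.0)
print(holder_gap_divergence_ef(theta_p, theta_q, alpha=2.0))   # non-negative; 0 iff the Gaussians coincide
print(holder_gap_divergence_ef(theta_p, theta_p, alpha=2.0))   # 0.0
```

The computation only requires that αθ_p, βθ_q and θ_p + θ_q remain valid natural parameters, which is guaranteed when the natural parameter space is a cone (as for Gaussians) or affine.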

Summary

Statistical Divergences from Inequality Gaps

An inequality [1] is denoted mathematically by lhs ≤ rhs, where lhs and rhs denote respectively the left-hand side and right-hand side of the inequality. Bi-parametric homogeneous inequalities yield corresponding log-ratio projective divergences: let lhs(p : q) and rhs(p : q) be homogeneous functions of degree k ∈ N (i.e., lhs(λp : λ′q) = (λλ′)^k lhs(p : q) and rhs(λp : λ′q) = (λλ′)^k rhs(p : q)); it then follows that the induced log-ratio gap divergence D(p : q) = log(rhs(p : q) / lhs(p : q)) ≥ 0 is projective, i.e., D(λp : λ′q) = D(p : q) for all λ, λ′ > 0.

This work is further extended in [14], where Zhang stresses the two different types of duality in information geometry: the referential duality and the representational duality (with the study of the (ρ, τ)-geometry for monotone embeddings). It is well known that the Rényi divergence generalizes the Kullback–Leibler divergence: the Rényi divergence is induced by the Rényi entropy, which generalizes the Shannon entropy while keeping the important feature of being additive. In [16], a generalization of Rényi divergences is proposed, and its induced geometry is investigated.
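As a concrete numerical illustration of the inequality-gap construction above, the following sketch (our own; the function name is hypothetical) applies it to Hölder's inequality on positive discrete arrays: the log-ratio of the right-hand side to the left-hand side is non-negative and is unchanged when either argument is rescaled, which is exactly the projective property, and α = β = 2 recovers the Cauchy–Schwarz divergence.

```python
import numpy as np

def holder_log_ratio_gap(p, q, alpha=2.0):
    """Log-ratio gap divergence induced by Hölder's inequality
       sum(p*q) <= (sum(p**alpha))**(1/alpha) * (sum(q**beta))**(1/beta)
    for positive arrays p, q and conjugate exponents (1/alpha + 1/beta = 1).
    alpha = beta = 2 gives the Cauchy–Schwarz divergence."""
    beta = alpha / (alpha - 1.0)
    lhs = np.sum(p * q)
    rhs = np.sum(p ** alpha) ** (1.0 / alpha) * np.sum(q ** beta) ** (1.0 / beta)
    return np.log(rhs / lhs)   # >= 0, with equality iff p**alpha is proportional to q**beta

rng = np.random.default_rng(0)
p, q = rng.random(10), rng.random(10)     # unnormalized positive "densities"

d = holder_log_ratio_gap(p, q, alpha=3.0)
d_scaled = holder_log_ratio_gap(5.0 * p, 0.2 * q, alpha=3.0)
print(d, d_scaled)                        # equal up to floating point: the divergence is projective
```

Because both sides of Hölder's inequality are homogeneous in each argument, the rescaling factors cancel inside the log-ratio; this is why these divergences do not require the distributions to be normalized.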

Pseudo-Divergences and the Axiom of Indiscernibility
Prior Work and Contributions
Organization
Hölder Pseudo-Divergence
Definition
Properness and Improperness
Reference Duality
HPD is a Projective Divergence
Escort Distributions and Skew Bhattacharyya Divergences
Special Case
Limit Cases of Hölder Divergences and Statistical Estimation
Case Study
Approximating Hölder Projective Divergences for Statistical Mixtures
Hölder Centroids and Center-Based Clustering
Hölder Centroids
Clustering Based on Symmetric Hölder Divergences
Findings
Conclusions and Perspectives