Estimation of KL Divergence: Optimal Minimax Rate

Yuheng Bu, Venugopal V Veeravalli, Shaofeng Zou, Yingbin Liang

doi:10.1109/tit.2018.2805844

Open Access

https://doi.org/10.1109/tit.2018.2805844

Abstract

The problem of estimating the Kullback–Leibler divergence $D(P\|Q)$ between two unknown distributions $P$ and $Q$ is studied, under the assumption that the alphabet size $k$ of the distributions can scale to infinity. The estimation is based on $m$ independent samples drawn from $P$ and $n$ independent samples drawn from $Q$. It is first shown that there does not exist any consistent estimator that guarantees asymptotically small worst-case quadratic risk over the set of all pairs of distributions. A restricted set of pairs of distributions, with density ratio bounded by a function $f(k)$, is then considered. An augmented plug-in estimator is proposed, and its worst-case quadratic risk is shown to be within a constant factor of $\left(\frac{k}{m}+\frac{kf(k)}{n}\right)^{2}+\frac{\log^{2} f(k)}{m}+\frac{f(k)}{n}$, if $m$ and $n$ exceed a constant factor of $k$ and $kf(k)$, respectively. Moreover, the minimax quadratic risk is characterized to be within a constant factor of $\left(\frac{k}{m\log k}+\frac{kf(k)}{n\log k}\right)^{2}+\frac{\log^{2} f(k)}{m}+\frac{f(k)}{n}$, if $m$ and $n$ exceed a constant factor of $k/\log k$ and $kf(k)/\log k$, respectively. The lower bound on the minimax quadratic risk is characterized by employing a generalized Le Cam's method. A minimax optimal estimator is then constructed by employing both the polynomial approximation and the plug-in approaches.
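To make the plug-in idea concrete, here is a minimal Python sketch. The abstract does not spell out how the plug-in estimator is augmented, so the form used below (adding one to each count from $Q$ so that the estimated ratio stays finite when a symbol is unseen in the $Q$ samples) is an assumption, as are the function names `augmented_plugin_kl` and `plugin_risk_rate`. The second function simply evaluates the rate expression stated in the abstract, up to the unspecified constant factor.

```python
import math
from collections import Counter


def augmented_plugin_kl(samples_p, samples_q, alphabet):
    """Plug-in style estimate of D(P||Q) from samples.

    Assumed form (not confirmed by the abstract):
        sum_i (M_i / m) * log( (M_i / m) / ((N_i + 1) / n) )
    where M_i and N_i are the counts of symbol i in the P and Q samples.
    """
    m = len(samples_p)
    n = len(samples_q)
    counts_p = Counter(samples_p)
    counts_q = Counter(samples_q)

    estimate = 0.0
    for symbol in alphabet:
        m_i = counts_p[symbol]
        if m_i == 0:
            # Terms with empirical P-mass zero contribute nothing (0 * log 0 = 0).
            continue
        p_hat = m_i / m
        # The "+1" keeps the Q estimate strictly positive (assumed augmentation).
        q_hat = (counts_q[symbol] + 1) / n
        estimate += p_hat * math.log(p_hat / q_hat)
    return estimate


def plugin_risk_rate(k, m, n, f_k):
    """Rate from the abstract for the augmented plug-in estimator,
    ignoring the constant factor: (k/m + k f(k)/n)^2 + log^2 f(k)/m + f(k)/n."""
    return (k / m + k * f_k / n) ** 2 + math.log(f_k) ** 2 / m + f_k / n


if __name__ == "__main__":
    p_samples = ["a", "a", "b", "b"]
    q_samples = ["a", "b", "b", "b"]
    print(augmented_plugin_kl(p_samples, q_samples, alphabet="ab"))
    print(plugin_risk_rate(k=10, m=100, n=100, f_k=1.0))
```

Note that an estimate of this form is not guaranteed to be nonnegative on small samples, even though the true divergence is; the sketch makes no attempt to correct for that.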

Journal: IEEE Transactions on Information Theory
Publication Date: Apr 1, 2018
Citations: 86
License type: publisher-specific-oa

Similar Papers

Estimation of KL divergence between large-alphabet distributions
Yuheng Bu ... Venugopal V Veeravalli
01 Jul 2016

Generalized Good-Turing Improves Missing Mass Estimation
Amichai Painsky
Journal of the American Statistical Association | VOL. 118
31 Jan 2022

MMSE Bounds for Additive Noise Channels Under Kullback–Leibler Divergence Constraints on the Input Distribution
Alex Dytso ... Abdelhak M Zoubir
IEEE Transactions on Signal Processing | VOL. 67
15 Dec 2019

Universal and Composite Hypothesis Testing via Mismatched Divergence
Jayakrishnan Unnikrishnan ... Dayu Huang
IEEE Transactions on Information Theory | VOL. 57
01 Mar 2011
