Concentration inequalities for the empirical distribution of discrete distributions: beyond the method of types

Jay Mardia,Robert D Nowak,Tsachy Weissman,Ervin Tánczos,Jiantao Jiao

doi:10.1093/imaiai/iaz025

Abstract

Abstract We study concentration inequalities for the Kullback–Leibler (KL) divergence between the empirical distribution and the true distribution. Applying a recursion technique, we improve over the method of types bound uniformly in all regimes of sample size $n$ and alphabet size $k$, and the improvement becomes more significant when $k$ is large. We discuss the applications of our results in obtaining tighter concentration inequalities for $L_1$ deviations of the empirical distribution from the true distribution, and the difference between concentration around the expectation or zero. We also obtain asymptotically tight bounds on the variance of the KL divergence between the empirical and true distribution, and demonstrate their quantitatively different behaviours between small and large sample sizes compared to the alphabet size.

Full Text