Abstract

Contraction coefficients are distribution-dependent constants that are used to sharpen standard data processing inequalities for f-divergences (or relative f-entropies) and produce so-called “strong” data processing inequalities. For any bivariate joint distribution, i.e., any probability vector and stochastic matrix pair, it is known that contraction coefficients for f-divergences are upper bounded by unity and lower bounded by the contraction coefficient for χ²-divergence. In this paper, we elucidate that the upper bound is achieved when the joint distribution is decomposable, and the lower bound can be achieved by driving the input f-divergences of the contraction coefficients to zero. Then, we establish a linear upper bound on the contraction coefficients of joint distributions for a certain class of f-divergences using the contraction coefficient for χ²-divergence, and refine this upper bound for the salient special case of Kullback-Leibler (KL) divergence. Furthermore, we present an alternative proof of the fact that the contraction coefficients for KL and χ²-divergences are equal for bivariate Gaussian distributions (where the former coefficient may impose a bounded second moment constraint). Finally, we generalize the well-known result that contraction coefficients of stochastic matrices (after extremizing over all possible probability vectors) for all nonlinear operator convex f-divergences are equal. In particular, we prove that the so-called “less noisy” preorder over stochastic matrices can be equivalently characterized by any nonlinear operator convex f-divergence. As an application of this characterization, we also derive a generalization of Samorodnitsky’s strong data processing inequality.
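For orientation, the contraction coefficient referred to above is usually defined as follows; the notation ($P_X$ for the probability vector, $W$ for the stochastic matrix, $\eta_f$ for the coefficient) is a standard convention assumed here rather than taken from the paper's body:

\[
\eta_f(P_X, W) \;\triangleq\; \sup_{Q_X :\, 0 < D_f(Q_X \| P_X) < \infty} \frac{D_f(Q_X W \,\|\, P_X W)}{D_f(Q_X \,\|\, P_X)} ,
\]

so that the strong data processing inequality $D_f(Q_X W \| P_X W) \le \eta_f(P_X, W)\, D_f(Q_X \| P_X)$ holds, and the bounds discussed above read $\eta_{\chi^2}(P_X, W) \le \eta_f(P_X, W) \le 1$.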
