Abstract
We investigate the problem of quantifying contraction coefficients of Markov transition kernels in Kantorovich (L1 Wasserstein) distances. For diffusion processes, relatively precise quantitative bounds on contraction rates have recently been derived by combining appropriate couplings with carefully designed Kantorovich distances. In this paper, we partially carry over this approach from diffusions to Markov chains. We derive quantitative lower bounds on contraction rates for Markov chains on general state spaces that are powerful if the dynamics is dominated by small local moves. For Markov chains on R^d with isotropic transition kernels, the general bounds can be used efficiently together with a coupling that combines maximal and reflection coupling. The results are applied to Euler discretizations of stochastic differential equations with non-globally contractive drifts, and to the Metropolis-adjusted Langevin algorithm for sampling from a class of probability measures on high-dimensional state spaces that are not globally log-concave.
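For orientation, the notions of Kantorovich distance and contraction rate used in the abstract can be written out in their standard form. The symbols below (the metric d, the kernel p, the rate c, and the invariant measure π) are notational assumptions made for this sketch and need not match the paper's notation.

```latex
% Kantorovich (L1 Wasserstein) distance with respect to a metric d,
% where \Pi(\mu,\nu) denotes the set of couplings of \mu and \nu:
\[
  \mathcal{W}_d(\mu,\nu) \;=\; \inf_{\gamma \in \Pi(\mu,\nu)} \int d(x,y)\,\gamma(dx\,dy).
\]
% A transition kernel p is contractive with rate c \in (0,1] if, for all \mu, \nu,
\[
  \mathcal{W}_d(\mu p, \nu p) \;\le\; (1-c)\,\mathcal{W}_d(\mu,\nu).
\]
% Iterating this bound with an invariant measure \pi = \pi p gives
\[
  \mathcal{W}_d(\mu p^n, \pi) \;\le\; (1-c)^n\,\mathcal{W}_d(\mu,\pi).
\]
```

A quantitative lower bound on c therefore translates directly into a geometric convergence bound for the chain in the distance built from d.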
Highlights
Convergence bounds for Markov processes in Kantorovich (L1 Wasserstein) distances have emerged as a powerful alternative to more traditional approaches based on the total variation distance [36], spectral gaps and L2 bounds [27, 9, 10], or entropy estimates [27, 9, 1].
In [29], Joulin and Ollivier have shown that strict Kantorovich contractivity of the transition kernel implies bounds for the variance and concentration estimates for ergodic averages of a Markov chain.
Pillai and Smith [39] as well as Rudolf and Schweizer [40] have developed a perturbation theory for Markov chains that are contractive in a Kantorovich distance, cf. Huggins and Zou [26] as well as Johndrow and Mattingly [28] for related results.
Summary
Convergence bounds for Markov processes in Kantorovich (L1 Wasserstein) distances have emerged as a powerful alternative to more traditional approaches based on the total variation distance [36], spectral gaps and L2 bounds [27, 9, 10], or entropy estimates [27, 9, 1]. Contractivity with respect to the L1 Wasserstein distance based on the Euclidean distance in R^d is sometimes interpreted as non-negative Ricci curvature of the Markov chain w.r.t. this metric [41, 29, 37]. This is a strong condition that is often not satisfied in applications. The approach taken here, which combines suitable couplings with specially designed Kantorovich distances, is powerful in situations where the dynamics is dominated by small, local moves. This will be demonstrated below for Euler schemes for non-globally contractive stochastic differential equations, as well as for the Metropolis-adjusted Langevin algorithm (MALA). In these cases, the Ricci curvature condition required in [29] is not satisfied in the standard L1 Wasserstein distance, and the construction of an alternative metric is required. Sometimes, related approaches can be used; see, e.g., [2] for the construction of a contractive distance for Hamiltonian Monte Carlo.
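To make the coupling idea concrete for an Euler scheme, the following is a minimal sketch of one well-known way to combine maximal and reflection coupling for Gaussian increments with a common isotropic covariance. It is not claimed to be the paper's exact construction. The double-well drift, the noise scaling sqrt(h), the step size, and all function names are illustrative assumptions.

```python
import numpy as np


def drift(x):
    # Hypothetical double-well drift b(x) = x - |x|^2 x (an assumption for
    # illustration): it pushes away from the origin, so it is not globally contractive.
    return x - np.dot(x, x) * x


def coupled_euler_step(x, y, h, rng):
    """One Euler step for two chains with coupled Gaussian noise.

    With probability min(1, phi(xi + z) / phi(xi)) the two chains are moved to
    the same point (maximal coupling); otherwise the noise of the second chain
    is the reflection of the first chain's noise (reflection coupling).
    """
    s = np.sqrt(h)                      # assumed noise scale of the Euler scheme
    mx = x + h * drift(x)               # mean of the first chain's next state
    my = y + h * drift(y)               # mean of the second chain's next state
    xi = rng.standard_normal(x.shape)   # noise driving the first chain
    x_new = mx + s * xi

    z = (mx - my) / s                   # normalized difference of the means
    r = np.linalg.norm(z)
    if r == 0.0:
        # Identical means: use the same noise so the chains stay together.
        return x_new, x_new.copy()

    # Log of the standard normal density ratio phi(xi + z) / phi(xi).
    log_ratio = -0.5 * (np.dot(xi + z, xi + z) - np.dot(xi, xi))
    if rng.uniform() <= np.exp(min(0.0, log_ratio)):
        # Maximal-coupling branch: my + s * (xi + z) equals mx + s * xi,
        # so both chains jump to the same point.
        return x_new, x_new.copy()

    # Reflection branch: reflect the noise at the hyperplane orthogonal to z.
    e = z / r
    eta = xi - 2.0 * np.dot(e, xi) * e
    return x_new, my + s * eta


def run_coupled_chains(x0, y0, h=0.01, n_steps=2000, seed=0):
    """Return the Euclidean distances |X_n - Y_n| along one coupled trajectory."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    dists = [np.linalg.norm(x - y)]
    for _ in range(n_steps):
        x, y = coupled_euler_step(x, y, h, rng)
        dists.append(np.linalg.norm(x - y))
    return np.array(dists)
```

Calling `run_coupled_chains([1.0, 0.0], [-1.0, 0.0])` yields the distance between two chains started in different wells. Averaging a suitable function of this distance over many seeds gives a Monte Carlo estimate of an upper bound on the corresponding Kantorovich distance, since any coupling bounds the infimum over couplings from above.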