A Convex Optimization Approach to Distributionally Robust Markov Decision Processes With Wasserstein Distance

Insoon Yang

doi:10.1109/lcsys.2017.2711553

Open Access

Journal: IEEE Control Systems Letters
Publication Date: Jul 1, 2017
Citations: 82
License type: publisher-specific, author manuscript

Affiliation: University of Southern California

Abstract

We consider the problem of constructing control policies that are robust against distribution errors in the model parameters of Markov decision processes. The Wasserstein metric is used to model the ambiguity set of admissible distributions. We prove the existence and optimality of Markov policies and develop convex optimization-based tools to compute and analyze the policies. Our methods, which are based on the Kantorovich convex relaxation and duality principle, have the following advantages. First, the proposed dual formulation of an associated Bellman equation resolves the infinite dimensionality issue that is inherent in its original formulation when the nominal distribution has a finite support. Second, our duality analysis identifies the structure of a worst-case distribution and provides a simple decentralized method for its construction. Third, a sensitivity analysis tool is developed to quantify the effect of ambiguity set parameters on the performance of distributionally robust policies. The effectiveness of our proposed tools is demonstrated through a human-centered air conditioning problem.
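The duality idea mentioned in the abstract can be illustrated on a single worst-case expectation. The following is a minimal sketch, not the paper's algorithm: it assumes a scalar uncertain parameter, the 1-Wasserstein distance with cost |z - w|, a nominal distribution that places equal weight on finitely many samples, and candidate distributions restricted to a finite grid. Under these assumptions the supremum of E_Q[f] over the Wasserstein ball of radius theta is a finite linear program, and by LP duality it equals the minimum over lam >= 0 of lam * theta + (1/N) * sum_i max_z [f(z) - lam * |z - w_i|], a one-dimensional convex problem. The function name `worst_case_expectation` and all parameter names are hypothetical.

```python
def worst_case_expectation(f, samples, grid, theta, iters=200):
    """Worst-case E_Q[f] over a 1-Wasserstein ball of radius theta (illustrative sketch).

    Assumptions (not taken from the paper): scalar parameter, cost |z - w|,
    equally weighted nominal samples, candidate support limited to a finite grid.
    Returns (optimal dual value, optimal multiplier lam).
    """
    # Candidate support: the grid plus the nominal samples themselves.
    support = sorted(set(grid) | set(samples))
    values = [f(z) for z in support]

    def dual(lam):
        # lam * theta + (1/N) * sum_i max_z [ f(z) - lam * |z - w_i| ]
        inner = [
            max(v - lam * abs(z - w) for z, v in zip(support, values))
            for w in samples
        ]
        return lam * theta + sum(inner) / len(samples)

    # For lam at or above this bound, each inner maximum is attained at the
    # sample itself, so the dual is nondecreasing and the minimizer lies below.
    gaps = [b - a for a, b in zip(support, support[1:])]
    lam_hi = (max(values) - min(values)) / min(gaps) if gaps else 0.0

    # Ternary search: the dual is convex and piecewise linear in lam.
    lo, hi = 0.0, lam_hi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dual(m1) <= dual(m2):
            hi = m2
        else:
            lo = m1
    lam = 0.5 * (lo + hi)
    return dual(lam), lam


# Usage: linear cost f(z) = z, two nominal samples, grid on [0, 10].
samples = [0.0, 1.0]
grid = [k / 10 for k in range(101)]
value, lam = worst_case_expectation(lambda z: z, samples, grid, theta=0.5)
print(value, lam)
```

For a linear cost the adversary simply shifts mass upward until the transport budget is spent, so the result in this usage example should be close to the nominal mean plus theta, with a multiplier close to 1. The point of the sketch is only that the search over distributions collapses to a search over one scalar multiplier when the nominal distribution has finite support.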

Full Text