Abstract
The literature on continuous-time stochastic optimal control seldom deals with the case of discrete state spaces. In this paper, we provide a general framework for the optimal control of continuous-time Markov chains on finite graphs. In particular, we provide results on the long-term behavior of value functions and optimal controls, along with results on the associated ergodic Hamilton-Jacobi equation.
Highlights
Optimal control is the field of mathematics dealing with the problem of how best to control a dynamical system according to a given optimality criterion.
Since the 1950s and the seminal works of Bellman and Pontryagin, the successful applications have been so numerous, and in so many domains, that optimal control theory can be regarded as one of the major contributions of applied mathematicians in the second half of the 20th century. In spite of this widespread use, it is noteworthy that the theories of optimal control and of stochastic optimal control have mainly been developed either in continuous time on a continuous state space, with tools coming from the calculus of variations, Euler-Lagrange equations, Hamilton-Jacobi(-Bellman) equations, the notion of viscosity solutions, etc., or in discrete time, on both discrete and continuous state spaces, with contributions coming from both mathematics and computer science/machine learning.
Stochastic optimal control of continuous-time Markov chains on discrete state spaces is rarely tackled in the literature.
Summary
Optimal control is the field of mathematics dealing with the problem of how best to control a dynamical system according to a given optimality criterion. The basic results on the associated Hamilton-Jacobi equations are elementary – they do not require the notion of viscosity solutions – and have already been derived in a similar manner for the more general case of mean field games on graphs (see [8]). In this framework, we derive a result that is absent from the literature on the control of continuous-time Markov chains on discrete state spaces: the long-term behavior of the value functions and the optimal controls, i.e. their behavior as the time horizon goes to infinity.
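To make the setting concrete, here is a minimal numerical sketch of the kind of Hamilton-Jacobi system that arises for a controlled continuous-time Markov chain on a finite graph. All specifics are illustrative assumptions, not taken from the paper: we pick a 3-node graph, a running cost f, a quadratic cost on the controlled jump intensities, and integrate the resulting backward ODE system for the value function V with an explicit Euler scheme. With these choices the HJB system reads -dV_i/dt = f_i - ½ Σ_{j~i} max(0, V_i - V_j)², with terminal condition V(T) = g, and the optimal jump intensity from i to j is max(0, V_i - V_j). The long-term behavior discussed in the summary can then be observed numerically: as the horizon T grows, V_i(0)/T approaches a constant (the ergodic constant) independent of the state i.

```python
import numpy as np

def solve_hjb(T, f, g, adj, n_steps_per_unit=200):
    """Backward-in-time explicit Euler for an illustrative HJB system of a
    controlled CTMC on a finite graph (cost minimisation, quadratic control
    cost on jump intensities):
        -dV_i/dt = f_i - 0.5 * sum_{j ~ i} max(0, V_i - V_j)**2,  V_i(T) = g_i.
    Returns the value function V at time 0."""
    n = len(f)
    n_steps = int(T * n_steps_per_unit)
    dt = T / n_steps
    V = g.astype(float).copy()          # start from the terminal condition
    for _ in range(n_steps):
        rhs = np.array([
            f[i] - 0.5 * sum(max(0.0, V[i] - V[j]) ** 2 for j in adj[i])
            for i in range(n)
        ])
        V = V + dt * rhs                # step backward: V(t - dt) = V(t) + dt * rhs
    return V

# Hypothetical example: a 3-node complete graph with state costs f and zero
# terminal cost. Comparing V(0)/T for two horizons illustrates the long-term
# behavior: the normalised value converges to the ergodic constant.
f = np.array([1.0, 0.0, 2.0])
g = np.zeros(3)
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
V10 = solve_hjb(10.0, f, g, adj)
V20 = solve_hjb(20.0, f, g, adj)
```

The differences V_i(0) - V_j(0) also stabilize as T grows, which is the second part of the long-term behavior described above: the value function splits asymptotically into a linear-in-time part (the ergodic constant) plus a fixed "corrector" profile over the states.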
Published in: ESAIM: Control, Optimisation and Calculus of Variations