Abstract
The rate at which Markov decision processes converge as the horizon length increases can be important for computations and judging the appropriateness of models. The convergence rate is commonly associated with the discount factor α. For example, the total value function for a broad set of problems is known to converge O(α^n), i.e., geometrically with the discount factor. But the rate at which the finite horizon optimal policies converge depends on the convergence of the relative value function. (Relative value at a given state is the difference between total value at that state and total value at some fixed reference state.) Relative value convergence in turn depends both on the discount factor and on ergodic properties of the underlying nonhomogeneous Markov chains. We show in particular that for the stationary finite state space compact action space Markov decision problem, the relative value function converges O((αλ)^n) for all λ > r(P), the argument of the subdominant eigenvalue of the optimal infinite horizon policy (assumed unique). Easily obtained bounds for r(P) are also given which are related to those of A. Brauer. Under additional restrictions, policy convergence is shown to be of the same order as relative value convergence, generalizing work of Shapiro, Schweitzer, and Odoni. The same result gives convergence properties for the undiscounted problem and for the case α > 1. If αr(P) > 1 the problem does not converge. As a by-product of the analysis, necessary conditions are given for the relative value function to converge O((αλ)^n), 0 < αλ < 1, for the nonstationary problem.
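The difference between the two rates can be seen numerically in a small sketch. It is not taken from the paper: it assumes a hypothetical two-state chain under a single fixed policy (so no optimization over actions takes place), with made-up transition probabilities, rewards, and discount factor. The subdominant eigenvalue of this chain is 0.7, so the total value error should shrink by a factor near α per step, while the relative value error should shrink by a factor near 0.7α.

```python
# Hypothetical two-state chain under one fixed policy (illustrative numbers only).
P = [[0.9, 0.1],
     [0.2, 0.8]]      # eigenvalues are 1 and 0.7; 0.7 is the subdominant one
r = [1.0, 0.0]        # assumed one-step rewards
alpha = 0.95          # assumed discount factor


def backup(v):
    """One step of the finite-horizon recursion v_{n+1} = r + alpha * P v_n."""
    return [r[i] + alpha * sum(P[i][j] * v[j] for j in range(2)) for i in range(2)]


def relative(v):
    """Relative value of state 1, using state 0 as the reference state."""
    return v[1] - v[0]


# Iterate long enough that the result stands in for the infinite-horizon value.
v_inf = [0.0, 0.0]
for _ in range(2000):
    v_inf = backup(v_inf)

# Track how fast the n-horizon values approach the infinite-horizon ones.
v = [0.0, 0.0]
total_err, rel_err = [], []
for n in range(30):
    total_err.append(abs(v[0] - v_inf[0]))
    rel_err.append(abs(relative(v) - relative(v_inf)))
    v = backup(v)

# Ratio of successive errors estimates the geometric convergence factor.
total_ratio = total_err[-1] / total_err[-2]
rel_ratio = rel_err[-1] / rel_err[-2]
print(f"total value error ratio:    {total_ratio:.4f}")
print(f"relative value error ratio: {rel_ratio:.4f}")
```

Because the reference-state subtraction cancels the component of the error along the constant vector, only the subdominant part of the chain remains in the relative value error, which is why it decays faster than the total value error in this toy case.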