Abstract

This paper demonstrates how a Markov decision process (MDP) can be approximated to generate a policy bound, i.e., a function that bounds the optimal policy from below or from above for all states. We present sufficient conditions for several computationally attractive approximations to generate rigorous policy bounds. These approximations include approximating the optimal value function, replacing the original MDP with a separable approximate MDP, and approximating a stochastic MDP with its deterministic counterpart. An example from the field of fisheries management demonstrates the practical applicability of the results.
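Here a lower policy bound is a function pi_L with pi_L(s) <= pi*(s) for every state s, where pi* is the optimal policy, and an upper bound is defined symmetrically. As a rough illustration of the third approximation, the sketch below builds a hypothetical discretized fisheries MDP (the stock grid, growth rule, shock distribution, and harvest reward are assumptions for illustration, not the paper's model), solves both the stochastic model and its deterministic counterpart by value iteration, and compares the resulting harvest policies. Only under sufficient conditions of the kind the paper establishes would the deterministic policy bound the stochastic optimum state by state; the sketch itself merely computes and prints the two policies.

import numpy as np

# Hypothetical toy fisheries model (illustrative assumptions, not the paper's example).
# State: fish stock on a discrete grid; action: harvest amount <= stock.
# Escapement e = stock - harvest grows by a logistic-style rule, then is scaled
# by a multiplicative shock z (stochastic case) or by E[z] (deterministic case).

GRID = np.linspace(0.0, 10.0, 51)       # discretized stock / harvest levels
SHOCKS = np.array([0.8, 1.0, 1.2])      # multiplicative growth shocks
PROBS = np.array([0.25, 0.5, 0.25])     # shock probabilities (mean shock = 1.0)
BETA = 0.95                             # discount factor

def growth(e):
    # Deterministic stock-recruitment rule applied to escapement e.
    return np.clip(e + 0.5 * e * (1.0 - e / 10.0), GRID[0], GRID[-1])

def nearest(x):
    # Index of the grid point closest to each stock level in x.
    return np.abs(GRID[:, None] - np.atleast_1d(x)[None, :]).argmin(axis=0)

def solve(shocks, probs, iters=500):
    # Value iteration; returns the greedy policy (harvest index for each state).
    n = len(GRID)
    V = np.zeros(n)
    for _ in range(iters):
        Q = np.full((n, n), -np.inf)
        for s, stock in enumerate(GRID):
            for a, harvest in enumerate(GRID):
                if harvest > stock:
                    break                 # cannot harvest more than the stock
                nxt = nearest(growth(stock - harvest) * shocks)
                Q[s, a] = np.sqrt(harvest) + BETA * (probs @ V[nxt])
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

pi_stoch = solve(SHOCKS, PROBS)                               # original stochastic MDP
pi_det = solve(np.array([PROBS @ SHOCKS]), np.array([1.0]))   # deterministic counterpart

print("stock  harvest(stochastic)  harvest(deterministic)")
for s in range(0, len(GRID), 10):
    print(f"{GRID[s]:5.1f}  {GRID[pi_stoch[s]]:19.1f}  {GRID[pi_det[s]]:21.1f}")

Whether the deterministic policy printed above lies weakly below (or above) the stochastic one at every stock level depends on structural properties of the model, which is exactly the kind of condition the paper's results are meant to verify.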
