Abstract

We propose a new approach to solving dynamic decision problems with unbounded rewards, based on the transformations used in Q-learning. In our case, however, the objective of the transform is not learning. Rather, it is to convert an unbounded dynamic program into a bounded one. The approach is general enough to handle problems that existing methods struggle with, yet it is simple relative to other techniques and accessible for applied work. We show by example that a variety of common decision problems satisfy our conditions.
