Abstract

In this paper we consider finite-stage stochastic optimization problems with a utility criterion, i.e., the stochastic evaluation of an associative reward through a utility function. We optimize the expected value of the utility criterion not in the class of Markov policies but in the class of general policies. We show that, by expanding the state space, an invariant imbedding approach yields a recursive relation between two adjacent optimal value functions. We show that the utility problem with a general policy is equivalent to a terminal problem with a Markov policy on the augmented state space. Finally, it is shown that the utility problem has an optimal policy in the class of general policies on the original state space.
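The state-augmentation idea can be illustrated on a toy instance. The following sketch is not the paper's construction, only a minimal illustration under stated assumptions: a hypothetical two-state, two-action, two-stage problem in which stage rewards accumulate additively (addition being one associative operation) and the criterion is the expected utility of the accumulated reward. Carrying the accumulated reward in the state turns the problem into a terminal-reward Markov problem, and exhaustive search over all general (history-dependent) policies recovers the same optimal value. All numerical data below are illustrative.

```python
import itertools

# Hypothetical tiny finite-horizon problem: 2 states, 2 actions, 2 stages.
STATES = [0, 1]
ACTIONS = [0, 1]
HORIZON = 2

def transition(s, a):
    """Return [(next_state, probability), ...] -- illustrative numbers."""
    p = 0.7 if a == 0 else 0.4
    return [(s, p), (1 - s, 1 - p)]

def reward(s, a):
    return s + a + 1  # illustrative stage reward

def utility(total):
    return total ** 0.5  # a concave (risk-averse) utility function

def V(t, s, acc):
    """Optimal expected utility from stage t in the augmented state (s, acc).
    Including the accumulated reward acc in the state makes the criterion a
    terminal reward, so ordinary backward induction applies."""
    if t == HORIZON:
        return utility(acc)
    return max(
        sum(p * V(t + 1, s2, acc + reward(s, a))
            for s2, p in transition(s, a))
        for a in ACTIONS
    )

def expected_utility(a0, a1map):
    """Expected utility of a general (history-dependent) two-stage policy:
    a0 is the first action; a1map maps each history (s0, a0, s1) to an action."""
    total = 0.0
    for s1, p1 in transition(0, a0):
        a1 = a1map[(0, a0, s1)]
        acc = reward(0, a0) + reward(s1, a1)
        for s2, p2 in transition(s1, a1):
            total += p1 * p2 * utility(acc)
    return total

# Exhaustive search over all general policies agrees with the DP value on the
# augmented state space, illustrating the equivalence on this toy instance.
histories = [(0, a0, s1) for a0 in ACTIONS for s1 in STATES]
best_general = max(
    expected_utility(a0, dict(zip(histories, choice)))
    for a0 in ACTIONS
    for choice in itertools.product(ACTIONS, repeat=len(histories))
)
dp_value = V(0, 0, 0.0)
print(dp_value, best_general)
```

In this example the two values coincide, as the equivalence result leads one to expect; the recursion enumerates only (stage, state, accumulated reward) triples rather than full histories.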
