Abstract

This paper studies a discrete-time Markov decision model with a finite state space, an arbitrary action space, and a bounded reward function under average reward criteria. We consider four average reward criteria and prove the existence of persistently nearly optimal strategies in various classes of strategies for models with complete state information. We show that such strategies exist in any class of strategies satisfying the following condition: along any trajectory, the controller knows different information about the past at different epochs. Although neither optimal strategies nor stationary nearly optimal strategies need exist, we show that for some nonempty set of states the described nearly optimal strategies may be chosen to be either stationary or optimal.
