Abstract
Infinite-horizon Markovian decision processes with R^p-valued additive utilities are considered. The optimization criterion here is a pseudo-order preference relation induced by a convex cone in R^p. The state space is a countable set, and the action space is a compact metric space. Certain continuity assumptions on the reward vector and the transition probability are made. In this setting, an algorithm that improves policies with respect to the chosen preference relation is given. A point-to-set mapping is defined, and optimal policies are characterized as the fixed points of this mapping that are maximal in the set of all fixed points.
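The setting described above can be illustrated with a minimal sketch: a toy finite MDP with rewards in R^2, vector-valued policy evaluation, and a dominance check under the positive-orthant cone R^2_+ (one concrete example of a convex cone inducing a pseudo-order). All numbers, names, and the choice of cone below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical toy MDP with 2 states, 2 actions, rewards in R^2 (p = 2).
# All values are made up for illustration.
P = np.array([  # P[s, a, s'] transition probabilities
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
])
R = np.array([  # R[s, a] in R^2: two reward criteria per state-action pair
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.2, 0.8]],
])
beta = 0.9  # discount factor

def evaluate(policy):
    """Vector-valued value V[s] in R^2 of a stationary policy (one action per state)."""
    n = P.shape[0]
    Ppi = P[np.arange(n), policy]   # n x n transition matrix under the policy
    Rpi = R[np.arange(n), policy]   # n x 2 expected reward vectors under the policy
    # Solve (I - beta * Ppi) V = Rpi componentwise for each reward criterion.
    return np.linalg.solve(np.eye(n) - beta * Ppi, Rpi)

def cone_dominates(v, w):
    """v >= w in the pseudo-order induced by the cone R^2_+ (componentwise order)."""
    return bool(np.all(v >= w - 1e-12))

# A policy "improves on" another (in this example's ordering) if its value
# vector cone-dominates the other's at every state.
v0 = evaluate(np.array([0, 0]))
v1 = evaluate(np.array([1, 1]))
improves = all(cone_dominates(a, b) for a, b in zip(v1, v0))
```

Changing the cone (e.g. to a half-space or a polyhedral cone) changes `cone_dominates` and hence which policies are comparable; with a proper cone the order is only partial, which is why optimality is phrased via maximal fixed points rather than a single maximum.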