Abstract

This paper concerns discrete-time infinite-horizon stochastic control systems with Borel state and action spaces and universally measurable policies. We study optimization problems on strategic measures induced by the policies in these systems. The results are then applied to risk-neutral and risk-sensitive Markov decision processes to establish the measurability of the optimal value functions and the existence of universally measurable, randomized or nonrandomized, ϵ-optimal policies, for a variety of average cost criteria and risk criteria. We also extend our analysis to a class of minimax control problems and establish similar optimality results under the axiom of analytic determinacy.

Funding: This work was supported by grants from DeepMind, the Alberta Machine Intelligence Institute (AMII), and Alberta Innovates-Technology Futures (AITF).
