Abstract
We study a class of Markov Decision Processes (MDPs) under stationary constraints (particularly, but not exclusively, chance constraints). Our model focuses on problems where the state space is finite but the control at each state may take real values. It is straightforward to formulate the problem as a constrained nonlinear optimization problem, but solving it requires projection, penalty, or gradient-based methods that may show slow convergence, particularly because this type of problem may not be strictly convex. In this paper we extend results that are known for the discrete control model and show that the solution of a binary randomized control problem is optimal under the assumption that the dependency on the control variable is linear. If the dependency is piecewise linear with several breakpoints, then the randomized problem is not equivalent to the original problem; however, we show that a particular solution always exists that is also the solution to the original continuous problem. This special solution takes the form of a two-action control policy for each state. Solving each two-action randomized problem can be done in polynomial time by linear programming (LP). This leads to an efficient solution when the number of states is small, but for a large state space the complexity can be prohibitive. In a restricted yet interesting special case of the piecewise-linear, multiple-action setting where the actions are linearly ordered, we show that the problem can again be reduced to a single LP. To overcome the computational complexity in the general case, we propose an alternative approach that involves solving a sequence of linear programs modeling the stationary measure and the optimal solution concurrently. We conclude the paper by discussing future extensions of this framework to application settings where the state space is continuous as well.
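To make the LP connection concrete, the following is a minimal sketch of the standard occupation-measure linear program for a finite-state, finite-action constrained MDP under average-reward stationarity, the formulation that underlies the randomized two-action policies mentioned above. All numerical data (transition kernel `P`, reward `r`, cost `cost`, budget `d`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

# Toy constrained MDP (all numbers are illustrative assumptions):
# 2 states, 2 actions per state.
nS, nA = 2, 2
# P[s, a, s'] = transition probability from s to s' under action a.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
r = np.array([[1.0, 0.0], [0.0, 2.0]])      # reward r(s, a)
cost = np.array([[0.0, 1.0], [1.0, 0.0]])   # constraint cost c(s, a)
d = 0.3                                      # stationary constraint: E[c] <= d

# Decision variable: occupation measure x(s, a), flattened to length nS*nA.
# Balance constraints: sum_a x(s', a) = sum_{s,a} P[s, a, s'] x(s, a).
A_eq = np.zeros((nS + 1, nS * nA))
for sp in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[sp, s * nA + a] = P[s, a, sp] - (1.0 if s == sp else 0.0)
A_eq[nS, :] = 1.0                            # normalization: sum_{s,a} x(s,a) = 1
b_eq = np.zeros(nS + 1)
b_eq[nS] = 1.0

# Maximize expected reward = minimize -r . x, subject to cost . x <= d.
res = linprog(c=-r.ravel(),
              A_ub=cost.ravel()[None, :], b_ub=[d],
              A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
x = res.x.reshape(nS, nA)
# Recover a randomized stationary policy: pi(a|s) proportional to x(s, a).
pi = x / x.sum(axis=1, keepdims=True)
```

Because the objective and constraints are linear in the occupation measure, a basic optimal solution randomizes over at most one extra action per binding constraint, which is the mechanism behind the two-action policies the abstract describes.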