Borel Spaces Research Articles

In this paper we study discrete-time Markov decision processes in Borel spaces with a finite number of constraints and with unbounded rewards and costs. Our aim is to provide a simple method to compute constrained optimal control policies when the payoff functions and the constraints are of either infinite-horizon discounted type or average (a.k.a. ergodic) type. To deduce optimality results for the discounted case, we use the Lagrange multipliers method, which rewrites the original problem (with constraints) as a parametric family of discounted unconstrained problems. Based on the dynamic programming technique, along with a simple use of elementary differential calculus, we obtain both suitable Lagrange multipliers and a family of control policies associated with these multipliers; this family is optimal for the original problem with constraints. We next apply the vanishing discount factor method to obtain, in a straightforward way, optimal control policies for the average problem with constraints. Finally, to illustrate our results, we provide a simple application to linear–quadratic systems (LQ-systems).
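
As a rough illustration of the Lagrangian step described in this abstract, the constrained discounted problem and its unconstrained parametric family might be written as follows. The notation is an assumption made here for illustration and is not taken from the paper: $V_0^{\alpha}$ denotes the discounted payoff, $V_i^{\alpha}$ the discounted constraint costs, $\theta_i$ the constraint bounds, $\lambda_i$ the multipliers, and $\rho_i$ the average criterion. The sign convention (maximizing a reward under upper bounds on costs) is also assumed, and the last line is only the heuristic behind the vanishing discount approach, which requires suitable conditions to hold.

```latex
\begin{align*}
&\text{maximize } V_0^{\alpha}(\pi, x)
  \quad \text{subject to } V_i^{\alpha}(\pi, x) \le \theta_i,
  \qquad i = 1, \dots, q, \\[4pt]
&L^{\alpha}(\pi, \lambda, x)
  = V_0^{\alpha}(\pi, x) - \sum_{i=1}^{q} \lambda_i \bigl( V_i^{\alpha}(\pi, x) - \theta_i \bigr),
  \qquad \lambda_i \ge 0, \\[4pt]
&\rho_i(\pi, x) = \lim_{\alpha \uparrow 1} (1 - \alpha)\, V_i^{\alpha}(\pi, x)
  \qquad \text{(vanishing discount heuristic)}.
\end{align*}
```

For each fixed $\lambda$, maximizing $L^{\alpha}(\cdot, \lambda, x)$ over policies is an unconstrained discounted problem that dynamic programming can address; the multipliers are then chosen so that the resulting policy satisfies the constraints.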

We consider stochastic optimal control models with Borel spaces and universally measurable policies. For such models the standard policy iteration is known to have difficult measurability issues and cannot be carried out in general. We present a mixed value and policy iteration method that circumvents this difficulty. The method allows the use of stationary policies in computing the optimal cost function in a manner that resembles policy iteration. It can also be used to address similar difficulties of policy iteration in the context of upper and lower semicontinuous models. We analyze the convergence of the method in infinite horizon total cost problems for the discounted case where the one-stage costs are bounded and for the undiscounted case where the one-stage costs are nonpositive or nonnegative. For undiscounted total cost problems with nonnegative one-stage costs, we also give a new convergence theorem for value iteration that shows that value iteration converges whenever it is initialized with a function that is above the optimal cost function and yet bounded by a multiple of the optimal cost function. This condition resembles Whittle’s bridging condition and is partly motivated by it. The theorem is also partly motivated by a result of Maitra and Sudderth that showed that value iteration, when initialized with the constant function zero, could require a transfinite number of iterations to converge. We use the new convergence theorem for value iteration to establish the convergence of our mixed value and policy iteration method for the nonnegative cost case.
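
The initialization condition for value iteration mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the paper's mixed value and policy iteration method; it is plain value iteration on a hypothetical two-state model with nonnegative costs, invented here for illustration. State 0 is a cost-free absorbing state. In state 1, the assumed action "fast" costs 2 and moves to state 0, while the assumed action "slow" costs 0.5 and moves to state 0 with probability 0.5. The optimal cost is J*(1) = 1, and the iteration starts from 3·J*, which lies above J* and is bounded by a multiple of it.

```python
# Toy model (hypothetical, for illustration only): two states, nonnegative costs.
# State 0: absorbing and cost-free. State 1: choose "fast" or "slow".

def bellman(J):
    """Apply the Bellman operator (minimization over actions) to J."""
    fast = 2.0 + J[0]                      # cost 2, then move to state 0
    slow = 0.5 + 0.5 * J[0] + 0.5 * J[1]   # cost 0.5, reach state 0 w.p. 0.5
    return [0.0, min(fast, slow)]


def value_iteration(J0, tol=1e-12, max_iter=10_000):
    """Iterate the Bellman operator until successive iterates differ by < tol."""
    J = list(J0)
    for _ in range(max_iter):
        J_next = bellman(J)
        if max(abs(a - b) for a, b in zip(J, J_next)) < tol:
            return J_next
        J = J_next
    return J


J_STAR = [0.0, 1.0]                 # optimal cost of the toy model
J0 = [3.0 * v for v in J_STAR]      # start above J*, bounded by 3 * J*
J_limit = value_iteration(J0)
print(J_limit)
```

In this finite example the iterates decrease monotonically toward J*. The theorem described in the abstract concerns the much more general Borel-space setting, where such convergence is not automatic.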

Articles published on Borel Spaces

Zero-Sum Markov Games with Random State-Actions-Dependent Discount Factors: Existence of Optimal Strategies

From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming

Nowak's Theorem on Probability Measures Induced by Strategies Revisited

On the Asymptotic Optimality of Finite Approximations to Markov Decision Processes with Borel Spaces

Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces

Zero-Sum Discounted Reward Criterion Games for Piecewise Deterministic Markov Processes

Mean-variance optimality for semi-Markov decision processes under first passage criteria

Discrete-time zero-sum Markov games with first passage criteria

Continuous-Time Markov Decision Processes with Exponential Utility

A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces

Asymptotic theory of path spaces of graded graphs and its applications

Constrained Markov decision processes in Borel spaces: from discounted to average optimality

Computable approximations for continuous-time Markov decision processes on Borel spaces based on empirical measures

A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies

Another set of verifiable conditions for average Markov decision processes with Borel spaces

Markov control models with unknown random state–action-dependent discount factors

Strong average optimality criterion for continuous-time Markov decision processes

On the First Passage $g$-Mean-Variance Optimality for Discounted Continuous-Time Markov Decision Processes

Constrained stochastic games with the average payoff criteria

Mean-Variance Problems for Finite Horizon Semi-Markov Decision Processes
