Using Markov decision processes to optimize a nonlinear functional of the final distribution, with manufacturing applications

E J Collins

doi:10.1007/978-3-642-59105-1_3

Open Access

https://doi.org/10.1007/978-3-642-59105-1_3

Publication Date: Jan 1, 1997

Citations: 2

Affiliation: University of Bristol

#Finite Horizon Markov Decision Processes #Markov Decision Processes

Abstract

We consider manufacturing problems which can be modelled as finite horizon Markov decision processes for which the effective reward function is either a strictly concave or strictly convex functional of the distribution of the final state. Reward structures such as these often arise when penalty factors are incorporated into the usual expected reward objective function. For convex problems there is a Markov deterministic policy which is optimal, but for concave problems we usually have to consider the larger class of Markov randomised policies. In the natural formulation these problems cannot be solved directly by dynamic programming. We outline alternative iterative schemes for solution and show how they can be applied in a specific manufacturing example.
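The contrast between the convex and concave cases can be seen in a toy problem. The sketch below is an illustration only and is not taken from the paper: the two-state, two-action, one-stage process, the quadratic functionals `concave_reward` and `convex_reward`, the `target` distribution, and the grid search over the randomisation probability `q` are all assumptions chosen for simplicity. In particular, the grid search is a stand-in and not one of the iterative schemes the abstract refers to.

```python
def final_distribution(initial, transitions, policy):
    """Propagate the state distribution forward under a Markov randomised policy.

    initial[s]           : probability of starting in state s
    transitions[a][s][t] : probability of moving from s to t under action a
    policy[n][s][a]      : probability of choosing action a in state s at stage n
    """
    dist = list(initial)
    n_states = len(initial)
    for stage in policy:
        nxt = [0.0] * n_states
        for s, p_s in enumerate(dist):
            for a, p_a in enumerate(stage[s]):
                for t in range(n_states):
                    nxt[t] += p_s * p_a * transitions[a][s][t]
        dist = nxt
    return dist


def concave_reward(dist, target):
    # Assumed example: negative squared distance to a target distribution
    # (strictly concave in dist).
    return -sum((d - t) ** 2 for d, t in zip(dist, target))


def convex_reward(dist):
    # Assumed example: sum of squared probabilities (strictly convex in dist).
    return sum(d * d for d in dist)


# Hypothetical toy problem: start in state 0, one decision stage.
transitions = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay in the current state
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch to the other state
]
initial = [1.0, 0.0]
target = [0.5, 0.5]


def evaluate(q):
    # Choose action 0 with probability q in every state; q in {0, 1} is deterministic.
    policy = [[[q, 1.0 - q], [q, 1.0 - q]]]
    dist = final_distribution(initial, transitions, policy)
    return concave_reward(dist, target), convex_reward(dist)


grid = [i / 10 for i in range(11)]
best_concave_q = max(grid, key=lambda q: evaluate(q)[0])
best_convex_q = max(grid, key=lambda q: evaluate(q)[1])

print("concave objective: best q =", best_concave_q)
print("convex objective:  best q =", best_convex_q)
```

In this toy problem the final distribution is `[q, 1 - q]`, so the concave objective is maximised strictly inside the interval (a randomised policy), while the convex objective attains its maximum at an endpoint (a deterministic policy), in line with the claim in the abstract. Note also that both objectives depend on the whole final distribution rather than on a sum of per-state rewards, which is why the usual backward recursion does not apply directly.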
