Controlled Markov Chains

Christos G. Cassandras, Stéphane Lafortune

doi:10.1007/978-1-4757-4070-7_9

Abstract

In Chapter 7 we considered Markov chains as a means to model stochastic DES for which explicit closed-form solutions can be obtained. Then, in Chapter 8, we saw how special classes of Markov chains (mostly, birth-death chains) can be used to model queueing systems. We pointed out, however, that queueing theory is largely “descriptive” in nature; that is, its main objective is to evaluate the behavior of queueing systems operating under a particular set of rules. On the other hand, we are often interested in “prescriptive” techniques, based on which we can make decisions regarding the “best” way to operate a system and ultimately control its performance. In this chapter, we describe some such techniques for Markov chains. Our main objective is to introduce the framework known as Markov Decision Theory, and to present some key results and techniques which can be used to control DES modeled as Markov chains. At the heart of these techniques is dynamic programming, which has played a critical role in both deterministic and stochastic control theory since the 1960s. The material in this chapter is more advanced than that of previous ones: it involves some results that were published in the research literature fairly recently, and it demands slightly higher mathematical sophistication. The results, however, should be quite gratifying for the reader, as they lead to the solution of some basic problems from everyday life experience, or related to the
