Abstract

A problem of optimal stopping in a Markov chain whose states are not directly observable is presented. Using the theory of partially observable Markov decision processes, a model is developed that combines the classical stopping problem with sequential sampling at each stage of the decision process. Several results characterizing the optimal expected value function in terms of its parameters are given. An example shows that the best action to take, as a function of the information currently available, need not be of the intuitively appealing control-limit type: the set of states at which it is optimal to purchase information need not be convex. The expected value of information as a function of the decision maker's knowledge is related to such nonmonotone optimal policies.
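The setting the abstract describes — optimal stopping with the option to purchase information, solved over the belief (information) state — can be illustrated with a small sketch. The following is not the paper's model; it is a hypothetical two-state example with made-up parameters (transition matrix, rewards, observation channel, sampling cost), showing value iteration on a discretized belief space where each stage offers three actions: stop, continue blind, or continue and buy a noisy inspection signal.

```python
import numpy as np

# Illustrative two-state sketch (NOT the paper's exact model): hidden state
# 0 = "working", 1 = "failed" (absorbing). The belief p = P(failed) is the
# decision maker's information state. At each stage she may STOP (terminal
# payoff 0), CONTINUE blind, or CONTINUE and BUY a noisy inspection signal.
# All parameter values below are invented for illustration.
P = np.array([[0.9, 0.1],          # hidden-chain transition matrix
              [0.0, 1.0]])
R = np.array([1.0, -2.0])          # per-stage operating reward by state
OBS = np.array([[0.8, 0.2],        # P(signal | state); rows index the state
                [0.3, 0.7]])
COST = 0.2                         # price of one inspection signal
BETA = 0.95                        # discount factor

grid = np.linspace(0.0, 1.0, 401)  # discretized belief space

def propagate(p):
    """Predicted P(failed) after one blind transition of the hidden chain."""
    return (np.array([1.0 - p, p]) @ P)[1]

def reward(p):
    """Expected one-stage operating reward at belief p."""
    return (1.0 - p) * R[0] + p * R[1]

V = np.zeros_like(grid)
for _ in range(400):               # value iteration on the belief grid
    Vq = lambda q: np.interp(q, grid, V)
    pn_grid = np.array([propagate(p) for p in grid])
    cont = reward(grid) + BETA * Vq(pn_grid)
    # Buying information: pay COST, then average the continuation value
    # over the two possible signals, each triggering a Bayes update.
    ev = np.zeros_like(grid)
    for i, pn in enumerate(pn_grid):
        b = np.array([1.0 - pn, pn])
        for s in (0, 1):
            ps = b @ OBS[:, s]            # marginal probability of signal s
            post = b[1] * OBS[1, s] / ps  # posterior P(failed) given s
            ev[i] += ps * Vq(post)
    info = reward(grid) - COST + BETA * ev
    V = np.maximum(0.0, np.maximum(cont, info))  # stop / continue / sample
```

In this sketch one can read the optimal policy off the pointwise maximum: stopping is optimal where `V` equals zero, and information is purchased wherever `info` strictly exceeds both alternatives. The abstract's point is that, in general, this information-purchase region need not be convex and the policy need not be of control-limit form.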
