Multiple stopping time POMDPs: Structural results

Vikram Krishnamurthy, Sujay Bhatt, Anup Aprem

doi:10.1109/allerton.2016.7852218

Publication Date: Sep 1, 2016

Citations: 21

Affiliation: Cornell University

Abstract

This paper considers a multiple stopping problem on an infinite-horizon sample path of a hidden Markov model, where a reward that depends on the underlying state is associated with each stop. The decision maker stops L times to maximize the total expected revenue. The aim is to determine the structure of the optimal multiple stopping policy. The formulation generalizes the classical (single) stopping time partially observed Markov decision process (POMDP) problem. Even though the stopping set (in terms of the Bayesian beliefs) is not necessarily convex, we show that it is a connected set. The structural results are illustrated using a numerical example.
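
The stopping set is described in terms of Bayesian beliefs, so it may help to recall how a belief is propagated along a hidden Markov model sample path. The following is a minimal sketch of the standard HMM filter update, not code from the paper; the function name `belief_update`, the two-state matrices `P` and `B`, and the observation value are illustrative assumptions.

```python
import numpy as np


def belief_update(belief, P, B, obs):
    """One step of the standard HMM filter (Bayesian belief update).

    belief: shape (X,), current probability over hidden states
    P:      shape (X, X), P[i, j] = Pr(next state = j | current state = i)
    B:      shape (X, Y), B[j, y] = Pr(observation = y | state = j)
    obs:    index of the observation received after the transition
    """
    predicted = belief @ P                 # prediction step
    unnormalized = predicted * B[:, obs]   # weight by observation likelihood
    return unnormalized / unnormalized.sum()  # normalize to a probability vector


# Hypothetical two-state example (values chosen only for illustration).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],
              [0.4, 0.6]])
belief = np.array([0.5, 0.5])

posterior = belief_update(belief, P, B, obs=1)
print(posterior)
```

A stopping policy in this setting maps such a belief vector, together with the number of stops remaining, to a stop or continue decision; the stopping set is the set of beliefs at which stopping is chosen.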
