Abstract

Consider a particle-like agent, affected by exogenous disturbances, that seeks to remain as close as possible to a reference point. Its state evolves as a discrete-time Markov decision process, and actuation effort is cost-free. The reference point lies inside a denied environment within which state measurements are costly and must be requested. Outside the denied region, measurements are provided cost-free without the need for a request. No control is applied in the absence of a measurement. At each time step, the agent decides whether to wander as a random walk or to request a measurement and use it to move toward the reference point. This paper investigates measurement request policies that minimize an objective function comprising the expected mean squared deviation of the agent from the reference point and the cost of requesting measurements inside the denied region. The goal is to characterize the trade-off between paying to access the state immediately and waiting for a free measurement, which occurs when the accrued effect of the disturbances carries the agent outside the denied region. We show that the analysis of this problem simplifies when it is recast as a renewal reward process, in which the maximum wait time between the most recent renewal and a measurement request parametrizes all policies. Our analysis of wait-time optimization establishes conditions under which any local minimum (if one exists) is also global within a pre-specified interval, thus facilitating the search for a minimizer. Our results are discussed for the cases in which the agent's state space is the integers or a finite-dimensional Euclidean space.
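The renewal reward structure can be illustrated with a minimal Monte Carlo sketch. It assumes a symmetric ±1 random walk on the integers, a denied region {-radius, ..., radius} centered at the reference point, and that a measurement (paid or free) allows the cost-free actuation to reset the agent to the reference; the function name and all parameters are illustrative, not the paper's notation.

```python
import random

def simulate_cost(tau, radius=3, cost=5.0, n_renewals=2000, seed=0):
    """Estimate the long-run average cost of a wait-time policy.

    After each renewal the agent starts at the reference (origin) and
    performs a symmetric random walk. If it leaves the denied region
    {-radius, ..., radius} it receives a free measurement and is reset;
    otherwise, after tau steps it pays `cost` for a measurement and is
    reset. The objective is the renewal-reward ratio of accumulated
    squared deviation plus paid measurement costs to total elapsed time.
    """
    rng = random.Random(seed)
    total_cost = 0.0
    total_time = 0
    for _ in range(n_renewals):
        x = 0           # agent position relative to the reference point
        cycle_cost = 0.0
        t = 0
        while t < tau:
            x += rng.choice((-1, 1))  # disturbance step of the random walk
            t += 1
            cycle_cost += x * x       # squared deviation from the reference
            if abs(x) > radius:       # free measurement outside denied region
                break
        else:
            cycle_cost += cost        # paid measurement inside denied region
        total_cost += cycle_cost
        total_time += t
    return total_cost / total_time

# Sweeping tau traces out the trade-off between paying early and waiting
# for a free measurement; the minimizing tau is the optimal wait time.
costs = {tau: simulate_cost(tau) for tau in (1, 2, 5, 10, 20)}
```

Small wait times pay the measurement cost often but keep the deviation low; large wait times let the squared deviation accrue while gambling on a free measurement, which is exactly the trade-off the wait-time parameter captures.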
