Abstract

Motivated by the study of deceptive strategies, this paper considers the problems of detecting an agent's objective from its partial path and determining an optimal environment to enable such detection. We focus on a scenario where the agent's objective is to reach a particular target state from a set of potential targets, while an observer seeks to correctly identify such a state prior to the agent reaching it. In order to quantify the predictability of the agent's target given the observed path, we introduce the notion of target entropy, where higher entropy implies lower target predictability. The problem of optimal environment design, i.e., optimal target placement, then becomes a minimax problem with target entropy as an objective function. Under the assumption that the agent chooses its path towards its target maximally unpredictably, we consider models of the agent's motion on both discrete and continuous state spaces. Using dynamic programming, we establish a simple way of computing target entropy for the discrete state space. In a continuous state space, we obtain a formula for target entropy by employing geometrical arguments on volumes of hypersimplices. Additionally, we provide an algorithm yielding an optimal environment in a discrete state space, discuss its computational complexity, and provide a computationally simpler approximation that yields a locally optimal environment. We validate our results on a previously developed model of deceptive agent motion.
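To make the notion of target entropy concrete, the following minimal sketch computes the Shannon entropy of an observer's posterior over candidate targets on a discrete state space. It is an illustration under stated assumptions, not the paper's construction: the obstacle-free 4-connected grid, the uniform prior over targets, the agent choosing uniformly at random among shortest paths, and every function name (`count_shortest_paths`, `target_posterior`, `entropy_bits`) are hypothetical choices made for this example.

```python
import math


def manhattan(a, b):
    """Shortest-path length between two cells of an obstacle-free 4-connected grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def count_shortest_paths(a, b):
    """Count shortest grid paths from a to b with a small dynamic program."""
    dx, dy = abs(b[0] - a[0]), abs(b[1] - a[1])
    # table[i][j] = number of shortest paths covering i steps in x and j steps in y
    table = [[1] * (dy + 1) for _ in range(dx + 1)]
    for i in range(1, dx + 1):
        for j in range(1, dy + 1):
            table[i][j] = table[i - 1][j] + table[i][j - 1]
    return table[dx][dy]


def target_posterior(start, current, targets):
    """Posterior over targets after the agent is observed at `current`.

    Assumptions (illustrative only): uniform prior over targets, the observed
    prefix is itself a shortest path from `start` to `current`, and the agent
    picks uniformly among all shortest paths to its target.
    """
    weights = []
    for t in targets:
        on_shortest = (
            manhattan(start, current) + manhattan(current, t) == manhattan(start, t)
        )
        if on_shortest:
            # Fraction of shortest start->t paths that share the observed prefix
            weights.append(count_shortest_paths(current, t) / count_shortest_paths(start, t))
        else:
            weights.append(0.0)
    total = sum(weights)
    if total == 0:
        raise ValueError("observed state is inconsistent with every target")
    return [w / total for w in weights]


def entropy_bits(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(q * math.log2(q) for q in p if q > 0)


if __name__ == "__main__":
    start, targets = (0, 0), [(2, 2), (2, 0)]
    for state in [(0, 0), (1, 0), (1, 1)]:
        post = target_posterior(start, state, targets)
        print(state, post, entropy_bits(post))
```

Under these assumptions the entropy starts at its maximum for two equally likely targets and drops as the observed path rules targets out, which is the sense in which higher entropy corresponds to lower target predictability.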
