Abstract

We consider the problem of learning the behavior of multiple mobile robots executing fixed trajectories in a common space, possibly interacting with each other as they do so. The mobile robots are observed by a subject robot from a vantage point from which it can observe only a portion of their trajectories. This problem has wide-ranging applications; the specific application we consider here is that of a subject robot that seeks to penetrate a simple perimeter patrol run by two interacting robots and reach a goal location. Our approach extends single-agent inverse reinforcement learning (IRL) to a multi-robot setting under partial observability, and models the interaction between the mobile robots as equilibrium behavior. IRL yields weights over the features of the robots' reward functions, thereby allowing us to learn their preferences. From these learned rewards we then derive a Markov decision process based policy for each observed robot. We extend a predominant IRL technique and empirically evaluate its performance in our application setting. We show that our approach significantly improves the subject's ability to predict the patroller positions at different points in time, with a corresponding increase in its rate of successful penetration.
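As a rough illustration of the final step described above, the sketch below assumes IRL has already recovered a weight vector over state features; a deterministic policy for each observed robot then follows from standard value iteration on the induced MDP with a linear reward. The transition tensor `T`, feature matrix `phi`, and weights `w` are illustrative stand-ins, not values or code from the paper.

```python
# Minimal sketch (not the paper's implementation): given feature weights w
# recovered by IRL, the reward is the linear combination r(s) = w . phi(s),
# and a policy follows from value iteration on the resulting MDP.
import numpy as np

def value_iteration(T, phi, w, gamma=0.95, tol=1e-6):
    """Derive a deterministic policy for an MDP with reward r(s) = w . phi(s).

    T   : (A, S, S) transition probabilities, T[a, s, s'] = P(s' | s, a)
    phi : (S, K) feature matrix, one K-dim feature vector per state
    w   : (K,) feature weights recovered by IRL
    """
    r = phi @ w                       # linear reward from learned weights
    V = np.zeros(T.shape[1])
    while True:
        # Q[a, s] = r(s) + gamma * sum_{s'} T[a, s, s'] * V(s')
        Q = r + gamma * (T @ V)
        V_new = Q.max(axis=0)         # Bellman optimality backup
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return Q.argmax(axis=0)           # greedy action per state

# Toy usage: 2 actions, 3 states, 2 reward features (all values hypothetical)
T = np.array([[[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]],
              [[0.1, 0.0, 0.9], [0.9, 0.1, 0.0], [0.0, 0.9, 0.1]]])
phi = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
w = np.array([0.2, 0.8])              # e.g., weights returned by IRL
policy = value_iteration(T, phi, w)   # one action index per state
```

In the paper's setting, a policy like this would be computed per observed robot, giving the subject a predictive model of each patroller's movement.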
