Abstract

This work investigates inverse reinforcement learning (RL) for multiagent systems (MAS) defined by Graphical Apprentice Games. These games are solved by a learner MAS recovering the unknown cost functions of an expert MAS from its demonstrated behavior. We begin by developing a model-based inverse RL algorithm comprising two update loops: 1) an inner-loop optimal control update and 2) an outer-loop inverse optimal control (IOC) update. We then introduce a model-free inverse RL algorithm that uses only the online behaviors of the expert and learner MAS, without knowledge of their dynamics. Optimal control and IOC are solved as subproblems in both proposed inverse RL algorithms. The reward functions that the learner MAS finds are proven to be both stabilizing and nonunique. Simulated case studies validate the effectiveness of the proposed inverse RL algorithms.
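The abstract does not give the update equations, so the following is only an illustrative sketch of the two-loop structure on a single-agent linear-quadratic surrogate: the inner loop solves an optimal control problem (here, a discrete-time algebraic Riccati equation) under the learner's current cost guess, and the outer loop applies an IOC-style correction driven by the expert's demonstrated feedback gain. The dynamics (A, B), the cost weights, and the specific correction rule are all assumptions for illustration, not the paper's algorithm for graphical apprentice games.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr(A, B, Q, R):
    """Inner loop: optimal control update via the discrete-time ARE."""
    P = solve_discrete_are(A, B, Q, R)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Toy dynamics and expert cost -- illustrative assumptions, not from the paper.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.005],
              [0.1]])
R = np.eye(1)
Q_expert = np.diag([5.0, 1.0])          # unknown to the learner
K_expert, _ = lqr(A, B, Q_expert, R)    # the expert's demonstrated behavior

Q = np.eye(2)                           # learner's initial cost guess
for _ in range(100):
    K, P = lqr(A, B, Q, R)              # inner loop: optimal control update
    # Outer loop: IOC-style update -- pick the cost that makes the learner's
    # value matrix P consistent with the expert's closed-loop behavior.
    Acl = A - B @ K_expert
    Q_new = P - Acl.T @ P @ Acl - K_expert.T @ R @ K_expert
    # Symmetrise and project onto the PSD cone -- a numerical safeguard
    # added here for robustness, not something stated in the abstract.
    Q_new = 0.5 * (Q_new + Q_new.T)
    w, V = np.linalg.eigh(Q_new)
    Q_new = V @ np.diag(np.clip(w, 1e-8, None)) @ V.T
    if np.linalg.norm(Q_new - Q) < 1e-10:
        break
    Q = Q_new

print("gain mismatch ||K - K_expert|| =", np.linalg.norm(K - K_expert))
print("recovered Q:\n", Q)
```

Note that the recovered Q need not equal Q_expert: any cost under which the expert's gain is optimal is a fixed point of the outer loop, which is consistent with the abstract's claim that the learned reward functions are nonunique yet stabilizing.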