Abstract

Generative Adversarial Imitation Learning (GAIL) has been successfully applied to imitation learning in control tasks. However, most GAIL-like approaches require complete, high-quality demonstrations, which are rarely available in practice, leading to unsatisfactory performance. Prior work has proposed algorithms for incomplete demonstrations, but these are effective only when the demonstrations that are available are of exceptionally high quality. To address this problem, we introduce the Action-Rank Adversarial Imitation Learning (ARAIL) algorithm, which targets the issue of incomplete demonstrations. By restructuring the standard GAIL framework and introducing a ranker model, ARAIL reshapes the reward function using the discriminator's output and auxiliary information from the ranker. The key insight is that the ranker makes a better assessment of missing actions, which in turn helps learn a better policy. We empirically compare our approach with state-of-the-art algorithms on imitation learning benchmarks from the Atari and MuJoCo platforms, demonstrating that ARAIL improves both performance and robustness across various levels of action incompleteness in demonstrations.
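The abstract does not state the exact reward formulation. As an illustration only, the sketch below (PyTorch) shows one plausible way a GAIL-style discriminator reward could be combined with an auxiliary ranker score; the network architectures, the mixing weight `lam`, and the additive combination rule are all assumptions, not the paper's actual method.

```python
# Hypothetical sketch: mixing a GAIL-style discriminator reward with an
# auxiliary ranker score. Shapes, `lam`, and the combination rule are
# assumptions for illustration, not ARAIL's published formulation.
import torch
import torch.nn as nn


class Discriminator(nn.Module):
    """Scores (state, action) pairs; sigmoid output in (0, 1), as in GAIL."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return torch.sigmoid(self.net(torch.cat([obs, act], dim=-1)))


class Ranker(nn.Module):
    """Assigns a scalar quality score to an action in a given state."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))


def reshaped_reward(disc, ranker, obs, act, lam=0.5, eps=1e-8):
    """Standard GAIL reward -log(1 - D(s, a)) plus a lam-weighted ranker term."""
    d = disc(obs, act)
    gail_r = -torch.log(1.0 - d + eps)        # GAIL surrogate reward
    rank_r = torch.sigmoid(ranker(obs, act))  # squash ranker score to (0, 1)
    return gail_r + lam * rank_r


if __name__ == "__main__":
    obs_dim, act_dim = 8, 2
    disc = Discriminator(obs_dim, act_dim)
    ranker = Ranker(obs_dim, act_dim)
    obs, act = torch.randn(4, obs_dim), torch.randn(4, act_dim)
    print(reshaped_reward(disc, ranker, obs, act))
```

In this reading, the ranker supplies a dense action-quality signal even for state-action pairs whose demonstrated actions are missing, which the discriminator alone cannot score reliably.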
