Learning Category-Level Generalizable Object Manipulation Policy Via Generative Adversarial Self-Imitation Learning From Demonstrations

Hao Shen, Weikang Wan, He Wang

doi:10.1109/lra.2022.3196122

Abstract

Generalizable object manipulation skills are critical for intelligent, multi-functional robots to work in complex real-world scenes. Despite recent progress in reinforcement learning, it is still very challenging to learn a generalizable manipulation policy that can handle a category of geometrically diverse articulated objects. In this work, we tackle this category-level object manipulation policy learning problem via imitation learning in a task-agnostic manner, where we assume no handcrafted dense rewards but only a terminal reward. Given this novel and challenging problem setting, we identify several key issues that cause previous imitation learning algorithms to fail and that hinder generalization to unseen instances. We then propose several critical techniques, including generative adversarial self-imitation learning from demonstrations, progressive growing of the discriminator, and instance balancing for the expert buffer, that address these issues and can benefit category-level manipulation policy learning regardless of the task. Our experiments on the ManiSkill benchmarks demonstrate remarkable improvements on all tasks compared with GAIL, a popular imitation learning algorithm. This work won first place in the “no external annotation” track of the ManiSkill Challenge 2021.

Full Text