Abstract
Imitation learning (IL) is a well-known problem in the field of Markov decision processes (MDPs): given multiple demonstration trajectories generated by one or more experts, the goal is to replicate the hidden expert policies so that, when the MDP is run independently, it generates trajectories close to the demonstrated ones. IL is among the most useful tools for building versatile robots that can learn from examples. The task becomes particularly challenging when the expert exhibits a mixture of behavior modes. Prior work has introduced latent variables to model variations of the expert policy. However, our experiments show that existing methods do not adequately imitate the individual modes. To tackle this problem, we first draw inspiration from the classical technique of self-organizing maps (SOMs) and introduce an encoder-free generative model, referred to as the self-organizing generative (SOG) model, for learning multimodal data distributions from samples. We then apply SOG to behavior cloning (BC), a framework that learns deterministic policies without considering the environment, to accurately distinguish and imitate different modes. Next, we integrate it with generative adversarial IL (GAIL), a framework that learns policies while considering the environment, to make learning robust to compounding errors at unseen states. We show that our method significantly outperforms the state of the art across multiple experiments within the MuJoCo simulator, including locomotion and robotic manipulation tasks.
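The abstract does not spell out how an encoder-free model assigns demonstrations to modes, so the following is only a toy sketch of one plausible reading, not the SOG algorithm itself. It assumes that, in place of an encoder, each demonstrated (state, action) pair is assigned to whichever candidate latent code best reproduces the action (loosely analogous to picking the best-matching unit in a SOM), and that the policy for that code is then refit. The names best_code and fit_modes, the one-weight linear policy a = w * s, and the two-mode toy expert are all hypothetical and chosen purely for illustration.

```python
import random

def best_code(weights, state, action):
    """Return the index of the candidate code whose predicted action
    (w * state) is closest to the demonstrated action.
    Hypothetical helper: a stand-in for a winner-take-all code search."""
    errors = [(w * state - action) ** 2 for w in weights]
    return errors.index(min(errors))

def fit_modes(demos, weights, n_iters=20):
    """Alternate between hard code assignment and a per-code refit.
    Assumption: one scalar weight per discrete code, policy a = w * s."""
    weights = list(weights)
    for _ in range(n_iters):
        # Assignment step: each demo goes to its best-matching code.
        groups = [[] for _ in weights]
        for s, a in demos:
            groups[best_code(weights, s, a)].append((s, a))
        # Update step: closed-form least squares for a = w * s per group.
        for k, group in enumerate(groups):
            denom = sum(s * s for s, _ in group)
            if denom > 0:
                weights[k] = sum(s * a for s, a in group) / denom
    return weights

# Toy expert with two hidden behavior modes: a = +s or a = -s.
rng = random.Random(0)
states = [rng.uniform(0.5, 2.0) for _ in range(200)]
demos = [(s, s if i % 2 == 0 else -s) for i, s in enumerate(states)]

# Arbitrary initial weights for the two candidate codes.
learned = fit_modes(demos, weights=[0.3, -0.2])
```

In this toy setting each code ends up specializing in one expert mode, so the mode can be selected at run time by fixing the code. A single averaged policy trained on the same data would instead blend the two modes, which is the failure the abstract attributes to multimodal demonstrations.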
More From: IEEE transactions on neural networks and learning systems