Two-Stage Model-Agnostic Meta-Learning With Noise Mechanism for One-Shot Imitation

Ziye Hu, James Zhiqing Wen, Xusheng Wang, Decheng Zhou, Wei Li, Zhongxue Gan

doi:10.1109/access.2020.3029220


Open Access



Abstract

Given that humans and animals can learn new behaviors in a short time by observing others, the question is how to make robots do the same: through an effective demonstration, a robot should quickly understand and learn a new ability. One possible solution is imitation-based meta-learning, but most related approaches are limited to a particular network structure or a specific task. In particular, meta-learning methods based on gradient updates are prone to overfitting. In this article, we propose a generic meta-learning algorithm, compatible with any model, that divides the learning process into two independent stages (skill cloning and skill transfer) and adds a noise mechanism. The skill cloning stage enables a good understanding of the demonstration, which helps the skill transfer stage when the robot applies the learned experience to new tasks. The experimental results show that introducing the noise mechanism alleviates overfitting. Our method not only performs well on the regression task but also significantly outperforms existing state-of-the-art one-shot imitation learning methods in the same simulation environments (i.e., simulated pushing and simulated reaching).
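To make the gradient-update family of meta-learning methods mentioned above more concrete, the following is a minimal sketch of first-order, MAML-style meta-training on a toy linear regression task. It does not reproduce the authors' two-stage algorithm. The abstract does not say where the noise enters, so adding zero-mean Gaussian noise to the inner-loop gradient is purely an illustrative assumption, as are all function names, the task distribution, and the hyperparameters.

```python
import random


def mse_grad(w, b, xs, ys):
    """Return (loss, dL/dw, dL/db) for the model y = w*x + b under mean squared error."""
    n = len(xs)
    errs = [w * x + b - y for x, y in zip(xs, ys)]
    loss = sum(e * e for e in errs) / n
    gw = sum(2.0 * e * x for e, x in zip(errs, xs)) / n
    gb = sum(2.0 * e for e in errs) / n
    return loss, gw, gb


def inner_adapt(w, b, xs, ys, lr=0.05, noise_std=0.0, rng=random):
    """Take one gradient step on the support (demonstration) data.

    ASSUMPTION: noise is zero-mean Gaussian and perturbs the gradient.
    The paper's actual noise mechanism may differ.
    """
    _, gw, gb = mse_grad(w, b, xs, ys)
    if noise_std > 0.0:
        gw += rng.gauss(0.0, noise_std)
        gb += rng.gauss(0.0, noise_std)
    return w - lr * gw, b - lr * gb


def sample_task(rng, k=5):
    """Hypothetical task family: y = a*x + c with random slope and offset."""
    a = rng.uniform(0.5, 2.0)
    c = rng.uniform(-1.0, 1.0)

    def draw():
        xs = [rng.uniform(-2.0, 2.0) for _ in range(k)]
        return xs, [a * x + c for x in xs]

    return draw(), draw()  # (support set, query set)


def meta_train(steps=500, inner_lr=0.05, outer_lr=0.01, noise_std=0.01, seed=0):
    """First-order meta-training: adapt on the support set, then update the
    meta-parameters with the query-set gradient evaluated at the adapted
    parameters (second-order terms are ignored)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        (sx, sy), (qx, qy) = sample_task(rng)
        w_t, b_t = inner_adapt(w, b, sx, sy, inner_lr, noise_std, rng)
        _, gw, gb = mse_grad(w_t, b_t, qx, qy)
        w -= outer_lr * gw
        b -= outer_lr * gb
    return w, b
```

Calling `meta_train()` returns an initialization `(w, b)` from which a single call to `inner_adapt` on a few examples of a new task gives the adapted model. Setting `noise_std=0.0` recovers plain first-order meta-training, which allows the effect of the assumed noise term to be compared directly.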

Highlights

How does the noise mechanism affect the performance of meta-learning?
Data analysis: Regarding one-shot imitation learning, the experimental results are shown in Fig. 6 and Table 3.

Summary

Introduction

Humans and animals can learn new behaviors quickly by observing or imitating others, and can adapt effectively to environmental changes by drawing on previous knowledge and experience. Machine learning typically requires a large number of training samples, whereas humans need only a few to learn new skills and concepts. For example, after seeing a few examples of cats and dogs, a person grasps the differences in their shapes and characteristics and can tell them apart. As the application environment of robots has shifted from simple settings to unstructured and complex ones, robot applications now require a large amount of expert knowledge. We want robots to be as adaptable as humans, which is almost impossible to achieve through traditional programming.

Objectives

Methods

Results

Conclusion