Abstract

In this letter, we develop a novel framework for learning from physical human-robot interactions. Owing to human domain knowledge, such interactions can facilitate learning. However, collecting numerous interactions as training data may place a burden on human users, particularly in real-world applications. To address this problem, we propose formulating the task as a model-based reinforcement learning problem, which reduces errors during training and increases robustness. Our key idea is to develop 1) an advisory and adversarial interaction strategy and 2) a human-robot interaction model that predicts the human's behavior. In the advisory and adversarial interactions, a human guides the robot when it moves in the wrong direction and disturbs it when it moves in the correct direction. Meanwhile, the robot attempts to achieve its goal while predicting the human's behavior with the interaction model. To verify the proposed method, we conducted peg-in-hole experiments in simulation and in a real-robot environment with human participants and a robot equipped with an underactuated soft wrist module. The experimental results showed that our method achieved smaller position errors during training and a higher number of successes than baselines without any interactions and with random interactions.
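To make the idea concrete, below is a minimal, hypothetical Python sketch of a model-based control loop in which a learned dynamics model also predicts the human's advisory or adversarial interaction. All names (InteractionDynamicsModel, plan_action), the linear model structure, and the one-step planner are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class InteractionDynamicsModel:
    """Hypothetical model predicting the next robot state and the
    human's interaction force from the current state and action."""

    def __init__(self, state_dim, action_dim):
        self.state_dim = state_dim
        # Placeholder linear parameters; a real model would be a neural
        # network fitted to (state, action) -> (next_state, human_force)
        # data collected during physical interactions.
        self.W = np.zeros((state_dim + action_dim, 2 * state_dim))

    def predict(self, state, action):
        x = np.concatenate([state, action])
        y = x @ self.W
        return y[:self.state_dim], y[self.state_dim:]

def plan_action(model, state, goal, candidate_actions):
    """Pick the candidate action whose predicted outcome, accounting for
    the predicted human interaction, moves the robot closest to the goal."""
    best_action, best_cost = None, np.inf
    for action in candidate_actions:
        next_state, human_force = model.predict(state, action)
        # The predicted human force perturbs the nominal next state,
        # so the planner anticipates advice as well as disturbances.
        cost = np.linalg.norm(next_state + human_force - goal)
        if cost < best_cost:
            best_action, best_cost = action, cost
    return best_action
```

In a full model-based reinforcement learning pipeline, the model would presumably be refit after each batch of real interaction data, and the planner would roll the model out over multiple steps rather than one; this sketch only illustrates how a human-behavior prediction can enter the planning cost.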


Introduction

Learning technologies have demonstrated significant potential in various robotic manipulation applications such as, but not limited to, autonomous assembly [1] and collaboration with humans [2]. Owing to more expressive neural network architectures, recent developments have allowed robots to complete complex tasks that could not be solved with manually designed controllers. However, two serious problems remain when applying these technologies in real robot environments: learning inefficiency and vulnerability in unknown environments. As the tasks or learning architectures become more complex, the required amount of training data grows, making learning inefficient.
