Abstract
Deep imitation learning enables the learning of complex visuomotor skills from raw pixel inputs. However, this approach suffers from overfitting to the training images: the neural network can easily be distracted by task-irrelevant objects. In this letter, we use the human gaze, measured by a head-mounted eye-tracking device, to discard task-irrelevant visual distractions. We propose a mixture density network (MDN)-based behavior cloning method that learns to imitate the human gaze. The model predicts gaze positions from raw pixel images and crops images around the predicted gaze. Only these cropped images are used to compute the output action. This cropping procedure can remove visual distractions because the gaze is rarely fixated on task-irrelevant objects, and the resulting robustness can improve the manipulation performance of robots in scenarios where task-irrelevant objects are present. We evaluated our model on four manipulation tasks designed to test its robustness to irrelevant objects. The results indicate that the proposed model can predict the locations of task-relevant objects from gaze positions, is robust to task-irrelevant objects, and exhibits strong manipulation performance, especially in multi-object handling.
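To make the gaze-prediction step concrete, the following is a minimal sketch of an MDN gaze head in PyTorch. The class name GazeMDN, the backbone layout, the assumed 3x64x64 input resolution, and the number of mixture components are illustrative assumptions rather than the authors' exact architecture; the head outputs mixture weights, 2-D means, and standard deviations for the gaze position and is trained with the mixture negative log-likelihood of the measured human gaze.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GazeMDN(nn.Module):
    def __init__(self, n_components: int = 5):
        super().__init__()
        # Small convolutional backbone over the raw pixel image (assumed 3x64x64).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        feat_dim = 32 * 13 * 13  # flattened feature size for a 64x64 input
        self.n = n_components
        # Mixture weights, 2-D means (gaze x, y), and per-dimension std devs.
        self.pi = nn.Linear(feat_dim, self.n)
        self.mu = nn.Linear(feat_dim, self.n * 2)
        self.log_sigma = nn.Linear(feat_dim, self.n * 2)

    def forward(self, img):
        h = self.backbone(img)
        log_pi = F.log_softmax(self.pi(h), dim=-1)            # (B, n)
        mu = self.mu(h).view(-1, self.n, 2)                   # (B, n, 2)
        sigma = self.log_sigma(h).view(-1, self.n, 2).exp()   # (B, n, 2)
        return log_pi, mu, sigma

def mdn_nll(log_pi, mu, sigma, gaze):
    # Negative log-likelihood of the measured gaze (B, 2) under the mixture.
    comp = torch.distributions.Normal(mu, sigma)
    log_prob = comp.log_prob(gaze.unsqueeze(1)).sum(-1)       # (B, n)
    return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()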
Highlights
Imitation learning involves learning a policy by observing expert demonstrations
We propose using eye tracking to improve imitation learning for robot manipulation tasks
A Mixture Density Network (MDN)-based architecture is proposed that learns visual attention and crops images around the predicted gaze, preventing performance degradation caused by visual distractions
Summary
Imitation learning involves learning a policy by observing expert demonstrations. One application of imitation learning is robotics (e.g., [1]–[4]), because this method offers the potential to learn complex policies. However, changes in the background (i.e., the appearance of task-irrelevant objects) alter the network's policy output, because the mapping from visual features to the output action relies on fully connected layers. In our method, the acquired gaze positions, together with state-action demonstration pairs, are used to learn manipulation tasks. Because the method discards out-of-gaze objects when computing the policy output, the policy is robust to visual distractions such as the appearance of unseen and new objects. The main contributions of this paper are as follows: 1) To the best of our knowledge, this research is the first to use the human gaze to improve imitation learning performance for robot manipulation tasks. 2) We propose using the MDN to predict the human gaze. 3) We empirically show that gaze prediction makes the learned policy more robust to visual distractions and improves multi-object manipulation performance.
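As a rough illustration of how the predicted gaze could gate the policy input, the sketch below crops a fixed-size window around the predicted gaze and computes the action from the crop only, reusing the GazeMDN sketch above. The crop size, the PolicyHead name, the action dimension, and the choice of taking the mean of the most likely mixture component as the gaze are assumptions for illustration, not the authors' exact procedure.

import torch
import torch.nn as nn

class PolicyHead(nn.Module):
    # Maps a gaze-centered crop to a robot action (e.g., joint velocities).
    def __init__(self, action_dim: int = 7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(16 * 15 * 15, action_dim),  # sized for a 32x32 crop
        )

    def forward(self, crop_img):
        return self.net(crop_img)

def crop_around_gaze(img, gaze_xy, crop: int = 32):
    # Crop a (crop x crop) window centered on the predicted gaze (pixel coords).
    _, _, H, W = img.shape
    half = crop // 2
    crops = []
    for b in range(img.size(0)):
        x = int(gaze_xy[b, 0].clamp(half, W - half))
        y = int(gaze_xy[b, 1].clamp(half, H - half))
        crops.append(img[b, :, y - half:y + half, x - half:x + half])
    return torch.stack(crops)

# Usage: take the mean of the most likely mixture component as the gaze,
# crop around it, and compute the action from the cropped image only.
# log_pi, mu, sigma = gaze_mdn(img)
# best = log_pi.argmax(dim=-1)
# gaze = mu[torch.arange(mu.size(0)), best]
# action = policy(crop_around_gaze(img, gaze))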