Abstract

Learning from demonstration holds the promise of enabling robots to learn diverse actions from expert experience. In contrast to learning from observation-action pairs, humans imitate in a more flexible and efficient manner: they learn behaviors simply by “watching.” In this article, we propose a “watch-and-act” imitation learning pipeline that endows a robot with the ability to learn diverse manipulations from visual demonstrations. Specifically, we address this problem by casting it as two subtasks: 1) understanding the demonstration video and 2) learning the demonstrated manipulations. First, a captioning module based on visual change is presented to understand the demonstration by translating the demonstration video into a command sentence. Then, to execute the generated command, a manipulation module that learns the demonstrated manipulations is built upon an instance segmentation model and a manipulation affordance prediction model. We validate the superiority of each module over existing methods through extensive experiments, and we demonstrate the complete robotic imitation system built from the two modules in diverse scenarios on a real robotic arm. A supplementary video is available at https://vsislab.github.io/watch-and-act/.
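As a rough illustration of the two-stage architecture described above, the sketch below shows how the “watch” (captioning) and “act” (manipulation) stages could be wired together. All class and method names (e.g., VideoCaptioner, InstanceSegmenter, AffordancePredictor, RobotArm) are hypothetical placeholders for illustration only and do not reflect the authors' actual implementation.

```python
# Minimal sketch of a "watch-and-act" pipeline, assuming hypothetical
# component interfaces; this is not the authors' published API.

class WatchAndActPipeline:
    def __init__(self, captioner, segmenter, affordance_model, robot):
        self.captioner = captioner              # visual-change-based captioning module
        self.segmenter = segmenter              # instance segmentation model
        self.affordance_model = affordance_model  # manipulation affordance predictor
        self.robot = robot                      # interface to the real robotic arm

    def imitate(self, demo_video, scene_image):
        # 1) "Watch": translate the demonstration video into a command sentence.
        command = self.captioner.caption(demo_video)

        # 2) "Act": ground the command in the current scene and execute it.
        instances = self.segmenter.segment(scene_image)
        affordance = self.affordance_model.predict(scene_image, instances, command)
        self.robot.execute(affordance)
        return command, affordance
```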
