Learning Object-Action Relations from Bimanual Human Demonstration Using Graph Networks

C R G Dreher, Mirko Wachter, Tamim Asfour

doi:10.1109/lra.2019.2949221

Abstract

Recognizing human actions is a vital task for a humanoid robot, especially in domains like programming by demonstration. Previous approaches to action recognition primarily focused on the overall prevalent action being executed, but we argue that bimanual human motion cannot always be described sufficiently with a single action label. We present a system for framewise action classification and segmentation in bimanual human demonstrations. The system extracts symbolic spatial object relations from raw RGB-D video data captured from the robot's point of view in order to build graph-based scene representations. To learn object-action relations, a graph network classifier is trained using these representations together with ground truth action labels to predict the action executed by each hand. We evaluated the proposed classifier on a new RGB-D video dataset showing daily action sequences focusing on bimanual manipulation actions. It consists of 6 subjects performing 9 tasks with 10 repetitions each, which leads to 540 video recordings with 2 hours and 18 minutes total playtime and per-hand ground truth action labels for each frame. We show that the classifier is able to reliably identify the true executed action of each hand within its top 3 predictions on a frame-by-frame basis without prior temporal action segmentation (action classification macro F1-score of 0.86).
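
The step from detected objects to a graph-based scene representation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the relation vocabulary, the bounding-box format, or any thresholds, so the `Box` type, the two relations ("contact" and "above"), the 1 cm margin, and the choice of z as the vertical axis are all hypothetical. The graph network classifier itself is not shown.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Box:
    """Axis-aligned 3D bounding box of one detected object (hypothetical format)."""
    name: str
    min_xyz: tuple
    max_xyz: tuple


def overlaps(a, b, axis, margin=0.0):
    """True if the two boxes overlap along one axis, within a tolerance."""
    return (a.min_xyz[axis] <= b.max_xyz[axis] + margin
            and b.min_xyz[axis] <= a.max_xyz[axis] + margin)


def relations(a, b, margin=0.01):
    """Symbolic relations of a with respect to b (illustrative set, not the paper's)."""
    rels = set()
    # "contact": the boxes touch or intersect on all three axes.
    if all(overlaps(a, b, axis, margin) for axis in range(3)):
        rels.add("contact")
    # "above": a starts at or over the top of b and overlaps it horizontally.
    # Assumption: z (index 2) is the vertical axis.
    if (a.min_xyz[2] >= b.max_xyz[2] - margin
            and overlaps(a, b, 0) and overlaps(a, b, 1)):
        rels.add("above")
    return rels


def scene_graph(boxes):
    """Build one frame's graph: object names as nodes, relations as directed edges."""
    nodes = [box.name for box in boxes]
    edges = []
    for a, b in permutations(boxes, 2):
        for rel in sorted(relations(a, b)):
            edges.append((a.name, rel, b.name))
    return nodes, edges


# One hypothetical frame: a cup standing on a table, a hand hovering over the table.
table = Box("table", (0.0, 0.0, 0.0), (1.0, 1.0, 0.5))
cup = Box("cup", (0.4, 0.4, 0.5), (0.5, 0.5, 0.6))
hand = Box("right_hand", (0.8, 0.8, 0.9), (0.9, 0.9, 1.0))

nodes, edges = scene_graph([table, cup, hand])
for edge in edges:
    print(edge)
```

In a framewise setting, one such graph would be built per video frame and paired with the per-hand ground truth action labels for that frame.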

Journal: IEEE Robotics and Automation Letters
Publication Date: Nov 1, 2019
Citations: 50

Similar Papers

Space-time correlation filters for human action detection
Joseph A Fernandez ... B V K Vijaya Kumar
19 Mar 2013

Principles of Digital Video Coding
Harilaos Koumaras ... Drakoulis Martakos
01 Jan 2009

Temporal cues enhanced multimodal learning for action recognition in RGB-D videos
Dan Liu ... Jianwei Zhang
Neurocomputing | VOL. 594
17 May 2024

Manual assembly actions segmentation system using temporal-spatial-contact features
Zengxin Kang ... Zhongyi Chu
Robotic Intelligence and Automation | VOL. 43
21 Aug 2023
