Abstract

Recognizing manipulations performed by a human, and transferring and executing them with a robot, is a difficult problem. We address this in the current study by introducing a novel representation of the relations between objects at decisive time points during a manipulation. Thereby, we encode the essential changes in a visual scene in a condensed way such that a robot can recognize and learn a manipulation without prior object knowledge. To achieve this, we continuously track image segments in the video and construct a dynamic graph sequence. Topological transitions of those graphs occur whenever a spatial relation between some segments changes in a discontinuous way, and these moments are stored in a transition matrix called the semantic event chain (SEC). We demonstrate that these time points are highly descriptive for distinguishing between different manipulations. Employing simple sub-string search algorithms, SECs can be compared and type-similar manipulations can be recognized with high confidence. As the approach is generic, statistical learning can be used to find the archetypal SEC of a given manipulation class. The performance of the algorithm is demonstrated on a set of real videos showing hands manipulating various objects and performing different actions. In experiments with a robotic arm, we show that the SEC can be learned by observing human manipulations, transferred to a new scenario, and then reproduced by the machine.
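
To make the encoding concrete, the following is a minimal Python sketch of how a SEC could be assembled from per-frame spatial relations and how chains could be compared by sub-string search. The relation codes (0 = no contact, 1 = overlapping, 2 = touching), the segment labels, and the function names are illustrative assumptions, not the paper's implementation; the segment-tracking and graph-construction pipeline is omitted entirely.

    # Minimal SEC sketch. Relation codes are an assumption
    # (0 = no contact, 1 = overlapping, 2 = touching); the paper's
    # exact labels and its segment-tracking pipeline are omitted.
    def build_sec(relations_over_time):
        """Compress per-frame relations into a semantic event chain.

        relations_over_time: list of dicts mapping a segment pair
        (i, j) to its spatial-relation code in that frame. A column
        is kept only when at least one relation changes, i.e. at a
        topological transition of the scene graph.
        """
        rows = sorted(relations_over_time[0])  # fixed order of segment pairs
        columns, last = [], None
        for frame in relations_over_time:
            col = tuple(frame[r] for r in rows)
            if col != last:                    # decisive time point
                columns.append(col)
                last = col
        return rows, columns

    def row_similarity(a, b):
        """Normalized longest common substring of two relation rows;
        a simple stand-in for the paper's sub-string search."""
        best = 0
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                    k += 1
                best = max(best, k)
        return best / max(len(a), len(b))

    # Toy 'touching' episode: hand (1), object (2), support (3).
    frames = [
        {(1, 2): 0, (1, 3): 0, (2, 3): 2},  # hand away, object on support
        {(1, 2): 2, (1, 3): 0, (2, 3): 2},  # hand touches object
        {(1, 2): 0, (1, 3): 0, (2, 3): 2},  # hand withdraws
    ]
    rows, sec = build_sec(frames)
    # rows == [(1, 2), (1, 3), (2, 3)]
    # sec  == [(0, 0, 2), (2, 0, 2), (0, 0, 2)]  (one column per event)

Each row of the resulting matrix (one per segment pair) can then be compared across two videos with row_similarity, which is the sense in which type-similar manipulations yield matching sub-strings.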

Highlights

  • It has long been known that raw observation and naive copying are insufficient for a robot to execute an action

  • In this paper we introduce a novel representation for manipulations, called the semantic event chain (SEC), which focuses on the relations between objects in a scene

  • The representation generates column vectors in a matrix, where every transition between neighboring vectors can be interpreted as an action rule defining which object relations have changed in the scene
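
Continuing the hypothetical sketch above (same assumed encoding), an 'action rule' in this sense can be read off by listing which segment-pair relations differ between two neighboring columns:

    # Illustrative only: derive action rules from neighboring SEC columns,
    # using rows/columns as produced by build_sec() above.
    def action_rules(rows, columns):
        rules = []
        for prev, curr in zip(columns, columns[1:]):
            changed = [(pair, a, b)                 # (segment pair, old, new)
                       for pair, a, b in zip(rows, prev, curr) if a != b]
            rules.append(changed)
        return rules

    # For the toy episode above:
    # action_rules(rows, sec) == [[((1, 2), 0, 2)],   # hand-object contact begins
    #                             [((1, 2), 2, 0)]]   # hand-object contact ends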


Introduction

It has long been known that raw observation and naive copying are insufficient for a robot to execute an action. The last two aspects, (5) and (6), would very practically allow human access: for debugging and improving the algorithm(s), for better understanding and possibly interacting with the artificial system, and for entering model-based knowledge. To arrive at such a representation is a very difficult problem, and commonly one uses models of objects (and hands) and trajectories to encode a manipulation (see the Related work section for a discussion of the relevant literature). In this study, our goal is to introduce the so-called ‘semantic event chain’ (SEC) as a novel, generic encoding scheme for manipulations which, to a large degree, fulfills the above-introduced requirements (grounded, learnable, invariant, compressed, and human-comprehensible). We show that these SECs can be used to allow an agent to learn, by observation, to distinguish between different manipulations and to classify parts of the observed scene. Parts of this study have been published at a conference (Aksoy et al., 2010).
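
As a rough illustration of such observation-based recognition, reusing the hypothetical sketches above, an observed chain could be classified by comparing it against stored model SECs. Note this naively aligns rows in order, whereas the paper searches over row correspondences; the function and variable names are assumptions for illustration.

    # Hypothetical recognition step: pick the model SEC whose rows best
    # match the observed SEC under the sub-string similarity above.
    def classify(observed, models):
        """observed: (rows, columns); models: dict name -> (rows, columns)."""
        def score(a, b):
            ra, rb = list(zip(*a[1])), list(zip(*b[1]))  # transpose: one row per pair
            n = min(len(ra), len(rb))
            return sum(row_similarity(ra[i], rb[i]) for i in range(n)) / max(n, 1)
        return max(models, key=lambda name: score(observed, models[name]))

    # e.g. classify((rows, sec), {'touching': model_a, 'pushing': model_b})
    # returns the name of the best-matching manipulation class.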

Related work
Recognition of manipulations
Recognition of human motion patterns
Object recognition and the role of context
Overview of the algorithm
Discussion
Related approaches
Features and problems of the algorithm
Affordances and object–action complexes