Abstract

Recognition of human manipulation actions, together with their analysis and execution by a robot, is an important problem. Perception of spatial relationships between objects is also central to understanding the meaning of manipulation actions. Here we merge these two notions and analyze manipulation actions using symbolic spatial relations between objects in the scene. Specifically, we define procedures for extracting symbolic, human-readable relations based on Axis Aligned Bounding Box object models and use sequences of those relations for action recognition from image sequences. Our framework is inspired by the so-called Semantic Event Chain framework, which analyzes touching and un-touching events between objects during a manipulation; however, our framework uses fourteen spatial relations instead of two. We show that this relational framework can differentiate between more manipulation actions than the original Semantic Event Chains. We quantitatively evaluate the method on the MANIAC dataset, which contains 120 videos of eight different manipulation actions, and obtain 97% classification accuracy, 12% higher than the original Semantic Event Chains.
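To make the idea concrete, below is a minimal sketch of how a pair of Axis Aligned Bounding Boxes could be mapped to a symbolic, human-readable relation. The relation names, the `eps` tolerance, and the choice of z as the vertical axis are illustrative assumptions; the abstract does not enumerate the paper's actual set of fourteen relations.

```python
# Illustrative sketch: mapping a pair of Axis Aligned Bounding Boxes (AABBs)
# to a symbolic spatial relation. Relation names and the eps tolerance are
# assumptions for demonstration, not the paper's fourteen relations.
from dataclasses import dataclass


@dataclass
class AABB:
    xmin: float
    ymin: float
    zmin: float
    xmax: float
    ymax: float
    zmax: float


def overlaps_1d(a_min: float, a_max: float, b_min: float, b_max: float) -> bool:
    """True if the intervals [a_min, a_max] and [b_min, b_max] intersect."""
    return a_min <= b_max and b_min <= a_max


def contains(a: AABB, b: AABB) -> bool:
    """True if box b lies entirely inside box a."""
    return (a.xmin <= b.xmin and b.xmax <= a.xmax and
            a.ymin <= b.ymin and b.ymax <= a.ymax and
            a.zmin <= b.zmin and b.zmax <= a.zmax)


def relation(a: AABB, b: AABB, eps: float = 0.01) -> str:
    """Return a human-readable relation of box a with respect to box b."""
    if contains(a, b):
        return "contains"
    if contains(b, a):
        return "inside"
    # Boxes "touch" if, after dilating a by eps, they intersect on all axes.
    touching = (overlaps_1d(a.xmin - eps, a.xmax + eps, b.xmin, b.xmax) and
                overlaps_1d(a.ymin - eps, a.ymax + eps, b.ymin, b.ymax) and
                overlaps_1d(a.zmin - eps, a.zmax + eps, b.zmin, b.zmax))
    if not touching:
        return "apart"
    # Disambiguate contacts along the vertical axis (z assumed to point up).
    if a.zmin >= b.zmax - eps:
        return "above"
    if b.zmin >= a.zmax - eps:
        return "below"
    return "touching"
```

Evaluating `relation` for every object pair in every frame would then yield a sequence of symbolic relations per pair, which in this framework serves as the input representation for action recognition.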
