Abstract

Humans regularly interact with their surrounding objects. Such interactions often result in strongly correlated motions between humans and the interacting objects. We thus ask: “Is it possible to infer object properties from skeletal motion alone, even without seeing the interacting object itself?” In this paper, we present a fine-grained action recognition method that learns to infer such latent object properties from human interaction motion alone. This inference allows us to disentangle the motion from the object property and transfer object properties to a given motion. We collected a large number of videos and 3D skeletal motions of performing actors using an inertial motion capture device. We analyzed similar actions and learned the subtle differences between them that reveal latent properties of the interacting objects. In particular, we learned to identify the interacting object by estimating its weight or its spillability. Our results clearly demonstrate that motions and interacting objects are highly correlated and that related object latent properties can be inferred from 3D skeleton sequences alone, leading to new synthesis possibilities for motions involving human interaction. Our dataset is available at http://vcc.szu.edu.cn/research/2020/IT.html.

Highlights

Manuscript received: 2021-01-22; accepted: 2021-02-24

  • Digitizing and understanding our physical world are important goals of both computer graphics and computer vision

  • We report the object property inference accuracy on all eight types of motion

  • The primary goal of this work was to study human interaction motions represented by skeleton sequences



Introduction

Digitizing and understanding our physical world are important goals of both computer graphics and computer vision. The available datasets for human activity recognition [5, 6] are RGB-D videos, which in general contain significant occlusions that hamper the extraction of the actors’ skeletons. While these videos can be used to broadly classify different actions [7], we still lack suitable datasets designed for inferring fine-scale variations of object properties. Unlike previous work on action recognition, we analyze similar actions and must learn the subtle differences between actions of the same type that reveal latent properties of the interacting objects. For example, given the skeletal motion of a person walking on a wide path, we would like to synthesize the person’s skeletal motion when walking on a narrow path.
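To make the idea of inferring a latent object property from a skeleton sequence concrete, the following is a minimal, self-contained sketch in Python. It is not the paper’s method: the sequences are synthetic stand-ins for mocap data, the `make_lift` generator and the hand-crafted speed features are illustrative assumptions, and a nearest-neighbor lookup stands in for a learned model. The point is only to show the shape of the task: a `(frames, joints, 3)` trajectory goes in, a discrete object-property label (here, a weight class) comes out.

```python
import numpy as np

def motion_features(seq):
    """seq: (T, J, 3) array of joint positions over T frames.
    Returns simple hand-crafted features: mean and peak joint speed.
    (Illustrative only -- the paper learns such cues from data.)"""
    vel = np.diff(seq, axis=0)            # (T-1, J, 3) per-frame displacements
    speed = np.linalg.norm(vel, axis=-1)  # (T-1, J) per-joint speeds
    return np.array([speed.mean(), speed.max()])

def make_lift(weight, T=60, J=17, seed=0):
    """Synthesize a toy 'lifting' sequence (hypothetical data generator):
    heavier objects produce slower upward motion, plus small sensor noise."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=(1, J, 3)) * 0.01          # static rest pose
    t = np.linspace(0.0, 1.0, T)[:, None, None]
    rise = t * np.array([0.0, 1.0 / weight, 0.0])     # slower rise if heavy
    return base + rise + rng.normal(scale=0.0005, size=(T, J, 3))

# A tiny "training set" of labeled motions, and nearest-neighbor inference
# in feature space as a stand-in for a trained classifier.
train = [(motion_features(make_lift(w, seed=s)), "heavy" if w > 3 else "light")
         for s, w in enumerate([1.0, 2.0, 5.0, 8.0])]

def infer_weight_class(seq):
    f = motion_features(seq)
    return min(train, key=lambda fw: np.linalg.norm(fw[0] - f))[1]

print(infer_weight_class(make_lift(7.0, seed=42)))
print(infer_weight_class(make_lift(1.5, seed=43)))
```

Because the synthetic generator ties motion speed directly to weight, the two queries resolve to “heavy” and “light” respectively; with real skeleton sequences the cues are far subtler, which is why the paper learns them rather than hand-crafting them.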

