Abstract

Understanding the fine-grained temporal structure of human actions and its semantic interpretation benefits many real-world tasks, such as the analysis of sports movements, rehabilitation exercises, and daily-life activities. Current action segmentation methods mainly rely on deep neural networks to derive feature embeddings of actions from motion data, but work on analyzing human actions at fine granularity remains scarce because clear, generic definitions of subactions and corresponding datasets are missing. Moreover, the motion representations produced by current action segmentation methods lack the semantic or mathematical interpretability needed to evaluate action/subaction similarity in quantitative motion analysis. Toward fine-grained, interpretable, scalable, and efficient action segmentation, we propose a novel unsupervised action segmentation and distributed representation framework based on intuitive motion primitives defined on pose data. We further propose metrics for comprehensive evaluation of unsupervised fine-grained action segmentation performance, and we conduct experiments on both public and self-constructed datasets. The results show that the proposed method performs well and generalizes across different subjects, datasets, and application scenarios.
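The abstract does not specify how motion primitives are defined, so the sketch below is only a rough illustration of the general idea, not the authors' method: it quantizes per-joint displacements from a pose sequence into coarse symbols and places segment boundaries wherever the symbol pattern changes. The function names, the quantization rule, and the `eps` threshold are all assumptions introduced here for illustration.

```python
import numpy as np

def motion_primitives(poses, eps=0.01):
    # Hypothetical primitive: poses is a (T, J, D) array of J joint
    # positions in D dimensions over T frames. Each joint's per-frame
    # displacement is quantized per axis into -1 (decreasing),
    # 0 (near-static), or +1 (increasing).
    vel = np.diff(poses, axis=0)                              # (T-1, J, D)
    return np.sign(np.where(np.abs(vel) < eps, 0.0, vel)).astype(int)

def segment(primitives):
    # Place a boundary wherever any joint's primitive symbol changes
    # between consecutive frames.
    change = np.any(primitives[1:] != primitives[:-1], axis=(1, 2))
    return np.flatnonzero(change) + 1                         # boundary frame indices

# Toy example: one joint moves right, pauses, then moves left.
track = np.concatenate([np.linspace(0, 1, 10),
                        np.full(10, 1.0),
                        np.linspace(1, 0, 10)])
poses = track[:, None, None]                                  # (30, 1, 1)
print(segment(motion_primitives(poses)))                      # -> [ 9 20]
```

Such a symbolic representation is one way a primitive-based segmentation could remain interpretable: each segment is described by a small, human-readable motion code rather than an opaque learned embedding.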
