Abstract

Action segmentation is an important approach for understanding actions in video. Most conventional action recognition methods can recognize only a single action per input video, and therefore require pre-trimmed clips containing one type of action. In contrast, temporal action segmentation (TAS) aims to segment an untrimmed video sequence along the time axis, giving it broader application prospects across many fields. Most previously proposed TAS methods use only RGB video as input, but RGB features are not robust to diverse backgrounds. Skeleton-based features are more resilient because they carry no background information, yet this modality remains under-explored. To this end, we propose a motion-aware and temporal-enhanced spatial-temporal graph convolutional network for skeleton-based human action segmentation. Our framework contains a motion-aware module, a multi-scale temporal convolutional network, a temporal-enhanced graph convolutional network module, and a refinement module. It efficiently captures motion information and long-range dependencies from skeleton features while improving temporal modeling. Experiments on four publicly available datasets demonstrate the effectiveness of the proposed method. The code is available at https://github.com/11yxk/openpack.
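To make the described pipeline concrete, the sketch below composes the four components in PyTorch: motion cues from frame-to-frame skeleton differences, a graph convolution over the joint adjacency, parallel dilated temporal convolutions, and a residual refinement stage producing per-frame class logits. All class names, channel sizes, kernel widths, and the identity placeholder adjacency are illustrative assumptions, not the authors' implementation; the actual code is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MotionAware(nn.Module):
    """Concatenates joint coordinates with their frame-to-frame differences."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):                       # x: (N, C, T, V)
        motion = x[:, :, 1:] - x[:, :, :-1]     # per-frame motion
        motion = F.pad(motion, (0, 0, 1, 0))    # pad time axis to keep T
        return self.fuse(torch.cat([x, motion], dim=1))


class MultiScaleTCN(nn.Module):
    """Parallel dilated temporal convolutions covering several time scales."""
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=(9, 1),
                      padding=(4 * d, 0), dilation=(d, 1))
            for d in dilations
        )

    def forward(self, x):
        return sum(b(x) for b in self.branches) / len(self.branches)


class GCNBlock(nn.Module):
    """Spatial graph convolution over the skeleton adjacency matrix A."""
    def __init__(self, channels, A):
        super().__init__()
        self.register_buffer("A", A)            # (V, V) normalized adjacency
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):                       # x: (N, C, T, V)
        x = torch.einsum("nctv,vw->nctw", x, self.A)
        return F.relu(self.proj(x))


class SkeletonTAS(nn.Module):
    """End-to-end sketch: motion cues -> graph conv -> multi-scale temporal
    conv -> per-frame class logits, refined by a second temporal stage."""
    def __init__(self, in_channels, num_classes, num_joints):
        super().__init__()
        A = torch.eye(num_joints)               # placeholder adjacency
        self.embed = nn.Conv2d(in_channels, 64, kernel_size=1)
        self.motion = MotionAware(64)
        self.gcn = GCNBlock(64, A)
        self.tcn = MultiScaleTCN(64)
        self.head = nn.Conv1d(64, num_classes, kernel_size=1)
        self.refine = nn.Conv1d(num_classes, num_classes,
                                kernel_size=25, padding=12)

    def forward(self, x):                       # x: (N, C, T, V)
        x = self.embed(x)
        x = self.motion(x)
        x = self.tcn(self.gcn(x))
        x = x.mean(dim=-1)                      # pool joints -> (N, 64, T)
        logits = self.head(x)                   # frame-wise class scores
        return logits + self.refine(logits)     # residual refinement stage
```

For example, `SkeletonTAS(in_channels=3, num_classes=10, num_joints=25)` applied to a `(batch, 3, T, 25)` tensor of 3D joint coordinates returns `(batch, 10, T)` logits, one class distribution per frame, as TAS requires.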
