Abstract

Graph convolutional networks (GCNs) have drawn considerable attention for skeleton-based action recognition, achieving remarkable performance by adaptively learning the spatial features of human action dynamics. However, existing methods are limited in modelling the temporal sequence of human actions. To give adequate consideration to temporal factors in action modelling, a novel temporal-enhanced graph convolution network is presented. First, a causal convolution layer is introduced to prevent future information leakage at each time step, preserving the ordering of the input sequence. Second, a novel cross-spatial-temporal graph convolution layer is presented, which extends the adaptive graph from the spatial to the temporal domain to capture local cross-spatial-temporal dependencies among joints. Third, a temporal attention layer is designed to enhance the modelling of long-range temporal dependencies, helping the network focus directly on important time steps. Experimental results on three large-scale datasets, NTU-RGB+D, Kinetics-Skeleton, and UAV-Human, indicate that the authors' network improves accuracy with better generalisation over previous methods. The authors' code and data are available at https://github.com/xieyulai/TE-GCN.
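The causal-convolution idea mentioned in the abstract can be illustrated with a minimal, framework-free sketch: left-padding the temporal sequence by one less than the kernel length guarantees that each output step depends only on the current and past inputs. This is a hypothetical standalone illustration of the general technique, not the authors' implementation (which operates on multi-channel skeleton features, presumably via a deep-learning framework).

```python
def causal_conv1d(x, kernel):
    """Causal 1-D convolution over a temporal sequence.

    output[t] = sum_j kernel[j] * x[t - j], so output[t] depends only on
    x[t], x[t-1], ..., x[t-k+1] -- no future information leaks in.
    Zero left-padding of length k-1 keeps the output the same length as x.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)  # pad only on the past side
    return [
        sum(kernel[j] * padded[t + k - 1 - j] for j in range(k))
        for t in range(len(x))
    ]
```

Because the padding is applied only on the left, perturbing a future sample cannot change any earlier output, which is exactly the ordering guarantee the abstract describes.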
