An attentional spatial temporal graph convolutional network with co-occurrence feature learning for action recognition

Dong Tian, Long-Hua Ma, Xiao Chen, Zhe-Ming Lu

doi:10.1007/s11042-020-08611-4

Abstract

Action recognition plays a central role in intelligent surveillance systems, game control, human-computer interaction, and related applications. In this work, we design a multi-task framework that improves the recent Spatial-Temporal Graph Convolutional Network (ST-GCN) for skeleton-based action recognition by introducing an attention mechanism and co-occurrence feature learning. Specifically, one branch uses attention to emphasize the more discriminative features, while another branch aggregates co-occurrence features from all joints globally. Additionally, our multi-task framework exploits the inherent correlation between the branches to further improve classification accuracy and convergence speed. Experiments were carried out on the NTU RGB+D and Kinetics human action datasets. The results show that the accuracy of the proposed multi-task framework is distinctly higher than that of ST-GCN and other mainstream methods for 3D action recognition.

Full Text