Skeleton-based sign language recognition (SLR) is a challenging research area, mainly because of fast and complex hand movements. Graph convolution networks (GCNs) have recently been applied to skeleton-based SLR and have achieved remarkable performance. However, existing GCN-based SLR methods pay no explicit attention to hand topology, which plays an important role in sign language representation. To address this issue, we propose a novel hand-aware graph convolution network (HA-GCN) that focuses on the hand topological relationships of the skeleton graph. Specifically, a hand-aware graph convolution layer is designed to capture both global body and local hand information, in which two sub-graphs are defined and incorporated to represent hand topology information. In addition, to alleviate over-fitting, an adaptive DropGraph is designed in the construction of the hand-aware graph convolution block to remove spatial and temporal redundancy in the sign language representation. To further improve performance, joints, bones, and their motion information are modeled simultaneously in a multi-stream framework. Extensive experiments on two open-source datasets, AUTSL and INCLUDE, demonstrate that our proposed algorithm outperforms the state of the art by a significant margin. Our code is available at https://github.com/snorlaxse/HA-SLR-GCN.
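The two structural ideas named above, hand sub-graphs alongside the body graph and a multi-stream input built from joints, bones, and their motion, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the released code: the 7-joint toy skeleton, the edge list, the names `adjacency`, `build_streams`, and `hand_aware_aggregate`, and the choice to sum one graph-convolution term per graph.

```python
import numpy as np

# Hypothetical toy skeleton (illustrative only, not the layout used in the paper):
# 0 = torso, 1 = left wrist, 2/3 = left fingertips, 4 = right wrist, 5/6 = right fingertips
NUM_JOINTS = 7
BODY_EDGES = [(1, 0), (4, 0), (2, 1), (3, 1), (5, 4), (6, 4)]  # (child, parent)
LEFT_HAND = [1, 2, 3]
RIGHT_HAND = [4, 5, 6]


def adjacency(edges, num_joints, keep=None):
    """Symmetric adjacency with self-loops, optionally restricted to the joints in `keep`."""
    a = np.zeros((num_joints, num_joints))
    for i, j in edges:
        if keep is None or (i in keep and j in keep):
            a[i, j] = a[j, i] = 1.0
    for n in (range(num_joints) if keep is None else keep):
        a[n, n] = 1.0
    return a


def build_streams(joints):
    """Derive four input streams from joint coordinates of shape (T, V, C)."""
    bones = np.zeros_like(joints)
    for child, parent in BODY_EDGES:
        # A bone is the vector from the parent joint to the child joint.
        bones[:, child] = joints[:, child] - joints[:, parent]

    def motion(x):
        # Frame-to-frame difference; the last frame is zero-padded.
        m = np.zeros_like(x)
        m[:-1] = x[1:] - x[:-1]
        return m

    return {
        "joint": joints,
        "bone": bones,
        "joint_motion": motion(joints),
        "bone_motion": motion(bones),
    }


def hand_aware_aggregate(x, graphs, weights):
    """Assumed aggregation: sum of A_k @ X @ W_k over the body graph and hand sub-graphs."""
    return sum(a @ x @ w for a, w in zip(graphs, weights))


# Body graph plus one sub-graph per hand.
GRAPHS = [
    adjacency(BODY_EDGES, NUM_JOINTS),
    adjacency(BODY_EDGES, NUM_JOINTS, keep=LEFT_HAND),
    adjacency(BODY_EDGES, NUM_JOINTS, keep=RIGHT_HAND),
]
```

In this sketch each stream would feed its own network and the per-stream scores would be fused afterwards; the fusion rule and the adaptive DropGraph regularizer are not shown because the abstract does not specify them.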