Abstract
Human hand gestures are an efficient means of communication for human-computer interaction (HCI) applications. To this end, one of the main requirements is automatic hand pose estimation. Existing methods usually explore spatial relationships among hand joints in a single image to estimate the 3D hand pose; by doing so, the temporal constraints among hand poses are under-investigated. In this paper, we propose SST-GCN (Structure-aware Spatial-Temporal Graph Convolutional Network), which incorporates both spatial dependencies and temporal consistency to improve 3D hand pose estimation. Our method is based on an existing spatial-temporal GCN for 3D pose estimation. In addition, we introduce a new loss function that takes the geometric constraints of the hand structure into account. Our proposed method takes a 2D hand pose as input and estimates the 3D hand pose. Finally, we evaluate our method on the First-Person Hand Action Benchmark (FPHAB) dataset. The experimental results show that the proposed method gives promising results in comparison with the original ST-GCN network.