Existing gait recognition systems focus largely on extracting robust gait features from silhouette images, and they have achieved great success. However, gait can be sensitive to appearance factors such as clothing and carried items. Compared with appearance-based methods, model-based gait recognition is promising because it is robust to these variations. In recent years, advances in human pose estimation have mitigated the difficulty of model-based gait recognition. In this paper, to cope with a growing number of subjects and with view variation, we build local features and propose a siamese network that minimizes the distance between samples from the same subject. We leverage recent advances in action recognition to embed a human pose sequence into a vector, and we introduce Spatial-Temporal Graph Convolution Blocks (STGCB), which are commonly used in action recognition, to gait recognition. Experiments on the very large population dataset OUMVLP-Pose and the popular dataset CASIA-B show that our method achieves some state-of-the-art (SOTA) results in model-based gait recognition. The code and models of our method will be made available at https://github.com/timelessnaive/Gait-for-Large-Dataset after acceptance.