A new trend in long-range biometrics, gait recognition, is finding application in a number of different fields including video surveillance. Recently, with the increase in robustness of the pose estimator and the presence of various unpredictable factors in realistic gait recognition, skeleton-based methods with higher robustness have emerged to better meet the challenging gait recognition needs. However, existing approaches primarily focus on extracting global skeletal features, neglecting the intricate motion information of local body parts and overlooking inter-limb relationships. Our solution to these challenges is the dynamic local fusion network (GaitDLF), a novel gait neural network for complex environments that includes a detail-aware stream in addition to the previous direct extraction of global skeleton features, which provides an enhanced representation of gait features. To extract discriminative local motion information, we introduce predefined body part assignments for each joint in the skeletal structure. By segmenting and mapping the overall skeleton based on these limb site divisions, limb-level motion features can be obtained. In addition, we will dynamically fuse the motion features from different limbs and enhance the motion feature representation of each limb by global context information and local context information of the limb-level motion features. The ability to extract gait features between individuals can be improved by aggregating local motion features from different body parts. Based on experiments on CASIA-B, Gait3D, and GREW, we show that our model extracts more comprehensive gait features than the state-of-the-art skeleton-based method, demonstrating that our method is better suited to detecting gait in complex environments in the wild than the appearance-based method.
Read full abstract