The monitoring and management of farm workers play a crucial role in agricultural production. Real-time monitoring can effectively identify farm workers, facilitating standardized management and operation of the farm. Gait, a unique biometric feature that can be recognized from a distance, is better suited than facial or iris features for identifying individual farm workers walking at a distance on the farm. Despite growing research on gait recognition, existing gait recognition models remain weak in temporal modeling. Because the gait of workers walking through farmland is strongly influenced by the environment, modeling the temporal information of gait is crucial. In this study, we captured videos of 50 farm workers walking in real agricultural fields from 11 different viewpoints and compiled them into the Farm Worker Gait Dataset (FWGD). Building on the GaitBase model, we present a multi-scale temporal feature modeling method that outperforms existing gait recognition models on the FWGD dataset. Specifically, we propose an Adaptive Temporal Self-Attention (ATSA) mechanism that extracts the most representative features based on temporal contextual relations. Additionally, we utilize dilated convolutional layers to extract discrete temporal features and employ a Multi-scale Temporal Perception Module (MSTP) with three branches to perceive and aggregate temporal features. With the assistance of MSTP and ATSA, our model significantly improves the discriminability of temporal features, thereby enhancing farm worker gait recognition. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art gait recognition methods on the FWGD dataset.
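The abstract does not specify the internals of MSTP or ATSA, but the two underlying ideas it names, dilated temporal convolution aggregated over multiple branches and self-attention over temporal context, can be illustrated with a minimal pure-Python sketch. Everything here is a hypothetical simplification for intuition: scalar per-frame features stand in for feature maps, the kernel, the dilation rates (1, 2, 3), and the mean aggregation are assumptions, and none of the function names come from the paper.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dilated_conv1d(seq, kernel, dilation):
    """1-D dilated convolution with zero padding, preserving sequence length."""
    k = len(kernel)
    pad = (k - 1) * dilation // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    out = []
    for t in range(len(seq)):
        # Each tap skips `dilation - 1` frames, widening the temporal receptive field.
        out.append(sum(kernel[i] * padded[t + i * dilation] for i in range(k)))
    return out

def multiscale_temporal(seq, kernel=(0.25, 0.5, 0.25)):
    """Three-branch multi-scale perception: dilations 1, 2, 3, mean-aggregated.

    A stand-in for the MSTP idea: each branch sees a different temporal scale.
    """
    branches = [dilated_conv1d(seq, kernel, d) for d in (1, 2, 3)]
    return [sum(vals) / len(vals) for vals in zip(*branches)]

def temporal_self_attention(seq):
    """Scalar self-attention over frames: each frame re-weights all frames.

    A stand-in for the ATSA idea: output frames emphasize temporally
    related (here, similar-valued) frames via softmax-normalized scores.
    """
    out = []
    for q in seq:
        weights = softmax([q * k for k in seq])
        out.append(sum(w * v for w, v in zip(weights, seq)))
    return out
```

For a per-frame feature sequence, `multiscale_temporal` mixes neighborhoods at three dilation rates, and `temporal_self_attention` lets each frame aggregate context from the whole walking sequence; the actual modules operate on multi-channel silhouette features rather than scalars.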