Abstract
Video person retrieval aims to match video clips of the same person across non-overlapping camera views, where video sequences contain more comprehensive information, such as temporal cues. Extracting useful temporal cues is key to the success of a video person retrieval system. Gait, a biometric modality that captures the way people walk, carries informative temporal information. To date, it has not been clear how to fully utilize gait to boost the performance of video person retrieval. In this paper, to validate whether gait can help retrieve persons in videos, we build a two-stream architecture, named the appearance-gait network (AGNet), to jointly learn appearance features from RGB video clips and gait features from silhouette video clips. We further explore how to fully utilize gait features to enhance the video feature representation. Specifically, we propose an appearance-gait attention module (AGA) that fuses the two streams into a discriminative feature representation for the person retrieval task. Furthermore, to eliminate the need for silhouette video clips during inference, we propose a simple yet effective appearance-gait distillation module (AGD), which transfers the gait knowledge to the appearance stream. As a result, we can perform the enhanced video person retrieval without silhouette video clips, which makes inference more flexible and practical. To the best of our knowledge, our work is the first to successfully introduce such an appearance-gait knowledge distillation design for video person retrieval. We verify the effectiveness of the proposed methods on two large-scale, challenging benchmarks, MARS and DukeMTMC-VideoReID. Extensive experiments demonstrate performance superior or comparable to that of state-of-the-art methods, with a much simpler design. Source code is publicly available at https://github.com/yangyangkiki/Gait-Assisted-Video-Reid.
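To make the distillation idea concrete, the following is a minimal sketch of how gait knowledge might be transferred to the appearance stream during training. The abstract does not specify the form of the AGD objective, so the mean squared error on L2-normalized features used here is an assumption, and the names `l2_normalize`, `distillation_loss`, `total_loss`, and `weight` are hypothetical rather than taken from the released code.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit L2 norm (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def distillation_loss(appearance_feat, gait_feat):
    """Assumed objective: mean squared error between the normalized
    appearance (student) feature and the normalized gait (teacher) feature."""
    if len(appearance_feat) != len(gait_feat):
        raise ValueError("feature vectors must have the same length")
    student = l2_normalize(appearance_feat)
    teacher = l2_normalize(gait_feat)
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def total_loss(retrieval_loss, appearance_feat, gait_feat, weight=1.0):
    """Combine a retrieval loss with the distillation term.
    The weighting scheme is an illustrative assumption."""
    return retrieval_loss + weight * distillation_loss(appearance_feat, gait_feat)
```

In a sketch like this, the gait feature is needed only to compute the training loss, so inference can run on the appearance stream alone, in line with the goal of retrieval without silhouette clips.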
More From: IEEE Transactions on Circuits and Systems for Video Technology