STA: Spatial-Temporal Attention for Large-Scale Video-Based Person Re-Identification

Yang Fu, Yunchao Wei, Thomas Huang, Xiaoyang Wang

doi:10.1609/aaai.v33i01.33018287

Abstract

In this work, we propose a novel Spatial-Temporal Attention (STA) approach to tackle the large-scale person re-identification task in videos. Unlike most existing methods, which simply compute representations of video clips using frame-level aggregation (e.g., average pooling), the proposed STA adopts a more effective way of producing a robust clip-level feature representation. Concretely, STA fully exploits the discriminative parts of a target person in both the spatial and temporal dimensions, yielding a 2-D attention score matrix, obtained via inter-frame regularization, that measures the importance of each spatial part across different frames. A more robust clip-level feature representation is then generated by a weighted sum operation guided by this 2-D attention score matrix. In this way, challenging cases for video-based person re-identification, such as pose variation and partial occlusion, can be handled well by STA. We conduct extensive experiments on two large-scale benchmarks, MARS and DukeMTMC-VideoReID. In particular, the mAP reaches 87.7% on MARS, outperforming the previous state of the art by a large margin of more than 11.6%.
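
The difference between frame-level average pooling and a weighted sum guided by a 2-D (frames x spatial parts) attention score matrix can be illustrated with a minimal NumPy sketch. The abstract does not say how the scores are computed, so everything specific here is an assumption made for illustration: each part is scored by the L2 norm of its feature vector, scores are normalized across frames so each part's weights sum to 1, and the inter-frame regularization mentioned above is not modeled. The function names and the `(T, K, C)` layout (frames, parts, channels) are hypothetical, not taken from the paper.

```python
import numpy as np


def clip_feature_average(frame_feats):
    """Baseline: average part features over frames.

    frame_feats has shape (T, K, C): T frames, K spatial parts, C channels.
    Returns an array of shape (K, C).
    """
    return frame_feats.mean(axis=0)


def attention_scores(frame_feats, eps=1e-12):
    """Build a (T, K) attention score matrix.

    Assumption for this sketch: a part's raw score is the L2 norm of its
    feature vector, and scores are normalized over the frame axis so that
    each part's weights sum to (approximately) 1 across frames.
    """
    raw = np.linalg.norm(frame_feats, axis=2)            # shape (T, K)
    return raw / (raw.sum(axis=0, keepdims=True) + eps)  # shape (T, K)


def clip_feature_attention(frame_feats):
    """Weighted sum over frames, guided by the (T, K) score matrix."""
    scores = attention_scores(frame_feats)                 # shape (T, K)
    return (scores[:, :, None] * frame_feats).sum(axis=0)  # shape (K, C)


# Toy clip: 4 frames, 4 spatial parts, 8 channels, strictly positive features.
rng = np.random.default_rng(0)
feats = rng.random((4, 4, 8)) + 0.1

# Simulate partial occlusion: part 2 carries no signal in frame 0.
feats[0, 2, :] = 0.0

scores = attention_scores(feats)
clip_avg = clip_feature_average(feats)
clip_att = clip_feature_attention(feats)
```

In this toy setup the occluded part receives zero weight in frame 0, so it does not dilute the clip-level feature for that part, whereas average pooling gives every frame the same weight of 1/T regardless of content.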


Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jul 17, 2019
Citations: 168

