Temporal Correlation Vision Transformer for Video Person Re-Identification

Pengfei Wu, Le Wang, Gang Hua, Changyin Sun, Sanping Zhou

doi:10.1609/aaai.v38i6.28424

Abstract

Video Person Re-Identification (Re-ID) is the task of retrieving persons from multi-camera surveillance systems. Despite the progress made in leveraging spatio-temporal information in videos, occlusion in dense crowds still hinders further progress. To address this issue, we propose a Temporal Correlation Vision Transformer (TCViT) for video person Re-ID. TCViT consists of a Temporal Correlation Attention (TCA) module and a Learnable Temporal Aggregation (LTA) module. The TCA module is designed to reduce the impact of non-target persons by relative state, while the LTA module aggregates frame-level features based on their completeness. Specifically, TCA is a parameter-free module that first aligns frame-level features to restore semantic coherence in videos and then enhances the features of the target person according to temporal correlation. Additionally, unlike previous methods that treat every frame equally with a pooling layer, LTA introduces a lightweight learnable module that weighs and aggregates frame-level features under the guidance of a classification score. Extensive experiments on four prevalent benchmarks demonstrate that our method achieves state-of-the-art performance in video Re-ID.
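
The contrast the abstract draws between equal-weight pooling and score-guided aggregation can be illustrated with a small sketch. This is not the authors' LTA module: the abstract does not say how the weights are computed, so the sketch assumes a softmax over hypothetical per-frame scores, and the names `mean_pool`, `weighted_aggregate`, `features` and `scores` are invented for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(frame_features):
    """Baseline: every frame contributes equally to the video-level feature."""
    t = len(frame_features)
    dim = len(frame_features[0])
    return [sum(f[d] for f in frame_features) / t for d in range(dim)]


def weighted_aggregate(frame_features, frame_scores):
    """Weight each frame by a softmax over its score, then sum.

    Assumption: the per-frame scores are given. In the paper they would come
    from a learnable module guided by a classification score; that module is
    not reproduced here.
    """
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, frame_features))
        for d in range(dim)
    ]


# Toy data (hypothetical): three frames with 2-D features. The third frame
# stands in for an occluded frame and receives a low score.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
scores = [2.0, 2.0, -2.0]

pooled = mean_pool(features)
aggregated = weighted_aggregate(features, scores)
```

With these toy values, the low-scored frame contributes far less to `aggregated` than to `pooled`, which is the behaviour the abstract attributes to weighting frames by completeness rather than treating them equally.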

More From: Proceedings of the AAAI Conference on Artificial Intelligence

Similar Papers

Efficient lightweight video person re-identification with online difference discrimination module
Cunyuan Gao ... Jiaqi Zhao
Multimedia Tools and Applications | VOL. 81
30 Jan 2021

VRSTC: Occlusion-Free Video Person Re-Identification
Ruibing Hou ... Xinqian Gu
01 Jun 2019

Iterative Local-Global Collaboration Learning towards One-Shot Video Person Re-Identification.
Meng Liu ... Leigang Qu
IEEE transactions on image processing : a publication of the IEEE Signal Processing Society | VOL. PP
01 Jan 2020

Situational diversity in video person re-identification: introducing MSA-BUPT dataset
Ruining Zhao ... Fei Su
Complex & Intelligent Systems | VOL. 10
23 May 2024
