An adaptive cross‐scale transformer based on graph signal processing for person re‐identification

Wei Zhou, Yi Hou, Shijun Xu, Shilin Zhou

doi:10.1049/ipr2.12794

Abstract

Extracting robust feature representations is one of the key challenges in the person re‐identification (ReID) task. Although convolutional neural network (CNN)‐based methods have achieved great success, they still cannot handle the partial occlusion and misalignment caused by the limited receptive field. Recently, pure transformer models have shown their power in the person ReID task. However, current transformer models adopt equal‐scale patches as input and cannot properly solve the problem of cross‐scale interaction. To overcome this problem, an adaptive cross‐scale transformer designed from the perspective of graph signal processing, named ACSFormer, is proposed. Specifically, the self‐attention module is first treated as an undirected, fully connected graph. Then, “node variation” is introduced as an indicator to adaptively merge neighbourhood tokens. To the best of the authors’ knowledge, ACSFormer is the first work to combine pure transformers and graph signal processing in the field of person ReID. Extensive evaluations are conducted on three person ReID datasets to validate the performance of ACSFormer. Experiments demonstrate that ACSFormer performs on par with state‐of‐the‐art CNN‐based methods and consistently improves on the transformer‐based baseline, for example, surpassing the ViT baseline by 2.5%, 2.7% and 4.8% mAP on Market1501, DukeMTMC‐reID and MSMT17, respectively.
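
The abstract names two steps: treating self‐attention as an undirected, fully connected graph, and using “node variation” to decide which neighbouring tokens to merge. The NumPy sketch below is one possible reading of those steps and is not the authors' implementation. The abstract does not define node variation, so the formula used here (attention‐weighted squared distance from a token to all other tokens) is an assumption. The function names, the symmetrisation of the attention matrix, the 1‐D notion of neighbourhood, the pairwise averaging and the threshold are also hypothetical.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attention_graph(tokens):
    """Edge weights of an undirected, fully connected graph over tokens.

    Assumption: scaled dot-product attention with identity projections,
    symmetrised so that w_ij == w_ji.
    """
    d = tokens.shape[1]
    attn = softmax(tokens @ tokens.T / np.sqrt(d))
    return 0.5 * (attn + attn.T)


def node_variation(tokens, weights):
    """Assumed definition: v_i = sum_j w_ij * ||x_i - x_j||^2."""
    diff = tokens[:, None, :] - tokens[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return (weights * sq_dist).sum(axis=1)


def merge_smooth_pairs(tokens, variation, threshold):
    """Average adjacent token pairs when both variations are below threshold.

    Assumption: "neighbourhood" means adjacency in the 1-D token sequence.
    """
    merged, i, n = [], 0, len(tokens)
    while i < n:
        both_smooth = (
            i + 1 < n
            and variation[i] < threshold
            and variation[i + 1] < threshold
        )
        if both_smooth:
            merged.append(0.5 * (tokens[i] + tokens[i + 1]))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return np.stack(merged)


# Toy usage: 8 random tokens of dimension 16 (illustrative data only).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 16))
weights = attention_graph(tokens)
variation = node_variation(tokens, weights)
coarse = merge_smooth_pairs(tokens, variation, threshold=np.median(variation))
print(tokens.shape, coarse.shape)
```

In this sketch, tokens in smooth regions of the graph signal (low variation) are merged into coarser tokens, while tokens in high‐variation regions keep their original scale, which gives a mix of scales within one sequence.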

Journal: IET Image Processing	Publication Date: Mar 24, 2023
License type: CC BY 4.0

Similar Papers

Person re-ID while Crossing Different Cameras: Combination of Salient-Gaussian Weighted BossaNova and Fisher Vector Encodings
Mahmoud Mejdoub ... Salma Ksibi
International Journal of Advanced Computer Science and Applications | VOL. 8
01 Jan 2017

DCR: A Unified Framework for Holistic/Partial Person ReID
Zan Gao ... Shengyong Chen
IEEE Transactions on Multimedia | VOL. 23
01 Jan 2020

Person re-identification based on frequency channel attention networks under the surveillance scenario
Shengbo Chen ... Hongchang Zhang
Journal of Physics: Conference Series | VOL. 1966
01 Jul 2021

GLAD: Global–Local-Alignment Descriptor for Scalable Person Re-Identification
Longhui Wei ... Hantao Yao
IEEE Transactions on Multimedia | VOL. 21
01 Apr 2019
