Abstract

Vehicle re-identification (ReID) aims to retrieve images of the same vehicle across different cameras and can be regarded as the most fine-grained ID-level classification task. It is fundamentally challenging because a vehicle with the same ID can show large appearance differences (especially across viewpoints), while vehicles with different IDs may differ only subtly. Spatial attention mechanisms, which have proven effective in many computer vision tasks, also play an important role in vehicle ReID; however, they often require expensive key-point labels or suffer from noisy attention masks when trained without them. In this work, we propose a transformer-based attention network (TAN) that learns spatial attention information and thereby facilitates the learning of discriminative features for vehicle ReID. Specifically, in contrast to previous studies that adopt a transformer network, we design the attention network as an independent branch that can be flexibly used in various tasks. Moreover, we combine the TAN with two other branches: one that extracts global features describing image-level structure, and one that extracts auxiliary side-attribute features that are invariant to viewpoint, such as color and car type. To validate the proposed approach, experiments were conducted on two vehicle datasets (VeRi-776 and VehicleID) and a person dataset (Market-1501). The experimental results demonstrate that the proposed TAN improves performance on both the vehicle and person ReID tasks, and the proposed method achieves state-of-the-art (SOTA) performance.
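
To make the three-branch design concrete, the sketch below shows one plausible layout in PyTorch: a shared CNN backbone feeding (i) a global-feature branch, (ii) a transformer-based attention branch over the flattened spatial feature map, and (iii) a side-attribute branch predicting viewpoint-invariant attributes such as color and car type. This is a minimal sketch under stated assumptions; the class name ThreeBranchReID, the ResNet-50 backbone, the head sizes, and all hyperparameters are illustrative and are not the authors' implementation.

```python
# Minimal sketch of a three-branch ReID model: global branch, transformer
# attention branch (TAN-like), and side-attribute branch. All names and
# hyperparameters are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn
import torchvision

class ThreeBranchReID(nn.Module):
    def __init__(self, num_ids, num_colors=10, num_types=9, dim=2048):
        super().__init__()
        # Shared backbone: ResNet-50 up to the last conv stage (assumption).
        resnet = torchvision.models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])

        # Branch 1: global features via average pooling.
        self.global_pool = nn.AdaptiveAvgPool2d(1)

        # Branch 2: transformer encoder over the spatial tokens,
        # standing in for the attention branch.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=8, batch_first=True)
        self.tan = nn.TransformerEncoder(encoder_layer, num_layers=2)

        # Branch 3: viewpoint-invariant side attributes (color, car type).
        self.color_head = nn.Linear(dim, num_colors)
        self.type_head = nn.Linear(dim, num_types)

        # ID classifier on the concatenated global + attention features.
        self.id_head = nn.Linear(dim * 2, num_ids)

    def forward(self, x):
        fmap = self.backbone(x)                   # (B, C, H, W)
        g = self.global_pool(fmap).flatten(1)     # global feature (B, C)

        tokens = fmap.flatten(2).transpose(1, 2)  # (B, H*W, C)
        att = self.tan(tokens).mean(dim=1)        # attention feature (B, C)

        feat = torch.cat([g, att], dim=1)         # ReID embedding
        return self.id_head(feat), self.color_head(g), self.type_head(g)

# Illustrative usage with a dummy batch; num_ids is an arbitrary example.
model = ThreeBranchReID(num_ids=576)
logits_id, logits_color, logits_type = model(torch.randn(2, 3, 256, 256))
```

In such a setup, the ID logits would typically be trained with a classification or metric-learning loss, while the color and type heads provide auxiliary supervision that encourages viewpoint-invariant features, in line with the role of the side-attribute branch described above.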
