Abstract

As a sub-direction of image retrieval, person re-identification (Re-ID) is commonly used to address the security problem of cross-camera tracking and monitoring. A growing number of shopping centers have recently attempted to apply Re-ID technology. One development trend among related algorithms is the use of an attention mechanism to capture global and local features. We observe that these algorithms have clear limitations: they focus only on the most salient features and do not consider certain detailed features. People’s clothes, bags and even shoes are of great help in distinguishing pedestrians, yet global features tend to overshadow these important local features. Therefore, we propose a dual branch network based on a multi-scale attention mechanism, which captures both the apparent global features and the inconspicuous local features of pedestrian images. Specifically, we design a dual branch attention network (DBA-Net) whose two branches simultaneously optimize the features extracted at different depths. We also design an effective block, called channel, position and spatial-wise attention (CPSA), which captures key fine-grained information such as bags and shoes. Furthermore, in addition to ID loss, we use complementary triplet loss and adaptive weighted rank list loss (WRLL) on each branch during training. DBA-Net not only learns semantic context information in the channel, position and spatial dimensions but also integrates detailed semantic information by learning the dependency relationships between features. Extensive experiments on three widely used open-source datasets show that DBA-Net achieves state-of-the-art performance overall. In particular, on the CUHK03 dataset, DBA-Net achieves a mean average precision (mAP) of 83.2%.
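For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how mAP is typically computed for a retrieval task. It is not the authors' evaluation code: the function names, the data layout (one ranked list per query identity) and the toy rankings are assumptions made for illustration, and the same-camera filtering used by standard Re-ID protocols is omitted.

```python
from typing import Dict, List


def average_precision(ranked_ids: List[int], query_id: int) -> float:
    """AP for one query: mean of precision@k over the ranks k where a match occurs."""
    hits = 0
    precision_sum = 0.0
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Dict[int, List[int]]) -> float:
    """mAP: average of the per-query AP values.

    Keys are query identities and values are gallery identities sorted by
    decreasing similarity (a hypothetical layout with one query per identity).
    """
    if not rankings:
        return 0.0
    total = sum(average_precision(ranked, qid) for qid, ranked in rankings.items())
    return total / len(rankings)


# Toy rankings, invented for illustration only.
toy_rankings = {
    1: [1, 2, 1, 3],  # matches at ranks 1 and 3 -> AP = (1/1 + 2/3) / 2
    2: [3, 2, 1, 2],  # matches at ranks 2 and 4 -> AP = (1/2 + 2/4) / 2
}
toy_map = mean_average_precision(toy_rankings)
```

A higher mAP means that correct matches for each query appear, on average, nearer the top of the ranked gallery.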

Highlights

  • Building on an extensive study of attention mechanism algorithms, we found that the position attention module, once proposed, was effective

  • After comparing heat map results with those of other algorithms, we found that CPSA paid more attention to the features of people’s clothes, bags and shoes, while other algorithms paid little attention to these features

  • We proposed a novel attention network (DBA-Net)


Summary

Introduction

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. The Re-ID task aims to retrieve, from the gallery (the candidate image set), the images most likely to belong to the same pedestrian. Common challenges include background interference, viewpoint changes, variation in light intensity, body posture changes and occlusion [1]. Deep learning algorithms [2,3,4,5,6,7,8] have made significant progress on Re-ID. One development trend among related algorithms is the use of an attention mechanism to capture global and local features. Global features can directly represent changes in the appearance and spatial position of an image. Yang et al. [4]

