Abstract

In person re-identification (Re-ID), increasing the diversity of pedestrian features can improve recognition accuracy. In standard convolutional neural networks (CNNs), the receptive fields of the neurons in each layer are designed to have the same size; consequently, on complex pedestrian re-identification tasks, standard CNNs extract local features well but fail to obtain satisfactory global features from the images. Local feature learning methods help obtain more abundant features, but they focus on the most salient local features and ignore the correlations between features of different parts of the human body. To solve these problems, a new multiscale reference-aided attentive feature aggregation (MS-RAFA) mechanism is proposed, consisting of three main modules. First, to extract the most salient local features and strengthen the correlations between the features of different body parts, an autoselect module (ASM) is designed: an attention mechanism that stacks structural information and spatial relations to form new features. Second, to fuse the multiscale features from the multiple output branches of the backbone network and increase feature diversity, we propose a multilayer feature fusion module (MFFM), which enables the model to mine the features hidden behind salient ones and to learn features better. Third, to supervise the MFFM and guide the network toward more discriminative features, we propose a multiple supervision mechanism. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods on three large-scale datasets.
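The abstract gives no formulas for the ASM or the MFFM, so the following is only a minimal array-level sketch of the two generic operations it names: spatially reweighting a feature map with an attention mask, and fusing feature maps from branches of different resolution by resampling them to a common grid and concatenating along channels. All function names, the softmax-over-channel-mean attention, and the nearest-neighbour resampling are illustrative assumptions, not the paper's actual modules.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(feat):
    """Toy spatial attention (assumed form, not the paper's ASM):
    score each location by its channel-mean activation, softmax over
    locations, then reweight the map. Rescaling by h*w keeps the
    overall magnitude comparable to the input."""
    c, h, w = feat.shape
    scores = feat.mean(axis=0).reshape(-1)        # (h*w,)
    weights = softmax(scores).reshape(1, h, w)    # broadcast over channels
    return feat * weights * (h * w)

def fuse_multiscale(feats, out_hw=(8, 4)):
    """Toy multiscale fusion (assumed form, not the paper's MFFM):
    nearest-neighbour resample every branch's (c, h, w) map to a
    common spatial grid, then concatenate along the channel axis."""
    oh, ow = out_hw
    resized = []
    for f in feats:
        c, h, w = f.shape
        ri = np.arange(oh) * h // oh              # row indices to sample
        ci = np.arange(ow) * w // ow              # column indices to sample
        resized.append(f[:, ri][:, :, ci])        # (c, oh, ow)
    return np.concatenate(resized, axis=0)        # channels stack up

# Two backbone branches at different resolutions (random stand-ins).
rng = np.random.default_rng(0)
branches = [rng.standard_normal((16, 16, 8)),
            rng.standard_normal((32, 8, 4))]
attended = [spatial_attention(f) for f in branches]
fused = fuse_multiscale(attended)
print(fused.shape)  # (48, 8, 4): 16 + 32 channels on an 8x4 grid
```

In a real implementation the attention weights and the fusion would be learned (e.g. by convolutions) rather than fixed functions; the sketch only shows how attended maps of different scales can end up in one channel-stacked tensor that later supervision heads can draw on.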

Highlights

  • Person re-identification (Re-ID), which forms the core of video surveillance technology, draws on image processing, computer vision, pattern recognition, machine learning and other related technologies to solve cross-camera and cross-scene pedestrian retrieval problems

  • To address the above deficiencies, in this paper we present a new multiscale reference-aided attentive feature aggregation (MS-RAFA) mechanism that enables the network to adaptively extract all potential salient pedestrian features

  • We propose a new multiscale reference-aided attentive feature aggregation (MS-RAFA) mechanism that includes three main modules: the autoselect module (ASM), the multilayer feature fusion module (MFFM) and a multiple supervision module


Summary

Introduction

Person re-identification (Re-ID), which forms the core of video surveillance technology, draws on image processing, computer vision, pattern recognition, machine learning and other related technologies to solve cross-camera and cross-scene pedestrian retrieval problems. Recognition methods based on visual features are more reliable than those based on information such as carried items or clothing, and can therefore be applied more dependably in Re-ID [1,2,3,4]. With the growing deployment of video capture systems, video-based Re-ID achieves more robust performance, and many scholars have developed improved pedestrian re-identification methods with very good results. However, under different viewpoints, low image resolution, illumination changes, unconstrained pose changes and occlusion, the recognition performance is still not ideal [5,6,7,8,9].

