Extended Global–Local Representation Learning for Video Person Re-Identification

Wanru Song, Yahong Wu, Changhong Chen, Jieying Zheng, Feng Liu

doi:10.1109/access.2019.2937974

Open Access

https://doi.org/10.1109/access.2019.2937974

Abstract

Recently, person re-identification has become one of the research hotspots in the field of computer vision and has received extensive attention in the academic community. Inspired by the part-based research of image ReID, this paper presents a novel feature learning and extraction framework for video-based person re-identification, namely, the extended global-local representation learning network (E-GLRN). Given a video sequence of a pedestrian, the holistic and local features are simultaneously extracted using the E-GLRN network. Specifically, for the global feature learning, we adopt the channel attention convolutional neural network (CNN) and the bidirectional long short-term memory (Bi-LSTM) networks, which are responsible for introducing a CNN-LSTM module to learn the features of consecutive frames. The local feature learning module relies on the key local information extraction, which is based on the Bi-LSTM networks. In order to obtain the local feature more effectively, our work defines a concept of “the main image group” by selecting three representative frames. The local feature representation of a video is obtained by exploiting the spatial contextual and appearance information of this group. The local and global features extracted in this paper are complementary and further combined into a discriminative and robust feature representation of the video sequence. Extensive experiments are conducted on three video-based ReID datasets, including the iLIDS-VID, PRID2011 and MARS datasets. The experimental results demonstrate that the proposed method outperforms state-of-the-art video-based re-identification approaches.
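The combination of global and local features described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two features are combined, so the concatenation of L2-normalized vectors below is an assumption, as are the function names `temporal_mean`, `l2_normalize` and `fuse_global_local`. Mean pooling over frames is used only as a stand-in for the CNN and Bi-LSTM modules, which are not reproduced here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def temporal_mean(frame_features):
    """Average per-frame feature vectors into one sequence-level vector.

    Stand-in only: the paper learns sequence features with a CNN-LSTM module.
    """
    n = len(frame_features)
    return [sum(column) / n for column in zip(*frame_features)]


def fuse_global_local(global_feat, local_feat):
    """Concatenate the two normalized features (assumed fusion rule, not the paper's)."""
    return l2_normalize(global_feat) + l2_normalize(local_feat)


# Hypothetical toy features: two frames with 2-D descriptors, plus a 2-D local feature.
global_feat = temporal_mean([[2.0, 3.0], [4.0, 5.0]])  # -> [3.0, 4.0]
local_feat = [0.0, 5.0]
video_descriptor = fuse_global_local(global_feat, local_feat)
```

Normalizing each part before concatenation keeps either feature from dominating a distance computed on the fused descriptor, which is one common reason to choose this design.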

Highlights

Person re-identification (ReID) is one of the most important and popular fields of computer vision, with tremendous application potential in video surveillance [1]–[5]
Given a video sequence I = {I_1, I_2, ..., I_K} with K frames, it can be observed that our framework is divided into four parts, including the global feature learning in the video, representative frame extraction for the video, local feature learning in the video and overall feature representation learning
Considering that a query has multiple ground truths, we evaluate the performance on MARS with the average cumulative match characteristic (CMC) curve and the mean average precision
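The two evaluation measures named in the last highlight can be sketched as follows. This is a simplified illustration, not the paper's evaluation code: the function names and the toy rankings are hypothetical, and the filtering of gallery samples that benchmark protocols usually apply (for example, discarding same-camera matches) is omitted. Each query is assumed to come with a gallery already sorted by increasing distance.

```python
def average_precision(ranked_ids, query_id):
    """Average precision for one query that may have several correct matches."""
    hits = 0
    precision_sum = 0.0
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / rank  # precision at this correct match
    return precision_sum / hits if hits else 0.0


def cmc_curve(ranked_ids, query_id, max_rank):
    """Return 1.0 at rank r if a correct match appears within the top r, else 0.0."""
    curve = []
    found = False
    for gallery_id in ranked_ids[:max_rank]:
        found = found or gallery_id == query_id
        curve.append(1.0 if found else 0.0)
    # Pad when the gallery is shorter than max_rank.
    curve += [curve[-1] if curve else 0.0] * (max_rank - len(curve))
    return curve


def evaluate(rankings, max_rank=5):
    """Average the CMC curves and the AP values over all queries."""
    n = len(rankings)
    cmc_sum = [0.0] * max_rank
    ap_sum = 0.0
    for query_id, ranked_ids in rankings:
        curve = cmc_curve(ranked_ids, query_id, max_rank)
        cmc_sum = [a + b for a, b in zip(cmc_sum, curve)]
        ap_sum += average_precision(ranked_ids, query_id)
    return [v / n for v in cmc_sum], ap_sum / n


# Hypothetical toy rankings: (query identity, gallery identities sorted by distance).
rankings = [
    ("a", ["a", "b", "a"]),  # two ground truths, at ranks 1 and 3
    ("b", ["a", "b", "c"]),  # one ground truth, at rank 2
]
cmc, mean_ap = evaluate(rankings, max_rank=3)
```

CMC only records the rank of the first correct match, whereas average precision rewards retrieving all ground truths early, which is why both are reported when a query has multiple ground truths.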

Summary

Introduction

Person re-identification (ReID) is one of the most important and popular fields of computer vision, with tremendous application potential in video surveillance [1]–[5]. Most ReID studies have emerged over the past few years. They can be divided into two classes: designing a robust metric learning method or developing a discriminative feature. Metric learning research focuses on learning the distance function [6]–[9], which helps to ensure that the distance between two features coming from the same pedestrian is smaller than the distance between features from different pedestrians. Feature learning aims at building an effective and discriminative representation to describe pedestrians. Among hand-crafted features, low-level descriptors such as color and texture histograms are widely used in the ReID task. With the extensive application of the convolutional neural network (CNN) in the visual classification task, some work regards ReID as a multi-class

The associate editor coordinating the review of this article and approving it for publication was Yongming Li.

Journal: IEEE Access
Publication Date: Jan 1, 2019
Citations: 6
License type: CC BY 4.0
