Ensemble Learning-Based Person Re-identification with Multiple Feature Representations

Dapeng Tao, Xiaofang Liu, Qiongwei Ye, Yun Yang

doi:10.1155/2018/5940181


Open Access

https://doi.org/10.1155/2018/5940181


Abstract

As an important application in video surveillance, person reidentification enables automatic tracking of a pedestrian across disjoint camera views. It consists essentially of extracting or learning feature representations, followed by a matching model that uses a distance metric. Person reidentification is difficult for two reasons. First, no universal feature representation can perfectly identify the large number of pedestrians in a gallery captured by a multicamera system. Although different features can be fused into a composite representation, such fusion does not fully exploit the differences, complementarity, and relative importance of the individual features. Second, a matching model usually has only a limited number of training samples from which to learn a distance metric for matching probe images against a gallery, which leads to an unstable learning process and poor matching results. In this paper, we address these issues using ensemble theory: we explore the importance of different feature representations and reconcile several matching models, each built on a different feature representation, into an optimal one via our proposed weighting scheme. We evaluated the approach on two well-recognized person reidentification benchmark datasets, VIPeR and ETHZ. The experimental results demonstrate that our approach achieves state-of-the-art performance.

Highlights

Person reidentification aims to recognize and associate a target pedestrian on different occasions after that pedestrian has appeared in several cameras with nonoverlapping views.
We address the issues of person reidentification using ensemble theory: we explore the importance of different feature representations and reconcile several matching models, each built on a different feature representation, into an optimal one via our proposed weighting scheme.
Our experimental results show that the matching model built on the ensemble of four selected feature representations significantly outperforms one built on a single representation; a single matching model working on a composite representation, formed by concatenating the four selected feature representations, is often inferior to an ensemble of multiple matching models on different representations.
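
The idea of reconciling several matching models through weights can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' method: the function names, the feature names ("color", "texture"), the use of plain Euclidean distance in place of a learned metric, the min-max normalization, and the hand-set weights are all hypothetical. The weighting scheme proposed in the paper is not described in this summary and is not reproduced here.

```python
from math import sqrt


def euclidean(a, b):
    # Stand-in for a learned distance metric (assumption for illustration).
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_normalize(values):
    # Rescale distances to [0, 1] so that different features are comparable.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def ensemble_rank(probe, gallery, weights):
    """Rank gallery entries for one probe by a weighted sum of distances.

    probe:   dict mapping feature name -> feature vector
    gallery: list of dicts with the same feature names
    weights: dict mapping feature name -> non-negative weight (hypothetical,
             hand-set here; the paper learns its own weighting)
    Returns gallery indices ordered from best to worst match.
    """
    total = sum(weights.values())
    fused = [0.0] * len(gallery)
    for name, w in weights.items():
        # One "matching model" per feature representation.
        dists = [euclidean(probe[name], g[name]) for g in gallery]
        for i, d in enumerate(min_max_normalize(dists)):
            fused[i] += (w / total) * d
    return sorted(range(len(gallery)), key=lambda i: fused[i])


if __name__ == "__main__":
    probe = {"color": [0.0, 0.0], "texture": [1.0]}
    gallery = [
        {"color": [1.0, 0.0], "texture": [1.0]},
        {"color": [0.0, 0.0], "texture": [3.0]},
        {"color": [0.2, 0.0], "texture": [1.5]},
    ]
    print(ensemble_rank(probe, gallery, {"color": 0.8, "texture": 0.2}))
```

In this toy setup, changing the weights changes which gallery entry is ranked first, which is the sense in which the importance assigned to each representation affects the final match.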

Summary

Introduction

Person reidentification aims to recognize and associate a target pedestrian on different occasions after that pedestrian has appeared in several cameras with nonoverlapping views. It has become increasingly popular in the community because of its practical applications and research significance. Many researchers have studied the topic at both the feature level and the measurement level and have proposed a variety of approaches to improve the performance of person reidentification systems. These approaches still face many challenges in practical applications: (a) cluttered public scenes, pedestrians with similar characteristics, and occluded pedestrians, and (b) obvious changes in appearance caused by different lighting conditions, camera parameters, body posture, and so on. To solve these problems, researchers have focused on (1) finding optimal feature representations and (2) developing robust matching models that achieve promising accuracy.

Results

Discussion

Conclusion