Abstract

We consider the problem of automatically re-identifying a person of interest seen in a “probe” camera view among several candidate people in a “gallery” camera view. This problem, called person re-identification, is of fundamental importance in several video analytics applications. While extracting knowledge from high-dimensional visual representations based on the notions of sparsity and regularization has been successful for several computer vision problems, such techniques have not been fully exploited in the context of the re-identification problem. Here, we develop a principled algorithm for the re-identification problem in the general framework of learning sparse visual representations. Given a set of feature vectors for a person in one camera view (corresponding to multiple images as they are tracked), we show that a feature vector representing the same person in another view approximately lies in the linear span of this feature set. Furthermore, under certain conditions, the associated coefficient vector can be characterized as being block sparse. This key insight allows us to design an algorithm based on block sparse recovery that achieves state-of-the-art results in multi-shot person re-identification. We also revisit an older feature transformation technique, Fisher discriminant analysis, and show that, when combined with our proposed formulation, it outperforms many sophisticated methods. Additionally, we show that the proposed algorithm is flexible and can be used in conjunction with existing metric learning algorithms, resulting in improved ranking performance. We perform extensive experiments on several publicly available datasets to evaluate the proposed algorithm.
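
The block-sparse idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a generic group-lasso objective, 0.5*||y - Dx||^2 + lam * sum_g ||x_g||_2, solved by plain proximal gradient descent, and it ranks gallery identities by per-block reconstruction residual. The function names (`block_sparse_codes`, `rank_identities`), the parameter `lam`, and the residual-based ranking rule are all hypothetical choices made for illustration; the exact formulation and solver in the paper may differ.

```python
import numpy as np


def block_sparse_codes(D, y, block_ids, lam=0.01, n_iter=500):
    """Approximately minimise 0.5*||y - D x||^2 + lam * sum_g ||x_g||_2.

    D         : (d, n) matrix whose columns are gallery feature vectors
    y         : (d,) probe feature vector
    block_ids : (n,) identity label of each column of D (assumed layout)
    """
    block_ids = np.asarray(block_ids)
    groups = [np.flatnonzero(block_ids == g) for g in np.unique(block_ids)]
    x = np.zeros(D.shape[1])
    # Step size 1/L, where L is the squared spectral norm of D.
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)
    for _ in range(n_iter):
        # Gradient step on the least-squares term.
        z = x - step * (D.T @ (D @ x - y))
        # Proximal step: shrink each identity block as a whole.
        for idx in groups:
            norm = np.linalg.norm(z[idx])
            scale = max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
            z[idx] = scale * z[idx]
        x = z
    return x


def rank_identities(D, y, block_ids, lam=0.01):
    """Rank gallery identities by how well their block alone reconstructs y."""
    block_ids = np.asarray(block_ids)
    x = block_sparse_codes(D, y, block_ids, lam)
    ids = np.unique(block_ids)
    residuals = np.array([
        np.linalg.norm(y - D[:, block_ids == g] @ x[block_ids == g])
        for g in ids
    ])
    return ids[np.argsort(residuals)]  # best match first


if __name__ == "__main__":
    # Synthetic example: 3 identities, 4 gallery vectors each, 20-dim features.
    rng = np.random.default_rng(0)
    block_ids = np.repeat(np.arange(3), 4)
    D = rng.standard_normal((20, 12))
    D /= np.linalg.norm(D, axis=0)
    # The probe lies (approximately) in the span of identity 1's block.
    coeffs = np.array([0.5, 0.2, 0.2, 0.1])
    y = D[:, block_ids == 1] @ coeffs + 0.01 * rng.standard_normal(20)
    print(rank_identities(D, y, block_ids))
```

The group-wise shrinkage is what encourages the coefficient vector to concentrate on a single identity's block, rather than spreading over individual gallery vectors as an ordinary l1 penalty would.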
