Abstract

In this paper, we propose a multi-view structural local subspace tracking algorithm based on sparse representation. We approximate the optimal state from three views: (1) the template view; (2) the PCA (principal component analysis) basis view; and (3) the target candidate view. We then propose a unified objective function that integrates these three view problems. The proposed model not only exploits the intrinsic relationship among target candidates and their local patches, but also takes advantage of both sparse representation and incremental subspace learning. The resulting optimization problem can be solved efficiently by customized APG (accelerated proximal gradient) methods in an iterative manner. We then propose an alignment-weighting average method to obtain the optimal state of the target. Furthermore, an occlusion detection strategy is proposed to update the model accurately. Both qualitative and quantitative evaluations demonstrate that our tracker outperforms state-of-the-art trackers in a wide range of tracking scenarios.
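To make the role of the APG solver concrete, the following is a minimal sketch of accelerated proximal gradient (APG/FISTA) updates applied to a generic l1-regularized least-squares problem, i.e., sparse coding of an observation y over a template dictionary D. It illustrates the general optimization tool named in the abstract only, not the authors' customized multi-view objective; the dictionary, observation, regularization weight, and iteration count are placeholder assumptions.

```python
# Minimal APG/FISTA sketch for min_a 0.5*||D a - y||^2 + lam*||a||_1.
# Generic illustration only; D, y, lam, and n_iters are placeholders.
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t*||.||_1 (element-wise soft thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def apg_lasso(D, y, lam=0.1, n_iters=200):
    """Solve the l1-regularized least-squares problem with Nesterov acceleration."""
    n = D.shape[1]
    a = np.zeros(n)                      # current sparse code estimate
    z = a.copy()                         # extrapolated point
    t = 1.0                              # momentum parameter
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the smooth gradient
    for _ in range(n_iters):
        grad = D.T @ (D @ z - y)                           # gradient at the extrapolated point
        a_next = soft_threshold(z - grad / L, lam / L)     # proximal (shrinkage) step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = a_next + ((t - 1.0) / t_next) * (a_next - a)   # extrapolation step
        a, t = a_next, t_next
    return a

# Example: represent an observed patch y as a sparse combination of templates in D.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 20))        # 20 vectorized templates of 64 pixels each
y = D[:, 3] + 0.05 * rng.standard_normal(64)
alpha = apg_lasso(D, y, lam=0.1)
```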

Highlights

  • Visual tracking plays an important role in computer vision and has received fast-growing attention in recent years due to its wide range of practical applications

  • We test the performance of the proposed tracker on all 51 sequences used in the visual tracker benchmark [2] and compare it with the top 12 state-of-the-art trackers, including SST [33], JSRFFT [36], DSSM [27], Struck [9], ASLA [30], L1APG [35], MTT [31], LSK [29], VTD [21], TLD [10], and incremental visual tracking (IVT).

  • We propose a novel multi-view structural local subspace tracking algorithm based on sparse representation


Summary

Introduction

Visual tracking plays an important role in computer vision and has received fast-growing attention in recent years due to its wide range of practical applications. The task is to track an unknown target (only a bounding box defining the object of interest in a single frame is given) through an unknown video stream. This problem is especially challenging due to the limited set of training samples and the numerous appearance changes, e.g., rotations, scale changes, occlusions, and deformations.

In the Singer sequence, our tracker and the IVT tracker perform well in tracking the woman, while many other methods drift to the cluttered background or cannot adapt to scale changes when illumination changes occur. This can be attributed to the use of incremental subspace learning, which is able to capture appearance changes caused by lighting variation. In all these 25 sequences, our tracker generally outperforms the other trackers.
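As a rough illustration of the subspace idea mentioned above, the sketch below scores candidate patches by their reconstruction error under a PCA basis learned from past target appearances, which is the core of IVT-style incremental subspace learning. The batch SVD here stands in for a true incremental update, and the patch size, history length, and number of basis vectors are illustrative assumptions.

```python
# Minimal PCA-subspace appearance model: candidates closer to the learned
# subspace (lower reconstruction error) are more target-like.
# Illustrative sketch only; sizes and the batch SVD are assumptions.
import numpy as np

def learn_subspace(patches, k=8):
    """Return the mean and top-k PCA basis of vectorized target patches (columns)."""
    mean = patches.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, U[:, :k]

def reconstruction_error(candidate, mean, basis):
    """Distance of a candidate patch to the learned subspace."""
    centered = candidate - mean.ravel()
    coeff = basis.T @ centered              # project onto the basis
    return np.linalg.norm(centered - basis @ coeff)

rng = np.random.default_rng(1)
history = rng.standard_normal((1024, 30))     # 30 past target patches (32x32, vectorized)
mean, basis = learn_subspace(history, k=8)
candidates = rng.standard_normal((1024, 5))   # 5 candidate patches in the current frame
errors = [reconstruction_error(candidates[:, i], mean, basis) for i in range(5)]
best = int(np.argmin(errors))                 # candidate best explained by the subspace
```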
