Abstract

Recently, L1 tracker has been widely applied and received great success in visual tracking. However, most L1 trackers use only the image intensity for sparse representation, which is insufficient to represent the object especially when drastic appearance changes occur. Convolutional neural network (CNN) has demonstrated remarkable capability in a wide range of computer vision fields, and features extracted from different convolutional layers have different characteristics. In this paper, we propose a novel sparse representation model with convolutional features for visual tracking. Besides, to alleviate the redundancy from high-dimensional convolutional features, a feature selection method is adopted to remove noisy and irrelevant feature maps, which can reduce computation redundancy and improve tracking accuracy. Different from traditional sparse representation based tracking methods, our model not only exploits convolutional features to improve the robustness for describing the object appearance but also uses the trivial templates to model both reconstruction errors caused by sparse representation and the eigen-subspace representation. In addition, an unified objective function is proposed and a customized APG method is developed to effectively solve the optimization problem. Numerous qualitative and quantitative evaluations demonstrate that our tracker outperforms other state-of-the-art trackers in a wide range of tracking scenarios.

Highlights

  • 1 Introduction Visual tracking plays an important role in computer vision and has received fast-growth attention in recent years due to its wide practical application such as pedestrian detection, vehicle navigation, security surveillance, and wireless communication [1,2,3,4]

  • The proposed approach use a novel sparse representation model with convolutional features for visual tracking, which exploits Convolutional neural network (CNN) features to improve the robustness for describing the object appearance and uses the trivial templates to model both reconstruction errors caused by sparse representation and the eigensubspace representation

  • Our algorithm introduces CNN features in describing the target template set

Read more

Summary

Introduction

Visual tracking plays an important role in computer vision and has received fast-growth attention in recent years due to its wide practical application such as pedestrian detection, vehicle navigation, security surveillance, and wireless communication [1,2,3,4]. Kim et al [23] propose a novel structure-preserving sparse learning method, which preserves both local geometrical and discriminative structures within a multi-task feature selection framework Most of these methods mainly aim at improving the tracking accuracy or efficiency, they usually use the image intensity to construct the template set, which is less effective in expressing the structural information of the target, cannot cover severe appearance changes of the target object. To alleviate redundancy of high-dimensional convolutional features, a feature selection method is adopted, which can reduce computation complexity and improve tracking accuracy This strategy makes the model jointly exploit the advantages of the CNN features with more structural information to effectively represent the target, and of both sparse representation and the incremental subspace learning simultaneously.

Proposed model
Optimization and the tracking algorithm
Particle filter tracking framework
Findings
Conclusions
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call