Abstract

Discriminative dictionary learning (DDL) provides an appealing paradigm for appearance modeling in visual tracking. However, most existing DDL-based trackers cannot handle drastic appearance changes, especially in scenarios with background clutter and/or interference from similar objects. One reason is that they often lose subtle visual information, which is critical for distinguishing an object from distracters. In this paper, we explore the use of activations from the convolutional layer of a convolutional neural network to improve the object representation, and then propose a robust distracter-resistive tracker that learns a multi-component discriminative dictionary. The proposed method exploits both intra-class and inter-class visual information to learn shared atoms and class-specific atoms. By imposing several constraints on the objective function, the learned dictionary is made reconstructive, compressive, and discriminative, and can thus better distinguish an object from the background. In addition, our convolutional features carry structural information for object localization and balance the discriminative power and the semantic information of the object. Tracking is carried out within a Bayesian inference framework, where a joint decision measure is used to construct the observation model. To alleviate the drift problem, reliable tracking results obtained online are accumulated to update the dictionary. Both qualitative and quantitative results on the CVPR2013 benchmark, the VOT2015 data set, and the SPOT data set demonstrate that our tracker achieves substantially better overall performance than state-of-the-art approaches.
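To make the role of shared and class-specific atoms concrete, the following is a minimal sketch of how a candidate feature vector could be scored against an already-learned dictionary. It is not the objective function or the joint decision measure of the paper, neither of which is given in this abstract. It assumes the dictionary is split into hypothetical blocks `D_shared`, `D_obj`, and `D_bg`, replaces sparse coding with a simpler ridge-regularized least-squares code, and uses an illustrative exponential of the residual difference as the observation likelihood. All function and parameter names are assumptions made for illustration.

```python
import numpy as np


def ridge_code(D, x, lam=0.1):
    """Code x over dictionary D with an l2 penalty.

    Assumption: this is a stand-in for the sparse coding step of a
    DDL tracker, chosen because it has a closed-form solution.
    """
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)


def class_residual(x, D_shared, D_class, lam=0.1):
    """Reconstruction error of x using shared atoms plus one class block."""
    D = np.hstack([D_shared, D_class])
    a = ridge_code(D, x, lam)
    return float(np.linalg.norm(x - D @ a))


def target_likelihood(x, D_shared, D_obj, D_bg, lam=0.1, sigma=1.0):
    """Illustrative observation score (hypothetical, not the paper's measure).

    The score exceeds 1 when the object block reconstructs x better than
    the background block, and falls below 1 in the opposite case.
    """
    r_obj = class_residual(x, D_shared, D_obj, lam)
    r_bg = class_residual(x, D_shared, D_bg, lam)
    return float(np.exp(-(r_obj - r_bg) / sigma))
```

In a particle-filter style tracker, a score of this kind would be evaluated for each candidate region and the highest-scoring candidate taken as the tracking result; the feature vector `x` would come from the convolutional activations of that region.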
