Abstract
Deep neural networks, despite their great success in feature learning for various computer vision tasks, are usually considered impractical for online visual tracking because they require long training times and large numbers of training samples. In this work, we present an efficient and robust online tracking algorithm that uses a single Convolutional Neural Network (CNN) to learn effective feature representations of the target object over time. Our contributions are threefold. First, we introduce a novel truncated structural loss function that retains as many training samples as possible and, by accommodating the uncertainty of the model output, reduces the risk of tracking-error accumulation and hence drift. Second, we enhance ordinary Stochastic Gradient Descent for CNN training with a temporal selection mechanism that generates positive and negative samples from different time periods. Finally, we propose updating the CNN model in a “lazy” style to speed up the training stage: the network is updated only when a significant appearance change occurs on the object, without sacrificing tracking accuracy. The CNN tracker outperforms all compared state-of-the-art methods in our extensive evaluation on 18 well-known benchmark video sequences.
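To illustrate the truncation idea behind the structural loss, the following is a minimal sketch, not the paper's exact formulation: per-sample errors below a tolerance `beta` (a hypothetical threshold name chosen here) are clipped to zero, so samples the model already handles well, or whose labels are uncertain, contribute no gradient and cannot accumulate drift-inducing updates.

```python
import numpy as np

def truncated_loss(pred, target, beta=0.05):
    """Illustrative truncated squared loss (sketch, not the paper's exact form).

    Per-sample squared residuals smaller than `beta` are truncated to zero,
    so already-well-fit or uncertain samples produce no training signal.
    """
    residual = (np.asarray(pred) - np.asarray(target)) ** 2
    return float(np.maximum(0.0, residual - beta).sum())

# A sample with a small residual is ignored; a large residual still contributes.
small = truncated_loss([0.10], [0.12])   # residual 0.0004 < beta -> 0.0
large = truncated_loss([1.0], [0.0])     # residual 1.0 -> 1.0 - 0.05 = 0.95
```

In practice the truncation threshold trades off robustness against sensitivity: a larger `beta` discards more low-confidence supervision but also ignores small genuine errors.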