Abstract

A robust tracking method is proposed for complex visual sequences. Unlike current deep trackers that rely on time-consuming offline training, we design a simple two-layer online learning network that fuses local convolution features with global handcrafted features to yield a robust representation for visual tracking. Target state estimation is modeled by an adaptive Gaussian mixture, and motion information is used to direct the distribution of candidate samples effectively. An adaptive scale selection scheme is also introduced to avoid including extra background information, and a corresponding object template updating procedure is developed to account for possible occlusion and minor appearance changes. Our tracking method has a light structure and performs favorably against several state-of-the-art methods on challenging scenarios in a recent tracking benchmark data set.
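The feature-fusion idea in the abstract can be sketched as follows. This is a minimal illustration under assumptions, not the paper's implementation: the random filter bank, the intensity-histogram stand-in for a handcrafted descriptor, and plain concatenation are all placeholders for the method's actual local convolution features and global handcrafted features.

```python
import numpy as np

def local_conv_features(patch, kernels):
    """Convolve the patch with a small filter bank (valid mode) and
    max-pool each response map down to a single value."""
    kh, kw = kernels.shape[1:]
    h, w = patch.shape
    feats = []
    for k in kernels:
        resp = np.empty((h - kh + 1, w - kw + 1))
        for i in range(resp.shape[0]):
            for j in range(resp.shape[1]):
                resp[i, j] = np.sum(patch[i:i + kh, j:j + kw] * k)
        feats.append(resp.max())
    return np.array(feats)

def global_hist_features(patch, bins=16):
    """Global handcrafted descriptor: a normalized intensity histogram."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

def fused_representation(patch, kernels):
    """Fuse local convolution features with the global descriptor
    by concatenation into one representation vector."""
    return np.concatenate([local_conv_features(patch, kernels),
                           global_hist_features(patch)])

rng = np.random.default_rng(0)
patch = rng.random((32, 32))              # grayscale target patch in [0, 1)
kernels = rng.standard_normal((8, 5, 5))  # 8 illustrative 5x5 filters
feat = fused_representation(patch, kernels)
print(feat.shape)  # (24,) -> 8 conv responses + 16 histogram bins
```

In an online tracker, such a fused vector would be recomputed per candidate region and scored against the target template; here the two feature families are simply stacked to show the fusion step.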

Highlights

  • Visual tracking is an important topic in computer vision with a wide range of applications, such as video surveillance, automobile navigation, human-computer interfaces, and driverless vehicles [1]

  • Convolutional neural network (CNN) for object recognition and detection has inspired tracking algorithms to employ the discriminative features learned by CNNs [3, 4]

  • To produce stable shared weights, a CNN needs a large number of training samples, which are often unavailable in visual tracking, since only a few reliable positive instances can be extracted from the initial frame


Summary

Introduction

Visual tracking is an important topic in computer vision with a wide range of applications, such as video surveillance, automobile navigation, human-computer interfaces, and driverless vehicles [1]. Since each type of handcrafted feature commonly addresses only a few specific classes of appearance change, such features are not tailored to all generic objects, and sophisticated learning techniques are required to improve their representative capability. These learning methods build models to distinguish the target from the background. Inspired by the success of convolutional neural networks in image classification and object recognition [13,14,15], researchers in the tracking community have begun to focus on deep trackers that exploit the strength of CNNs. These deep trackers come from two aspects; one trend is discriminative convolution trackers (DCT). Online adaptive tracking and an updating mechanism bring optimal estimation for target location and scale selection.
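The motion-directed candidate sampling mentioned above can be illustrated with a toy sketch: candidates are drawn from a Gaussian mixture centered on a motion-predicted location. The constant-velocity prediction, the two-component mixture, and the specific weights and variances are illustrative assumptions, not values from the paper, where the mixture is adapted online.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_candidates(prev_pos, velocity, n=100,
                      weights=(0.7, 0.3), sigmas=(4.0, 12.0)):
    """Draw candidate target positions from a two-component Gaussian
    mixture centered on the motion-predicted location: a tight
    component for smooth motion and a broad one for abrupt movement."""
    predicted = prev_pos + velocity            # constant-velocity prediction
    comps = rng.choice(len(weights), size=n, p=weights)
    noise = np.array([rng.normal(0.0, sigmas[c], size=2) for c in comps])
    return predicted + noise

prev_pos = np.array([120.0, 80.0])   # target center in the previous frame
velocity = np.array([3.0, -1.0])     # estimated inter-frame motion
candidates = sample_candidates(prev_pos, velocity)
print(candidates.shape)  # (100, 2)
```

Concentrating samples around the predicted position keeps the candidate set small while still covering sudden motion through the broad component; each candidate would then be scored with the fused feature representation.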

Related Work
The Tracking Framework
Adaptive Tracking Algorithm
Experimental Results
Conclusion