Abstract

The recent success of deep networks in learning visual trackers relies largely on human-labeled data, which is expensive to annotate. Some unsupervised methods have recently been proposed to learn visual trackers without labeled data, but their performance lags far behind that of supervised methods. We identify the main bottleneck of these methods as the inconsistency between the objectives of the offline training stage and the online tracking stage. To address this problem, we propose a novel unsupervised learning pipeline based on the discriminative correlation filter network. Our method iteratively updates the tracker by alternating between target localization and network optimization. In particular, we propose to learn the network from a single movie, which is easy to obtain compared with collecting thousands of video clips or millions of images. Extensive experiments demonstrate that our approach is insensitive to the choice of movie, and that the trained visual tracker achieves leading performance among existing unsupervised learning approaches. Even compared with the same network trained on human-labeled bounding boxes, our tracker achieves similar results on many tracking benchmarks. Code is available at: https://github.com/ZjjConan/UL-Tracker-AAAI2019.
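
For readers unfamiliar with correlation filters, the alternation between localization and optimization can be illustrated with a minimal sketch. This is not the code from the repository above: it assumes a textbook single-channel correlation filter solved in closed form in the Fourier domain, applied to raw pixels instead of learned network features, and every name in it (`gaussian_label`, `train_filter`, `localize`, `track`, `lam`, `sigma`) is hypothetical.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peaked at index (0, 0), wrapping circularly."""
    h, w = shape
    ys = np.arange(h)
    xs = np.arange(w)
    ys = np.minimum(ys, h - ys)  # circular distance to row 0
    xs = np.minimum(xs, w - xs)  # circular distance to column 0
    return np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2 * sigma ** 2))


def train_filter(patch, label, lam=1e-2):
    """Closed-form ridge-regression filter in the Fourier domain (assumed form)."""
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(label)
    return Y * np.conj(X) / (X * np.conj(X) + lam)


def localize(filter_hat, patch):
    """Return the (dy, dx) shift at which the filter response peaks."""
    response = np.real(np.fft.ifft2(filter_hat * np.fft.fft2(patch)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    h, w = response.shape
    dy, dx = int(dy), int(dx)
    if dy > h // 2:  # map wrapped indices to signed shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx


def track(frames, lam=1e-2):
    """Alternate between localizing the target and refitting the filter."""
    label = gaussian_label(frames[0].shape)
    filter_hat = train_filter(frames[0], label, lam)
    shifts = []
    for frame in frames[1:]:
        dy, dx = localize(filter_hat, frame)               # localization step
        shifts.append((dy, dx))
        aligned = np.roll(frame, (-dy, -dx), axis=(0, 1))  # re-centre the target
        filter_hat = train_filter(aligned, label, lam)     # optimization step
    return shifts
```

In this toy version the "optimization" step only refits the filter on the re-centred patch. In the pipeline described in the abstract, the corresponding step optimizes the network itself, and the details of that step are not given here.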
