Abstract

Many recent visual tracking methods achieve state-of-the-art performance using deep learning. However, most of these trackers apply the deep neural network to either a regression task or a classification task, but not both. In this paper, we propose an adversarial deep tracking framework composed of a fully convolutional Siamese neural network (the regression network) and a discriminative classification network, which we optimize jointly through adversarial learning. Within this unified framework, the two networks can be trained end-to-end as a whole on large video training data sets. During the testing phase, the regression network generates a response map that reflects the location and size of the target within each candidate search patch, and the classification network determines which response map is best given the corresponding template patch and candidate search patch. In addition, we propose an attention visualization algorithm for our tracker that shows the area attracting the tracker's attention during tracking. Experimental results on three large-scale visual tracking benchmarks (OTB-100, TC-128, and VOT2016) demonstrate the effectiveness of the proposed tracking algorithm and show that our tracker performs comparably to state-of-the-art trackers.
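
The testing-phase procedure described above can be illustrated with a minimal sketch. It is not the paper's implementation: plain cross-correlation on small 2-D grids stands in for the Siamese regression network, and the hypothetical `peak_score` function stands in for the learned classification network. All names here (`response_map`, `peak_score`, `select_best`) are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

Grid = List[List[float]]


def response_map(template: Grid, search: Grid) -> Grid:
    """Slide the template over the search patch (valid cross-correlation).

    Assumption: stands in for the Siamese regression network, which in a
    real tracker would correlate learned deep features, not raw values.
    """
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    return [
        [
            sum(template[a][b] * search[i + a][j + b]
                for a in range(th) for b in range(tw))
            for j in range(sw - tw + 1)
        ]
        for i in range(sh - th + 1)
    ]


def peak_score(template: Grid, search: Grid, resp: Grid) -> float:
    """Hypothetical stand-in for the classification network.

    It receives the template patch, the search patch, and the response map,
    as in the abstract, but this toy version only measures how far the
    response peak rises above the mean response.
    """
    flat = [v for row in resp for v in row]
    return max(flat) - sum(flat) / len(flat)


def select_best(
    template: Grid,
    candidates: List[Grid],
    score_fn: Callable[[Grid, Grid, Grid], float] = peak_score,
) -> Tuple[int, Tuple[int, int], Grid]:
    """Return (index of best candidate, peak position in its map, the map)."""
    maps = [response_map(template, c) for c in candidates]
    scores = [score_fn(template, c, m) for c, m in zip(candidates, maps)]
    best = max(range(len(maps)), key=scores.__getitem__)
    resp = maps[best]
    peak = max(
        ((i, j) for i in range(len(resp)) for j in range(len(resp[0]))),
        key=lambda p: resp[p[0]][p[1]],
    )
    return best, peak, resp
```

Calling `select_best(template, [patch_a, patch_b])` returns the index of the candidate search patch whose response map scores highest, together with the peak position in that map, which a tracker would translate into the target location. In the proposed framework both stages are learned networks trained jointly, so the fixed scoring rule here only marks where the classification network would plug in.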
