Abstract

Recent years have witnessed significant improvements in ensemble trackers built on independent models. However, existing ensemble trackers only combine the responses of the independent models and pay little attention to the learning process, which limits further performance gains. To this end, we propose an interactive learning framework that strengthens the independent models during learning. Specifically, in the interactive network, we force the convolutional filter models to interact with each other by sharing their responses during learning. This interaction mines hard samples and prevents easy samples from overwhelming the models, which improves their discriminative capacity. In addition, to locate the target more accurately, we develop a fusion mechanism based on the confidences of the independent predictions. We evaluate the proposed method on six public datasets: OTB-2013, OTB-2015, VOT-2016, VOT-2017, Temple-Color-128, and LaSOT. Comprehensive experimental results show that the proposed algorithm performs favorably against state-of-the-art methods.
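
To make the idea of confidence-based fusion concrete, the following is a minimal sketch rather than the method used in the paper. The abstract does not specify how confidence is measured or how predictions are combined, so the confidence measure (peak response minus mean response), the normalized weighting, and the function names `peak_confidence`, `fuse_responses`, and `locate_target` are all assumptions made for illustration.

```python
# Illustrative sketch only: the confidence measure and the weighting rule
# below are assumptions, not the fusion mechanism defined in the paper.

def peak_confidence(response):
    """Hypothetical confidence: how far the peak stands above the mean response."""
    values = [v for row in response for v in row]
    return max(values) - sum(values) / len(values)


def fuse_responses(responses):
    """Combine same-sized response maps, weighting each by its confidence."""
    confidences = [max(peak_confidence(r), 0.0) for r in responses]
    total = sum(confidences)
    if total == 0.0:
        # All maps are flat: fall back to uniform weights.
        weights = [1.0 / len(responses)] * len(responses)
    else:
        weights = [c / total for c in confidences]

    rows, cols = len(responses[0]), len(responses[0][0])
    fused = [
        [sum(w * r[i][j] for w, r in zip(weights, responses)) for j in range(cols)]
        for i in range(rows)
    ]
    return fused, weights


def locate_target(fused):
    """Return the (row, col) position of the highest fused response."""
    best_value, best_pos = None, None
    for i, row in enumerate(fused):
        for j, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value, best_pos = value, (i, j)
    return best_pos


# Toy example: one model gives a sharp peak, the other a nearly flat map.
sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
flat = [[0.3, 0.3, 0.3], [0.3, 0.3, 0.4], [0.3, 0.3, 0.3]]

fused_map, model_weights = fuse_responses([sharp, flat])
target_position = locate_target(fused_map)
```

In this toy case the sharply peaked map receives most of the weight, so the fused location follows the more confident model rather than the ambiguous one.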
