Visual object tracking with adaptive structural convolutional network

Di Yuan, Xin Li, Zhenyu He, Qiao Liu, Shuwei Lu

doi:10.1016/j.knosys.2020.105554

Open Access

https://doi.org/10.1016/j.knosys.2020.105554

Abstract

Convolutional Neural Networks (CNNs) have been demonstrated to achieve state-of-the-art performance in visual object tracking. However, existing CNN-based trackers usually train their networks on holistic target samples. When the target undergoes challenging conditions (e.g., occlusion, background clutter, and deformation), tracking performance degrades badly. In this paper, we propose an adaptive structural convolutional filter model to enhance the robustness of deep regression trackers; the resulting tracker is named ASCT. Specifically, we first design a mask set to generate local filters that capture local structures of the target. We then adopt an adaptive weighting fusion strategy for these local filters to adapt to changes in target appearance, which effectively enhances the robustness of the tracker. In addition, we develop an end-to-end trainable network comprising feature extraction, decision making, and model updating modules for effective training. Extensive experimental results on large benchmark datasets demonstrate that the proposed ASCT tracker performs favorably against state-of-the-art trackers.
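The abstract does not give the mask layout or the weighting rule, so the following is only an illustrative sketch of the general idea of masked local filters with weighted fusion, not the ASCT implementation. It assumes a hypothetical mask set of four half-filter regions (top, bottom, left, right), a single-channel feature map, and a softmax over each local response peak as the adaptive weight. All function names are made up for illustration.

```python
import math

def make_masks(h, w):
    """Hypothetical mask set: binary masks for four halves of an h x w filter."""
    return {
        "top": [[1.0 if r < h / 2 else 0.0 for _ in range(w)] for r in range(h)],
        "bottom": [[1.0 if r >= h / 2 else 0.0 for _ in range(w)] for r in range(h)],
        "left": [[1.0 if c < w / 2 else 0.0 for c in range(w)] for _ in range(h)],
        "right": [[1.0 if c >= w / 2 else 0.0 for c in range(w)] for _ in range(h)],
    }

def apply_mask(filt, mask):
    """Element-wise product: keeps only the filter weights inside the mask."""
    return [[f * m for f, m in zip(frow, mrow)] for frow, mrow in zip(filt, mask)]

def correlate(feature, filt):
    """Valid-mode 2D cross-correlation of a feature map with a filter."""
    fh, fw = len(filt), len(filt[0])
    out_h, out_w = len(feature) - fh + 1, len(feature[0]) - fw + 1
    return [[sum(feature[r + i][c + j] * filt[i][j]
                 for i in range(fh) for j in range(fw))
             for c in range(out_w)] for r in range(out_h)]

def adaptive_weights(responses):
    """Assumed weighting rule: softmax over the peak of each local response."""
    peaks = [max(max(row) for row in resp) for resp in responses]
    top = max(peaks)  # subtract the largest peak for numerical stability
    exps = [math.exp(p - top) for p in peaks]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(responses, weights):
    """Weighted sum of the local response maps."""
    rows, cols = len(responses[0]), len(responses[0][0])
    return [[sum(wt * resp[r][c] for wt, resp in zip(weights, responses))
             for c in range(cols)] for r in range(rows)]

def track_step(feature, base_filter):
    """One localization step: mask, correlate, weight, fuse, take the peak."""
    masks = make_masks(len(base_filter), len(base_filter[0]))
    local_filters = [apply_mask(base_filter, m) for m in masks.values()]
    responses = [correlate(feature, f) for f in local_filters]
    weights = adaptive_weights(responses)
    fused = fuse(responses, weights)
    positions = [(r, c) for r in range(len(fused)) for c in range(len(fused[0]))]
    best = max(positions, key=lambda rc: fused[rc[0]][rc[1]])
    return fused, weights, best

# Toy input: a 3 x 3 bright block centered in a 5 x 5 feature map.
feature = [[1.0 if 1 <= r <= 3 and 1 <= c <= 3 else 0.0 for c in range(5)]
           for r in range(5)]
base_filter = [[1.0] * 3 for _ in range(3)]
fused, weights, best = track_step(feature, base_filter)
```

In this sketch a local filter whose region is occluded would yield a lower response peak and therefore a smaller fusion weight, which is one plausible way to read "adaptive weighting"; a trained tracker would learn both the filters and the weights rather than fix them by hand.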

Full Text