Abstract

Convolutional neural networks offer strong representational power and have been widely applied to visual tracking. However, simply deepening the network to boost performance is inappropriate for tracking because of its speed requirement. We leverage spatial attention and channel attention to enhance object features without much extra computational cost. The fused spatial-channel attention enables the network to extract discriminative and robust features of targets and background. Furthermore, we propose an inter-instance loss that makes our tracker aware not only of target-background classification but also of instance classification across multiple domains. Extensive experiments on the Object Tracking Benchmark (OTB) show that the proposed tracker achieves an Area-Under-Curve (AUC) score of 66.8% on OTB2015, outperforming most state-of-the-art trackers.
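
The abstract does not spell out how the spatial and channel attentions are fused, so the following is only a minimal sketch of one common way such a fused spatial-channel attention block can refine a backbone feature map with little extra cost. The module name, reduction ratio, and layer sizes are illustrative assumptions, not the authors' implementation; the inter-instance loss is not shown here.

```python
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    """Illustrative fused spatial-channel attention block (hypothetical design)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatial dims, then excite per-channel weights.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: compress channels into a per-location weight map.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Channel attention weights, shape (N, C, 1, 1), applied first.
        x = x * self.channel_gate(x)
        # Spatial attention from channel-wise average and max maps, shape (N, 1, H, W).
        avg_map = x.mean(dim=1, keepdim=True)
        max_map, _ = x.max(dim=1, keepdim=True)
        return x * self.spatial_gate(torch.cat([avg_map, max_map], dim=1))

# Example: refine a small backbone feature map before classification.
feat = torch.randn(1, 512, 7, 7)
refined = SpatialChannelAttention(512)(feat)
print(refined.shape)  # torch.Size([1, 512, 7, 7])
```

Because both gates are built from a handful of 1x1 and 7x7 convolutions, the added computation stays small relative to the backbone, which is the trade-off the abstract emphasizes.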
