Two-stage aware attentional Siamese network for visual tracking

Xinglong Sun,Guangliang Han,Lihong Guo,Hang Yang,Xiaotian Wu,Qingqing Li

doi:10.1016/j.patcog.2021.108502

Abstract

Siamese networks have achieved great success in visual tracking with the advantages of speed and accuracy. However, how to track an object precisely and robustly still remains challenging. One reason is that multiple types of features are required to achieve good precision and robustness, which are unattainable by a single training phase. Moreover, Siamese networks usually struggle with online adaption problem. In this paper, we present a novel two-stage aware attentional Siamese network for tracking (Ta-ASiam). Concretely, we first propose a position-aware and an appearance-aware training strategy to optimize different layers of Siamese network. By introducing diverse training patterns, two types of required features can be captured simultaneously. Then, following the rule of feature distribution, an effective feature selection module is constructed by combining both channel and spatial attention networks to adapt to rapid appearance changes of the object. Extensive experiments on various latest benchmarks have well demonstrated the effectiveness of our method, which significantly outperforms state-of-the-art trackers.

Full Text