Abstract

Most Siamese trackers decompose tracking into two branches: classification and bounding box regression. They select as the target the bounding box with the highest classification score, or the highest combination of classification and predicted localization scores. However, misalignment between classification confidence and localization accuracy degrades tracking performance. In this paper, we propose an IoU-aware Siamese tracker named IASNet, which predicts an IoU classification score for each regressed box to represent its localization confidence. We introduce a novel residual alignment module that captures the scale of each predicted bounding box and its border context, which further improves the reliability of the classification prediction. In addition, we build dynamic links between the classification and regression branches to enhance their interaction. The proposed dynamic loss focuses training on high-quality positive samples, leading to more precise localization. Extensive experiments on VOT2016, VOT2019, OTB100, GOT10k, UAV123, and LaSOT show that IASNet achieves state-of-the-art tracking performance and runs at approximately 65 fps on an RTX 3090 GPU, well above the real-time requirement.
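
To make the misalignment problem concrete, the following is a minimal sketch, not IASNet's implementation. It assumes boxes in (x1, y1, x2, y2) corner format, and the names `iou` and `select_index` are hypothetical helpers introduced only for illustration. The true IoU with the ground truth stands in for the IoU score that an IoU-aware branch would be trained to predict.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corners.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_index(scores: List[float]) -> int:
    """Index of the highest-scoring candidate."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical example: two candidate boxes for one ground-truth target.
ground_truth: Box = (0.0, 0.0, 10.0, 10.0)
candidates: List[Box] = [
    (4.0, 4.0, 14.0, 14.0),  # poorly localized
    (1.0, 1.0, 11.0, 11.0),  # well localized
]
# Made-up classification scores that are misaligned with localization quality.
cls_scores = [0.9, 0.8]
# Stand-in for predicted IoU scores: the true IoU with the ground truth.
iou_scores = [iou(box, ground_truth) for box in candidates]

picked_by_cls = select_index(cls_scores)
picked_by_iou = select_index(iou_scores)
```

In this toy case, ranking by classification score picks the first candidate even though it overlaps the target far less than the second, while ranking by the IoU-based score picks the better-localized box.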
