Abstract

Variation in scale and aspect ratio is one of the main challenges in visual tracking. To overcome it, most existing methods adopt either multi-scale search or anchor-based schemes, which rely on a handcrafted, predefined search space that limits their performance in complicated scenes. Recent anchor-free trackers avoid prior scale or anchor information, but an inconsistency between classification and regression degrades their tracking performance. To address these issues, we propose a simple yet effective tracker, the Siamese Box Adaptive Network (SiamBAN), which learns a target-aware scale handling scheme in a data-driven manner. Our basic idea is to predict target boxes in a per-pixel fashion through a fully convolutional network, without anchors. Specifically, SiamBAN divides the tracking problem into classification and regression tasks, which directly predict objectness and regress bounding boxes, respectively. A no-prior box design avoids tuning hyper-parameters related to candidate boxes, which makes SiamBAN more flexible. SiamBAN further uses a target-aware branch to address the inconsistency problem. Experiments on benchmarks including VOT2018, VOT2019, OTB100, UAV123, LaSOT and TrackingNet show that SiamBAN achieves promising performance and runs at 35 FPS.
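
To make the per-pixel, anchor-free idea concrete, the following is a minimal sketch of how a box could be decoded from a classification map and a regression map. It is not the SiamBAN implementation: the function name `decode_best_box`, the `stride` and `origin` parameters, and the encoding of each regression output as distances (left, top, right, bottom) from the pixel's location to the box sides are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image coordinates


def decode_best_box(
    scores: Sequence[Sequence[float]],
    offsets: Sequence[Sequence[Tuple[float, float, float, float]]],
    stride: float = 8.0,   # hypothetical spacing between map locations, in pixels
    origin: float = 0.0,   # hypothetical image coordinate of map location (0, 0)
) -> Tuple[float, Box]:
    """Return the score and box of the highest-scoring map location.

    scores[i][j]  : objectness score at row i, column j (classification task).
    offsets[i][j] : assumed (left, top, right, bottom) distances from that
                    location to the box sides (regression task).
    """
    best_score = float("-inf")
    best_box = None
    for i, row in enumerate(scores):
        for j, score in enumerate(row):
            if score > best_score:
                # Map the grid location back to image coordinates.
                cx = origin + j * stride
                cy = origin + i * stride
                left, top, right, bottom = offsets[i][j]
                # No anchor box is involved: the box comes directly from
                # the location and its four regressed distances.
                best_box = (cx - left, cy - top, cx + right, cy + bottom)
                best_score = score
    if best_box is None:
        raise ValueError("score map is empty")
    return best_score, best_box


# Toy 2x2 maps; the values are made up purely for illustration.
toy_scores: List[List[float]] = [[0.1, 0.2], [0.9, 0.3]]
toy_offsets = [
    [(1.0, 1.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)],
    [(2.0, 3.0, 4.0, 5.0), (1.0, 1.0, 1.0, 1.0)],
]
best_score, best_box = decode_best_box(toy_scores, toy_offsets, stride=8.0, origin=4.0)
```

Because every location proposes its own box, no candidate scales or aspect ratios have to be chosen in advance, which is the property the no-prior box design relies on.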
