Abstract

Trackers based on Siamese networks have achieved strong performance in recent years. However, most existing Siamese single-object trackers consider only the spatial information in the template given in the first frame of the video and do not exploit the rich temporal information available in subsequent frames. In this paper, we propose a novel tracking framework based on a spatial-temporal network. Specifically, we introduce three-way decision theory into object tracking to avoid interference from complex situations such as occlusion, fast motion, and non-rigid deformation. Furthermore, our proposed method generates more precise tracking results through the use of discriminative correlation filters (DCF). Extensive tests and comparisons with numerous competitive trackers on demanding large-scale benchmarks, including OTB-2015, GOT-10k, LaSOT, VOT2018, and TrackingNet, demonstrate that our tracker outperforms many state-of-the-art real-time techniques while operating at 22 frames per second.
