Capsule-based Regression Tracking via Background Inpainting.

Ding Ma, Xiangqian Wu

doi:10.1109/tip.2023.3269229

Abstract

Background cues play an auxiliary role in most regression trackers, which directly learn a mapping from dense samples to soft labels within a given search area. In essence, these trackers must identify a large amount of background information (i.e., other objects and distractor objects) under extreme target-background data imbalance. We therefore argue that it is more worthwhile to perform regression tracking based primarily on informative background cues, with target cues as a supplement. To this end, we propose a capsule-based approach, referred to as CapsuleBI, which performs regression tracking with a background inpainting network and a target-aware network. The background inpainting network explores background representations by restoring the target region using all available scenes, while the target-aware network captures target representations by focusing only on the target itself. To explore the subjects/distractors in the whole scene, we propose a global-guided feature construction module, which enhances local features with global information. Both the background and the target are encoded in capsules, which can model the relationships between objects or object parts in the background scene. In addition, the target-aware network assists the background inpainting network through a novel background-target routing algorithm, which guides the background and target capsules to estimate the target location precisely using multi-video relationship information. Extensive experimental results show that the proposed tracker performs favorably against state-of-the-art methods.

Full Text