LTST: Long-term segmentation tracker with memory attention network

Lang Yu, Baojun Qiao, Huanlong Zhang, Junyang Yu, Xin He

doi:10.1016/j.imavis.2022.104374

Abstract

Recent interest in combining visual object tracking (VOT) and video object segmentation (VOS) has yielded rapid progress. However, existing segmentation methods still rely on a target model created in the first frame, and therefore lack long-term adaptability. To overcome this limitation, we propose LTST, a novel long-term segmentation tracker that leverages a memory attention network to achieve the effect of online learning without additional training. Specifically, we first combine a discriminative correlation filter with a matching-based paradigm for the segmentation task. We then develop a memory attention network based on a partial cost volume to extract relevant historical information and dynamically update the segmentation template. Moreover, we extend LTST to long-term tracking by introducing a multi-scale verification network to identify tracking failures and a global detector to re-locate the lost target. Experimental results on the VOT-LT2018, VOT-LT2019, LaSOT and TLP benchmarks show that the proposed tracker performs comparably to state-of-the-art long-term tracking algorithms.
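
The long-term extension described above (local tracking, verification of each result, and global re-detection after a failure) can be illustrated with a minimal control-loop sketch. This is not the LTST implementation: the callables `track_local`, `verify` and `detect_global`, the `Box` layout and the `threshold` value are all hypothetical placeholders standing in for the segmentation tracker, the multi-scale verification network and the global detector.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical box layout: (x, y, width, height).
Box = Tuple[float, float, float, float]


def run_long_term_tracker(
    frames: Iterable[object],
    init_box: Box,
    track_local: Callable[[object, Box], Box],
    verify: Callable[[object, Box], float],
    detect_global: Callable[[object], Optional[Box]],
    threshold: float = 0.5,  # assumed confidence cut-off, not from the paper
) -> List[Optional[Box]]:
    """Return one box per frame, or None where the target is judged absent."""
    last_box = init_box
    lost = False
    outputs: List[Optional[Box]] = []
    for frame in frames:
        # While the target is lost, search the whole frame; otherwise
        # track locally around the last verified position.
        if lost:
            candidate = detect_global(frame)
        else:
            candidate = track_local(frame, last_box)

        # Accept the candidate only if the verifier is confident enough.
        if candidate is not None and verify(frame, candidate) >= threshold:
            last_box = candidate
            lost = False
            outputs.append(candidate)
        else:
            lost = True
            outputs.append(None)
    return outputs
```

In this sketch a rejected candidate switches the loop into re-detection mode, and local tracking resumes only after a globally detected candidate passes verification. How LTST itself schedules these components is specified in the full text, not here.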

Full Text