Consistency-based self-supervised visual tracking by using query-communication transformer

Wenjun Zhu, Li Xu, Jun Meng

doi:10.1016/j.knosys.2023.110849

Abstract

Self-supervised learning (SSL) performs remarkably well in visual tracking because it extracts general representations from unlabeled data and alleviates the need for expensive human annotations. SSL models usually achieve frame-to-frame communication during training by predicting the object location in each intermediate frame; however, prediction errors may accumulate and mislead the forward–backward tracking procedure. This work proposes a novel query-communication transformer (QCT) architecture that enables more reliable frame-to-frame communication by propagating query information, thereby avoiding the above-mentioned tracking errors on intermediate frames. Specifically, we introduce the transformer into self-supervised tracking to handle the object template and search frames: the encoder encodes the spatio-temporal context of the template and search frames, while the decoder takes the query embedding of the previous frame to retrieve the template object information from the encoder output. To further enhance the query embedding, a query interaction module is devised to promote information passing between frames. Moreover, we employ inter-frame correspondence and intra-frame correspondence to construct different views and transformations for better learning the representation from palindromic sequences. We validate our method on seven challenging benchmarks. The results demonstrate considerable improvements over recent self-supervised algorithms and even some fully-supervised deep trackers.
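
The idea of passing a query embedding from frame to frame over a palindromic sequence, instead of predicting a box in every intermediate frame, can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`palindrome`, `cross_attention`, `propagate_query`, `consistency_loss`), the single-head dot-product attention without learned projections, and the mean-squared consistency measure are all assumptions chosen for illustration, and the "encoder output" is stood in for by plain lists of token vectors.

```python
import math


def palindrome(frames):
    """Forward order followed by the reverse order, e.g. [a, b, c] -> [a, b, c, b, a]."""
    return frames + frames[-2::-1]


def cross_attention(query, memory):
    """Single-head dot-product attention of one query vector over memory tokens.

    Hypothetical stand-in for a transformer decoder layer (no learned weights).
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, tok)) / math.sqrt(d) for tok in memory]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted sum of the memory tokens becomes the updated query.
    return [sum(w * tok[i] for w, tok in zip(weights, memory)) for i in range(d)]


def propagate_query(init_query, encoded_frames):
    """Carry the query through every frame; no intermediate box is predicted."""
    query = init_query
    for memory in encoded_frames:
        query = cross_attention(query, memory)
    return query


def consistency_loss(start, end):
    """Mean squared difference between two embeddings (assumed loss form)."""
    return sum((a - b) ** 2 for a, b in zip(start, end)) / len(start)


# Toy usage: three "encoded frames", each a list of 2-D tokens.
frames = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.8, 0.2], [0.2, 0.8]],
]
sequence = palindrome(frames)          # forward pass, then back to the first frame
start_query = [1.0, 0.0]               # hypothetical template query
end_query = propagate_query(start_query, sequence)
loss = consistency_loss(start_query, end_query)
```

In this sketch the only supervision signal is how far the query has drifted after the round trip, which mirrors the forward–backward consistency idea in the abstract without relying on per-frame location predictions.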


Journal: Knowledge-Based Systems
Publication Date: Jul 28, 2023
Citations: 3

