Abstract
The current paradigm of joint detection and tracking still requires a large amount of instance-level trajectory annotation, which incurs high annotation costs. Moreover, treating embedding training as a classification problem leads to difficulties in model fitting. In this paper, we propose a new self-supervised multi-object tracking method based on the real-time joint detection and embedding (JDE) framework, which we term self-supervised multi-object tracking (SS-MOT). In SS-MOT, the short-term temporal correlations between objects within and across adjacent video frames are both used as self-supervised constraints: the distances between different objects are enlarged, while the distances between instances of the same object in adjacent frames are reduced. In addition, short trajectories are formed by matching detections across pairs of adjacent frames with a matching algorithm, and these matched pairs are treated as positive samples. The distances between positive samples are then minimized to further improve the feature representation of the same object. Our method can therefore be trained on videos without instance-level annotations. We apply our approach to state-of-the-art JDE models, such as FairMOT, CSTrack, and SiamMOT, and achieve results comparable to these supervised methods on the widely used MOT17 and MOT20 challenges.
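The sketch below illustrates, under stated assumptions, the kind of self-supervised embedding objective the abstract describes: embeddings of the same object in two adjacent frames (positive pairs, e.g. obtained by a matching algorithm such as the Hungarian algorithm on detections) are pulled together, while embeddings of different objects act as negatives and are pushed apart. The function name `self_supervised_embedding_loss`, the InfoNCE-style formulation, and the temperature value are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F


def self_supervised_embedding_loss(emb_t, emb_t1, temperature=0.1):
    """Contrastive loss over matched embeddings of two adjacent frames.

    emb_t, emb_t1: (N, D) tensors; row i of emb_t and row i of emb_t1 are
    assumed to be the same object (a positive pair produced by matching
    detections across the two frames).
    """
    emb_t = F.normalize(emb_t, dim=1)
    emb_t1 = F.normalize(emb_t1, dim=1)

    # Similarity of every object in frame t to every object in frame t+1.
    logits = emb_t @ emb_t1.t() / temperature  # (N, N)

    # Diagonal entries are the positive pairs (same object, adjacent frames);
    # off-diagonal entries correspond to different objects and serve as negatives.
    targets = torch.arange(emb_t.size(0), device=emb_t.device)

    # Symmetric cross-entropy: frame t -> t+1 and frame t+1 -> t.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    # Toy usage: 8 matched objects with 128-d appearance embeddings.
    e_t = torch.randn(8, 128)
    e_t1 = e_t + 0.05 * torch.randn(8, 128)  # slight appearance change between frames
    print(self_supervised_embedding_loss(e_t, e_t1).item())
```

Minimizing such a loss requires only the matched pairs from adjacent frames, not instance-level trajectory annotations, which is consistent with the training setting the abstract describes.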