Online and real-time mask-guided multi-person tracking and segmentation

Jin Seong

doi:10.1016/j.patrec.2023.06.001

Abstract

Applications of multi-object tracking and segmentation (MOTS) include autonomous driving and video surveillance. In these applications, the tracking system must operate online and in real time. Recent studies have actively sought to improve MOTS tracking performance; however, most do not consider online and real-time operation. This paper proposes a MOTS system that tracks online and runs in real time by using a deep learning–based model with a lightweight backbone. In addition, state-of-the-art trackers extract re-identification (ReID) features from an object’s bounding box, which includes unnecessary background. This background makes it harder to represent the object’s appearance accurately and, in turn, makes matching difficult. To address this, instead of feeding bounding-box features to the ReID branch, we feed it the object’s mask features so that the representation focuses on the object itself. In our experiments, the mask-based ReID achieved higher association accuracy than the existing bounding box–based ReID model, and on the KITTI MOTS benchmark the system ranked second among models that can operate online. Our experiments show that background information causes ambiguous ReID matching in MOTS systems, and that object mask information is important for avoiding it.
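
The difference between box-based and mask-based ReID features can be illustrated with a minimal sketch. It is not the paper's learned ReID branch: it assumes a toy two-channel feature map and uses plain average pooling in NumPy, and the function names (`box_pooled_feature`, `mask_pooled_feature`, `cosine_similarity`) are hypothetical. It only shows how background pixels inside a box dilute an appearance descriptor, while pooling over the mask does not.

```python
import numpy as np


def box_pooled_feature(feature_map, box):
    """Average features over a box (x1, y1, x2, y2), background included.

    feature_map has shape (C, H, W); x2 and y2 are exclusive.
    """
    x1, y1, x2, y2 = box
    region = feature_map[:, y1:y2, x1:x2]
    return region.mean(axis=(1, 2))


def mask_pooled_feature(feature_map, mask, eps=1e-8):
    """Average features only over pixels where the binary mask is 1."""
    mask = mask.astype(feature_map.dtype)
    weighted = feature_map * mask[None, :, :]
    return weighted.sum(axis=(1, 2)) / (mask.sum() + eps)


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two 1-D feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


# Toy scene (an assumption for illustration): channel 0 responds to the
# object, channel 1 responds to the background.
feature_map = np.zeros((2, 4, 4), dtype=np.float64)
feature_map[1] = 1.0                 # background everywhere
feature_map[0, 1:3, 1:3] = 1.0       # object occupies the 2x2 centre
feature_map[1, 1:3, 1:3] = 0.0

mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1                   # segmentation mask of the object
box = (0, 0, 4, 4)                   # loose bounding box around the object

object_template = np.array([1.0, 0.0])  # "pure" object appearance

box_feat = box_pooled_feature(feature_map, box)
mask_feat = mask_pooled_feature(feature_map, mask)

print("box similarity: ", cosine_similarity(box_feat, object_template))
print("mask similarity:", cosine_similarity(mask_feat, object_template))
```

In this toy case the box descriptor is dominated by the background channel, so its similarity to the object template is much lower than that of the mask descriptor. A real system would pool features from a learned backbone rather than a hand-built map.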

Full Text