Abstract

Multi-object tracking and segmentation (MOTS) is a derivative task of multi-object tracking (MOT). The new setting encourages the learning of more discriminative, high-quality embeddings. In this paper, we explore the relationship between the segmenter and the tracker and, based on it, propose an efficient Object Point set Inductive Tracker (OPITrack). First, we find that after a single attention layer, the high-dimensional key-point embeddings exhibit feature averaging. To alleviate this phenomenon, we propose an embedding generalization training strategy of sparse training and dense testing. This strategy increases randomness during training and encourages the tracker to learn more discriminative features. In addition, to learn the desired embedding space, we propose a general Trip-hard sample augmentation loss. The loss brings patches that the segmenter cannot distinguish into feature learning, forcing the embedding network to learn the difference between false positives and true positives. Our method was validated on two MOTS benchmark datasets and achieved promising results. In addition, OPITrack can achieve better performance for the raw model while using less video memory (VRAM) at training time.
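The abstract does not say how sparse training and dense testing is implemented. As a rough illustration only, the sketch below assumes it means drawing a random subset of an object's points during training and using every point at test time. The function name `select_points`, the `keep_ratio` parameter, and the uniform sampling are all hypothetical and are not taken from the paper.

```python
import random


def select_points(points, training, keep_ratio=0.25, rng=None):
    """Hypothetical point selection for sparse training / dense testing.

    Assumption: "sparse" means a uniform random subset of the object's
    points, and "dense" means all of them. The paper may do otherwise.
    """
    points = list(points)
    if not training:
        # Dense testing: keep every point.
        return points
    rng = rng or random.Random()
    # Sparse training: keep a random fraction, at least one point if any exist.
    k = min(len(points), max(1, int(len(points) * keep_ratio)))
    return rng.sample(points, k)


# Usage: a 10x10 grid of pixel coordinates standing in for an object mask.
grid = [(x, y) for x in range(10) for y in range(10)]
train_points = select_points(grid, training=True, rng=random.Random(0))
test_points = select_points(grid, training=False)
```

Because a different subset is drawn on each call in training mode, the embedding network would see a varying view of the same object, which is one plausible reading of the added randomness the abstract mentions.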
