Abstract

Appearance information is one of the most important matching cues for multi-object data association. In tracking-by-detection models, appearance information and detection information are usually learned and output by the same sub-network. This couples the appearance embedding vectors to the network's inference method during the learning stage. As a result, the appearance embedding vectors contain too much background information, which degrades the accuracy of data association. Building on a non-end-to-end tracking model, we design an appearance guidance attention module for the appearance extraction branch. This module strengthens the network's learning of the objects' visual appearance features and reduces the attention paid to background features. Finally, we use the appearance embedding vectors, now decoupled from the inference method, as the input to the back-end tracker, which performs data association. The proposed method is tested on the MOT16 and MOT17 datasets. Experiments show that it provides higher-quality appearance representations to the back-end tracker and that its tracking performance on both datasets is better than that of the other models compared. Our model also reaches 24.9 FPS on a single 1080Ti GPU. The code is available at: https://github.com/JoJoliking/AGM
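
The abstract does not state how the back-end tracker matches embeddings, so the following is only a minimal sketch of appearance-based data association under stated assumptions. It uses cosine distance between embedding vectors and a simple greedy matcher in place of the optimal assignment step that trackers commonly use. The names `cosine_distance`, `associate`, and the threshold `max_dist` are hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: treat a zero vector as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def associate(track_embs, det_embs, max_dist=0.4):
    """Greedily pair tracks with detections by appearance distance.

    Assumption: greedy matching stands in for an optimal assignment
    solver; max_dist is an illustrative gate, not a value from the paper.
    """
    # All (distance, track index, detection index) triples, closest first.
    pairs = sorted(
        (cosine_distance(t, d), ti, di)
        for ti, t in enumerate(track_embs)
        for di, d in enumerate(det_embs)
    )
    matches = []
    used_tracks, used_dets = set(), set()
    for dist, ti, di in pairs:
        if dist > max_dist:
            break  # every remaining pair is even farther apart
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    unmatched_tracks = [i for i in range(len(track_embs)) if i not in used_tracks]
    unmatched_dets = [i for i in range(len(det_embs)) if i not in used_dets]
    return matches, unmatched_tracks, unmatched_dets
```

In this sketch, an embedding that carries a lot of background information would sit closer to the embeddings of other objects in similar surroundings, which makes wrong pairs more likely to pass the distance gate. That is the failure mode the attention module is meant to reduce.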
