Abstract

AbstractMulti-object tracking (MOT) is a task to identify objects in videos, however, objects with similar appearance or occlusion may cause frequent ID switching, which is the main challenge of current MOT. In this paper, we propose a novel self-cross graph neural network-based multi-object tracking method, which we termed as SCGTracker. This method seamlessly integrates object detection and tracking through graph neural networks, building upon the foundation of the JDE paradigm. Specifically, we construct graph structures to capture the correlation between objects in both spatial and temporal dimensions. To further tackle the frequent ID switching problem, we employ an attention mechanism to aggregate object context information within the same frame and across different frames, updating the object information via graph neural networks to derive highly distinctive appearance features. Ultimately, the obtained strongly distinguishable object appearance features serve to mitigate the issue of frequent object ID switches. In experiments conducted on the MOT17 test set, our proposed method yields promising results, achieving a 73% Multiple Object Tracking Accuracy (MOTA) and a 73.2% ID F1 score. Furthermore, it demonstrates a substantial reduction in ID switches compared with state-of-the-art methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call