Abstract

The popular tracking-by-detection paradigm of multi-object tracking (MOT) takes the detections in each frame as input and associates them from one frame to the next. Association methods based on relative motion have attracted attention because they suppress the effect of noisy detections and improve MOT performance. However, these methods depend only on the immediately previous frame, which can easily lead to inaccurate matches and large accumulated errors. Furthermore, they do not fully exploit the multiple objects involved in occlusions, which aggravates the inaccurate matches. Motivated by these issues, we introduce a pivot to represent each object and propose a novel pivot association network (PANet) for the MOT task. Specifically, pivots are learned from spatial semantic and historical contextual clues, which alleviates the dependency on the immediately previous frame. Our online tracker, PANet, employs pivots and a lightweight associator to localize the tracklets of objects; by learning the correlation responses between pivots and spatial search areas, it can suppress noisy detections and improve the accuracy of tracklet prediction. Extensive experiments on the 2D MOT15, MOT16, MOT17, and MOT20 benchmarks demonstrate the effectiveness of the proposed method against numerous state-of-the-art MOT trackers.
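
To make the idea of a correlation response between a pivot and a spatial search area concrete, the following is a minimal sketch and not the PANet implementation. It assumes a pivot can be treated as a small 2D feature template, that history is folded in with a simple exponential moving average, and that the response is a plain sliding-window cross-correlation; the abstract specifies none of these details. The names `update_pivot`, `correlation_response`, and `localize` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]


def update_pivot(pivot: Grid, template: Grid, momentum: float = 0.9) -> Grid:
    """Blend the newest template into the pivot.

    Assumption: an exponential moving average stands in for the learned
    use of historical context described in the abstract.
    """
    return [
        [momentum * p + (1.0 - momentum) * t for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(pivot, template)
    ]


def correlation_response(pivot: Grid, search: Grid) -> Grid:
    """Slide the pivot over the search area and sum elementwise products.

    Assumption: unnormalized cross-correlation; a learned associator
    would replace this hand-written similarity.
    """
    ph, pw = len(pivot), len(pivot[0])
    sh, sw = len(search), len(search[0])
    response: Grid = []
    for y in range(sh - ph + 1):
        row = []
        for x in range(sw - pw + 1):
            score = sum(
                pivot[i][j] * search[y + i][x + j]
                for i in range(ph)
                for j in range(pw)
            )
            row.append(score)
        response.append(row)
    return response


def localize(response: Grid) -> Tuple[int, int]:
    """Return the (row, col) offset of the first maximum in the response map."""
    best_y, best_x = 0, 0
    best_score = response[0][0]
    for y, row in enumerate(response):
        for x, score in enumerate(row):
            if score > best_score:
                best_score, best_y, best_x = score, y, x
    return best_y, best_x


# Toy example: a 2x2 pivot and a 4x4 search area with one matching blob.
pivot = [[1.0, 1.0], [1.0, 1.0]]
search = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]
offset = localize(correlation_response(pivot, search))
```

In this toy case the response peaks where the pivot overlaps the blob, so `offset` is the top-left corner of the match inside the search area. Because the pivot aggregates several past frames rather than only the last one, a single noisy detection changes it only slightly, which is the intuition behind reducing the dependency on the immediately previous frame.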
