Abstract
Multi-object tracking (MOT) aims to detect objects in video sequences and associate them across frames. The mainstream approach to MOT is currently the tracking-by-detection (TBD) framework, in which tracking results are highly sensitive to detection outputs; object occlusion and complex motion remain significant obstacles. To reduce dependence on detection outputs, we propose a method that integrates predictive information from the tracker into Non-Maximum Suppression (NMS). By applying secondary modulation to the suppression scores and dynamically adjusting the suppression threshold using tracking information, our method better retains candidate boxes for occluded objects. Furthermore, to track occluded and overlapping objects more effectively, we introduce an adaptive measurement noise method that adjusts the measurement noise to mitigate the impact of occlusion or overlap on tracking accuracy. Additionally, we enhance the affinity matrix in the association algorithm by incorporating height information, thereby improving association stability for objects with complex motion. Our method outperforms the baseline model ByteTrack on the DanceTrack dataset, increasing Higher Order Tracking Accuracy (HOTA), Multi-Object Tracking Accuracy (MOTA), and the ID F1 Score (IDF1) by 10.2%, 3.0%, and 4.8%, respectively.
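To make two of the ideas above concrete, the following is a minimal illustrative sketch, not the implementation described in the abstract. It assumes boxes in `(x1, y1, x2, y2)` format, assumes that height information enters the affinity as a multiplicative 1-D overlap of the vertical extents, and assumes that measurement noise is inflated in proportion to a detection's largest overlap with other detections. The function names and the `base` and `gain` parameters are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def height_iou(a, b):
    """1-D IoU of the vertical extents (y1..y2) of two boxes."""
    inter = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    union = (a[3] - a[1]) + (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def height_aware_affinity(track_box, det_box):
    """Assumed form: box IoU modulated by height overlap."""
    return iou(track_box, det_box) * height_iou(track_box, det_box)


def adaptive_noise_scale(det_box, other_boxes, base=1.0, gain=4.0):
    """Assumed form: scale factor for the Kalman measurement noise.

    The factor grows with the detection's largest IoU against any other
    detection, so heavily overlapped boxes are trusted less in the update.
    """
    max_overlap = max((iou(det_box, o) for o in other_boxes), default=0.0)
    return base * (1.0 + gain * max_overlap)


if __name__ == "__main__":
    track = (0.0, 0.0, 10.0, 20.0)
    det = (0.0, 0.0, 10.0, 10.0)
    print(height_aware_affinity(track, det))
    print(adaptive_noise_scale(det, [track]))
```

In a tracker, the affinity values would populate the cost matrix passed to the assignment step, and the scale factor would multiply the measurement noise covariance before the Kalman update; both uses are assumptions about how such components are typically wired together.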