Online multi-object tracking algorithms struggle in complex scenarios because their feature extraction, object detection, and data association modules operate independently, which lets errors accumulate and makes it difficult to maintain visual consistency for occluded objects. To address these problems, we propose JDTHM, an end-to-end multi-object tracking method based on hypergraph matching. First, a feature extraction and object detection module provides preliminary localization and description of the objects. Second, a deep feature aggregation module extracts temporal information from historical tracklets and aggregates the features produced by object detection and feature extraction. This improves the consistency between current-frame features and tracklet features, and thereby prevents the identity switches and tracklet breaks caused by missed or distorted detections. Finally, a data association module based on hypergraph matching is integrated with object detection and feature extraction into a unified network. It recasts data association as a matching problem between the tracklet hypergraph and the detection hypergraph, which allows the model to be optimized end to end. Experiments on three multi-object tracking datasets yield favorable qualitative and quantitative results, supporting the method's effectiveness in improving the robustness and accuracy of multi-object tracking.
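To make the idea of matching a tracklet hypergraph against a detection hypergraph concrete, the following is a minimal illustrative sketch, not the JDTHM association module itself. It makes several assumptions that do not come from the abstract: hyperedges are all triples of nodes, each hyperedge is described by the side lengths of the triangle formed by three object positions (a stand-in for learned features), the numbers of tracklets and detections are equal, and the best assignment is found by brute-force enumeration rather than by a learned, differentiable solver. All function names are hypothetical.

```python
import math
from itertools import combinations, permutations


def side_lengths(points, i, j, k):
    """Ordered side lengths of triangle (i, j, k); unchanged by translation."""
    return (
        math.dist(points[i], points[j]),
        math.dist(points[j], points[k]),
        math.dist(points[i], points[k]),
    )


def hyperedge_affinity(track_pts, det_pts, track_triple, det_triple, sigma=1.0):
    """Similarity in (0, 1] between a tracklet hyperedge and a detection hyperedge."""
    s = side_lengths(track_pts, *track_triple)
    t = side_lengths(det_pts, *det_triple)
    squared_diff = sum((a - b) ** 2 for a, b in zip(s, t))
    return math.exp(-squared_diff / sigma)


def match_hypergraphs(track_pts, det_pts, sigma=1.0):
    """Return (assignment, score), where assignment[i] is the detection index
    matched to tracklet i and score is the summed third-order affinity.

    Brute force over all permutations: only practical for a handful of objects.
    """
    n = len(track_pts)
    if n != len(det_pts) or n < 3:
        raise ValueError("need equally many tracklets and detections, at least 3")

    best_score, best_assignment = -1.0, None
    for perm in permutations(range(n)):
        # Sum the affinity of every tracklet triple with the detection triple
        # it is mapped to under this candidate assignment.
        score = sum(
            hyperedge_affinity(
                track_pts, det_pts, (i, j, k), (perm[i], perm[j], perm[k]), sigma
            )
            for i, j, k in combinations(range(n), 3)
        )
        if score > best_score:
            best_score, best_assignment = score, list(perm)
    return best_assignment, best_score
```

Because the hyperedge descriptor depends only on the relative geometry of three objects, the score is unaffected when all objects shift together between frames. For instance, with tracklet positions `[(0, 0), (4, 0), (1, 3), (6, 4)]` and detections `[(11, 0), (10, -3), (16, 1), (14, -3)]` (the same points translated by `(10, -3)` and reordered), `match_hypergraphs` returns the assignment `[1, 3, 0, 2]`.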