Joint Prediction and Association for Deep Feature Multiple Object Tracking

Wenyuan Qin, Tian Luo, Xuebin Ren, Xiaozheng Zhang, Zhiyang Ma, Hong Du

doi:10.1088/1742-6596/2026/1/012021

Abstract

Convolutional neural networks (CNNs) can significantly improve image recognition accuracy through their powerful deep features, but the low-level network layers also carry important feature information. To achieve more stable and efficient multi-target tracking, we fuse features from the shallow and deep layers so that the resulting representation is more expressive. Deformable convolution is also introduced to handle the deformation caused by camera motion. As time advances, we predict each target's position with a motion model, discard the positions that the target cannot physically reach, and thereby further refine the association between multiple targets. We use an end-to-end association method to reduce the complexity of the algorithm. We evaluated the method on the public MOT17 dataset and obtained strong results.
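The motion-based pruning described in the abstract can be illustrated with a minimal sketch. It assumes a constant-velocity motion model and a fixed Euclidean gating radius; the abstract specifies neither, so both are illustrative choices. The names `predict_position`, `gate_candidates`, and `max_distance` are hypothetical and do not come from the paper.

```python
from math import hypot


def predict_position(position, velocity, dt=1.0):
    """Constant-velocity prediction of a target's next (x, y) centre.

    Assumption: the paper's motion model is not specified in the abstract;
    constant velocity is used here only as a simple stand-in.
    """
    x, y = position
    vx, vy = velocity
    return (x + vx * dt, y + vy * dt)


def gate_candidates(tracks, detections, max_distance):
    """Keep, for each track, only detections near its predicted position.

    tracks:       dict mapping track id -> ((x, y), (vx, vy))
    detections:   list of (x, y) detection centres in the current frame
    max_distance: hypothetical gating radius in pixels
    Returns a dict mapping track id -> list of feasible detection indices.
    """
    feasible = {}
    for track_id, (position, velocity) in tracks.items():
        px, py = predict_position(position, velocity)
        feasible[track_id] = [
            i
            for i, (dx, dy) in enumerate(detections)
            if hypot(dx - px, dy - py) <= max_distance
        ]
    return feasible


# Two tracks and three detections; the third detection is out of reach of both.
tracks = {1: ((10.0, 10.0), (5.0, 0.0)), 2: ((100.0, 50.0), (0.0, -4.0))}
detections = [(15.5, 10.2), (99.0, 46.5), (300.0, 300.0)]
feasible = gate_candidates(tracks, detections, max_distance=20.0)
```

Only the detection indices left in `feasible` would then be scored by the appearance-based association step, which shrinks the matching problem before any feature comparison takes place.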

Full Text