Accurate and reliable multi-pedestrian tracking is of great importance for autonomous driving, human-robot interaction, and video surveillance. Because the best-performing sensor differs from scenario to scenario, sensor-fusion perception schemes, which combine complementary modalities, are believed to be capable of handling situations that are challenging for a single sensor. In this paper, we propose a novel track-to-track fusion strategy for multi-pedestrian tracking using a millimeter-wave (MMW) radar and a monocular camera. Pedestrians are first tracked by each sensor separately, according to that sensor's characteristics. Specifically, 3D monocular pedestrian detections are obtained by a convolutional neural network (CNN), and trajectories are formed by a tracking-by-detection approach combined with Bayesian estimation. The measurement noise of the 3D monocular detections is modeled by a detection uncertainty value obtained from the same CNN, so that the pedestrian state is estimated more accurately. The MMW radar uses the track-before-detection method because of the sparseness of the radar features. The final pedestrian trajectories are then obtained by the proposed track-to-track fusion strategy, which works adaptively under challenging weather conditions, low-illumination conditions, and cluttered scenarios. A set of tests is carried out to validate our pedestrian tracking strategy. The resulting trajectories and the optimal sub-pattern assignment (OSPA) metric demonstrate the accuracy and robustness of the proposed multi-sensor multi-pedestrian tracking system.
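To make the evaluation metric concrete, the following is a minimal sketch of the standard OSPA distance between a set of estimated pedestrian positions and a set of ground-truth positions. It is not the evaluation code used in this work: the function name `ospa`, the cutoff `c`, and the order `p` are illustrative assumptions, and the assignment is found by brute force, which is only practical for small sets.

```python
import itertools
import math


def ospa(X, Y, c=2.0, p=1):
    """OSPA distance between two finite sets of points (lists of tuples).

    c is the cutoff distance and p the order; both defaults are
    illustrative assumptions, not values taken from the paper.
    """
    m, n = len(X), len(Y)
    if m == 0 and n == 0:
        return 0.0  # two empty sets are identical
    if m > n:
        # The metric is symmetric; make X the smaller set.
        X, Y, m, n = Y, X, n, m
    if m == 0:
        return float(c)  # only a cardinality error remains

    # Brute-force search over all assignments of X points to distinct Y points.
    best = math.inf
    for perm in itertools.permutations(range(n), m):
        cost = sum(
            min(math.dist(X[i], Y[j]), c) ** p
            for i, j in zip(range(m), perm)
        )
        best = min(best, cost)

    # Add the penalty c^p for each unassigned point, then normalise by n.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)


if __name__ == "__main__":
    truth = [(0.0, 0.0), (10.0, 0.0)]  # hypothetical ground-truth positions
    tracks = [(0.0, 1.0)]              # hypothetical tracker output
    print(ospa(tracks, truth, c=2.0, p=1))
```

The cutoff `c` bounds the cost of any single localisation error and also sets the penalty for each missed or false track, so one value summarises both localisation and cardinality errors. For larger sets, the brute-force loop would normally be replaced by an optimal assignment solver such as the Hungarian algorithm.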