The intelligent transportation system (ITS) is inseparable from people’s lives, and the development of artificial intelligence has made intelligent video surveillance systems more widely used. In practical traffic scenarios, the detection and tracking of vehicle targets is an important core aspect of intelligent surveillance systems and has become a hot topic of research today. However, in practical applications, there is a wide variety of targets and often interference factors such as occlusion, while a single sensor is unable to collect a wealth of information. In this paper, we propose an improved data matching method to fuse the video information obtained from the camera with the millimetre-wave radar information for the alignment and correlation of multi-target data in the spatial dimension, in order to address the problem of poor recognition alignment caused by mutual occlusion between vehicles and external environmental disturbances in intelligent transportation systems. The spatio-temporal alignment of the two sensors is first performed to determine the conversion relationship between the radar and pixel coordinate systems, and the calibration on the timeline is performed by Lagrangian interpolation. An improved Hausdorff distance matching algorithm is proposed for the data dimension to calculate the similarity between the data collected by the two sensors, to determine whether they are state descriptions of the same target, and to match the data with high similarity to delineate the region of interest (ROI) for target vehicle detection.