We propose an enhanced road target detection algorithm based on YOLOv7-Tiny for complex road scenes, addressing the large model size, misclassification, and low localization accuracy of the original network. Our approach involves several key modifications. First, we replace the LeakyReLU activation function in YOLOv7-Tiny with H-swish. This replacement not only reduces the number of parameters in the model but also enhances its feature extraction capabilities. Second, we replace the ELAN module in the Neck with the DPCH-ELAN module and introduce the darknetblock module along with the pcconv convolution. These modifications improve the network's ability to represent complex patterns and semantics, enabling it to capture features at different levels of the input data. Third, we introduce the pcconv convolution building block at the output side to handle heterogeneous information in complex road scenes, enhancing the network's performance in detecting road targets and abnormalities. Compared with the original YOLOv7-Tiny model, in experiments on a multi-source dataset the improved model reduces GFLOPs by 20.90% and the number of parameters by 24.49%, while the mean average precision (mAP) at thresholds of 0.5 and 0.5~0.9 improves to 77.3% and 51.8%, respectively. These results demonstrate that the enhanced model reduces model size while improving detection accuracy, thereby meeting the requirements for real-time detection. To assess the generalizability of our approach, we conducted comparison experiments on the VOC2012 dataset. The results indicate that the improved algorithm generalizes robustly across different datasets.
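To make the activation swap concrete, the following is a minimal scalar sketch of the two functions involved, not the implementation used in this work. It uses the standard definition H-swish(x) = x · ReLU6(x + 3) / 6. The function names are illustrative, and the LeakyReLU negative slope of 0.1 is an assumption, since the abstract does not state the value.

```python
def relu6(x: float) -> float:
    """Clamp x to the interval [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x: float) -> float:
    """Hard swish: x * ReLU6(x + 3) / 6, a piecewise approximation of swish."""
    return x * relu6(x + 3.0) / 6.0


def leaky_relu(x: float, negative_slope: float = 0.1) -> float:
    """LeakyReLU; the default slope of 0.1 is an assumed value, not from the paper."""
    return x if x >= 0.0 else negative_slope * x


if __name__ == "__main__":
    # Compare the two activations on a few sample inputs.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  leaky_relu={leaky_relu(x):+.4f}  h_swish={h_swish(x):+.4f}")
```

Under this definition, H-swish is exactly zero for x ≤ -3 and equals the identity for x ≥ 3, with a smooth quadratic transition in between, whereas LeakyReLU is linear on each side of zero.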