Abstract

Mainstream pedestrian recognition algorithms suffer from low accuracy and insufficient real-time performance. In this study, we developed an improved pedestrian recognition algorithm, YOLO-MSP (multiscale parallel), which modifies the YOLOv5s network architecture using ideas from residual networks. The MSP module uses three pooling layers in parallel to output multiscale features, improving model accuracy while preserving real-time performance. A Swin Transformer module was also introduced into the network; by avoiding global calculations, it improved the efficiency of the model in image processing. The CBAM (Convolutional Block Attention Module) attention mechanism was added to the C3 module to form a new module, CBAMC3, which improved model efficiency while keeping the model lightweight. The proposed WMD-IOU (weighted multidimensional IOU) loss function uses the shape change between the recognition frame and the real frame as a parameter to calculate the shape loss of the recognition frame, guiding the model to better learn the shape and size of the target and optimizing recognition performance. Comparative experiments on the INRIA public dataset showed that the proposed YOLO-MSP algorithm outperformed state-of-the-art pedestrian recognition methods in both accuracy and speed.
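
The abstract does not give the WMD-IOU formula, so the following is only a minimal sketch of the general idea it describes: an IoU-based loss extended with a term that penalizes shape differences between the recognition frame (predicted box) and the real frame (ground-truth box). The function names, the relative width/height penalty, and the `shape_weight` parameter are illustrative assumptions, not the authors' definition.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def shape_penalty(pred, target):
    """Hypothetical shape term: mean relative width/height mismatch, in [0, 1]."""
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    tw, th = target[2] - target[0], target[3] - target[1]
    # Guard against degenerate (zero-size) boxes.
    dw = abs(pw - tw) / max(pw, tw) if max(pw, tw) > 0 else 0.0
    dh = abs(ph - th) / max(ph, th) if max(ph, th) > 0 else 0.0
    return 0.5 * (dw + dh)


def shape_aware_iou_loss(pred, target, shape_weight=0.5):
    """IoU loss plus a weighted shape term (an assumed form, not WMD-IOU itself)."""
    return (1.0 - iou(pred, target)) + shape_weight * shape_penalty(pred, target)


if __name__ == "__main__":
    real_frame = (0.0, 0.0, 2.0, 2.0)
    print(shape_aware_iou_loss((0.0, 0.0, 2.0, 2.0), real_frame))  # identical boxes
    print(shape_aware_iou_loss((0.0, 0.0, 4.0, 2.0), real_frame))  # twice as wide
```

In this sketch, two predictions with the same overlap but different aspect ratios receive different losses, which is the kind of signal a shape-aware term is meant to provide during training.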
