Efficient and accurate identification of smoking behavior in public places is crucial for public health and safety. However, because cigarette targets are small, image backgrounds are complex, cigarette angles vary, and many similar-looking objects are present, current methods still suffer from missed detections and false detections when identifying cigarette targets. This paper presents an improved You Only Look Once version 5 small (YOLOv5s) algorithm to address these issues. First, to strengthen feature extraction for cigarette targets, the Swin Transformer Block structure is integrated into the backbone network to capture long-range dependencies. Second, a novel Hybrid Spatial Pyramid Pooling-Fast with Cross Stage Partial Connection module is built on the Spatial Pyramid Pooling-Fast module, combining maximum pooling and average pooling to improve the fusion of multi-scale feature maps. Third, a novel interference information filtering network is introduced to reduce the noise and confusion caused by similar objects, thereby improving detection performance. In experiments on a self-built cigarette target image dataset, the improved algorithm reaches an accuracy of 93.5% and a recall of 89.1%.
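The hybrid pooling idea can be illustrated with a minimal pure-Python sketch on a single-channel feature map. It is not the module proposed in this paper: the abstract does not say how the maximum-pooled and average-pooled maps are combined, so the element-wise mean used here is an assumption, as are the names `pool2d`, `hybrid_pool`, and `hybrid_sppf`. The convolutions and the Cross Stage Partial connection are omitted; only the cascaded pooling structure of Spatial Pyramid Pooling-Fast (three successive pooling steps whose outputs are concatenated with the input) is shown.

```python
from typing import Callable, List

Grid = List[List[float]]  # one single-channel feature map


def pool2d(x: Grid, k: int, reduce: Callable[[List[float]], float]) -> Grid:
    """Stride-1 pooling that keeps the input size; windows are clipped at borders."""
    h, w = len(x), len(x[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                x[a][b]
                for a in range(max(0, i - r), min(h, i + r + 1))
                for b in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(reduce(window))
        out.append(row)
    return out


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def hybrid_pool(x: Grid, k: int = 5) -> Grid:
    """Combine max pooling and average pooling.

    Assumption: the two pooled maps are merged by an element-wise mean;
    the paper's actual fusion rule is not given in the abstract.
    """
    mx = pool2d(x, k, max)
    av = pool2d(x, k, mean)
    return [[0.5 * (m + a) for m, a in zip(rm, ra)] for rm, ra in zip(mx, av)]


def hybrid_sppf(x: Grid, k: int = 5) -> List[Grid]:
    """SPPF-style cascade: the input plus three successively pooled maps.

    The returned list stands in for channel-wise concatenation.
    """
    y1 = hybrid_pool(x, k)
    y2 = hybrid_pool(y1, k)
    y3 = hybrid_pool(y2, k)
    return [x, y1, y2, y3]
```

Applying the same small kernel repeatedly widens the effective receptive field at each step, which is what lets the concatenated maps cover several scales. In this sketch, the maximum-pooling branch preserves strong local responses while the average-pooling branch retains surrounding context.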