Wearing masks is an effective protective measure for residents to prevent respiratory infectious diseases when going out. Due to issues such as a small target size, target occlusion leading to information loss, false positives, and missed detections, the effectiveness of face mask-wearing detection needs improvement. To address these issues, an improved YOLOv7 object detection model is proposed. Firstly, the C2f_SCConv module is introduced in the backbone network to replace some ELAN modules for feature extraction, enhancing the detection performance of small targets. Next, the SPPFCSPCA module is proposed to optimize the spatial pyramid pooling structure, accelerating the model convergence speed and improving detection accuracy. Finally, the HAM_Detect decoupled detection head structure is introduced to mitigate missed and false detections caused by target occlusion, further accelerating model convergence and improving detection performance in complex environments. The experimental results show that improved YOLOv7 achieved an mAP of 90.1% on the test set, a 1.4% improvement over the original YOLOv7 model. The detection accuracy of each category improved, effectively providing technical support for mask-wearing detection in complex environments.