Face detection in dense scenes suffers from inaccurate detections and high miss rates because small-scale faces are densely distributed, severely occluded, and lack clear features. To address this, we propose a lightweight small-scale face detection algorithm based on YOLOv5 that aims to improve detection accuracy. First, we introduce the Convolutional Block Attention Module (CBAM) into the existing backbone network, which yields more detailed features by attending to both the channel and spatial dimensions. Next, we embed involution in the Neck network to enhance channel information and weight distribution. Finally, we add a new feature fusion layer that integrates deep and shallow semantic information, improving the model's ability to capture features of smaller targets that occupy fewer pixels in visible areas. Experimental results show that the improved model increases average precision on all three subsets of the public WIDER FACE dataset, by 3.2%, 3.4%, and 2.6%, respectively. The detection frame rate reaches 87 frames per second (FPS), so the improved model meets both the accuracy and real-time requirements for detecting small-scale faces in dense scenes.
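For reference, the channel-then-spatial attention that CBAM applies can be sketched as below. This is the standard formulation from the original CBAM paper, not a detail stated in this abstract: the symbols $F$, $M_c$, $M_s$ and the $7\times 7$ kernel size are assumptions taken from that formulation, and the variant and placement used in the proposed backbone may differ.

```latex
% Standard CBAM formulation (assumed; not specified in this abstract).
% F: input feature map of shape C x H x W
% \sigma: sigmoid; \otimes: element-wise product with broadcasting
% f^{7\times 7}: convolution with a 7x7 kernel; [\,\cdot\,;\,\cdot\,]: channel-wise concatenation
\begin{aligned}
M_c(F)  &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big), \\
F'      &= M_c(F) \otimes F, \\
M_s(F') &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\, \mathrm{MaxPool}(F')])\big), \\
F''     &= M_s(F') \otimes F'.
\end{aligned}
```

In this formulation the channel map $M_c$ has shape $C\times 1\times 1$ and pools over the spatial axes, while the spatial map $M_s$ has shape $1\times H\times W$ and pools over the channel axis. Applying them in sequence is what lets the module reweight features along both dimensions.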