Helmet detection in construction site environments is of great significance for protecting workers’ lives and automating safety management. Current object detection methods, however, detect small-scale helmet objects poorly in complex construction site environments. To address this, this paper proposes a helmet detection method for construction site environments based on multi-scale context and attention fusion. The proposed multi-scale context module aggregates the multi-scale contextual semantics of deep image features and expands the receptive field, improving the network’s discriminative learning ability for small-scale helmet objects. Meanwhile, the proposed attention feature fusion module dynamically fuses shallow features with the network’s decoding features, strengthening the network’s ability to learn both global feature dependencies and local spatial details of helmet objects and further improving detection precision. Experimental results on the constructed safety helmet wearing dataset show that, compared with existing mainstream object detection methods, the proposed method achieves good detection accuracy with a balanced detection speed.