Cucumber downy mildew develops when downy mildew spores infect the leaves. However, research on its prevention and control often focuses on the stage after symptoms have appeared, that is, once disease spots have already formed. Because the occurrence of downy mildew is closely related to the quantity of spores, quantifying spores at an early stage is of great significance for prevention and control. Developing a rapid, accurate, and efficient method for detecting cucumber downy mildew spores is therefore critical for advancing disease control. This study introduces an improved YOLOv5s model for spore detection. The model incorporates a transformer module into the YOLOv5s backbone to enhance the extraction of global feature information. It also adds a small object detection head to compensate for the extensive down-sampling in YOLOv5s, which makes the features of small objects difficult to learn. Integrating the Convolutional Block Attention Module (CBAM) further improves detection precision for small objects such as mildew spores. Evaluated on an image dataset collected through a microscope, the improved YOLOv5s model demonstrated superior performance across various resolutions. At a resolution of 1440px × 1440px, it achieved its highest mean Average Precision (mAP@.5) of 95.4 %, with a precision (P) of 89.1 % and a recall (R) of 90.3 %. These values exceeded those of the original YOLOv5s model at the same resolution by 1.6 % in mAP@.5, 1.6 % in P, and 0.5 % in R. Additionally, the model’s mAP@.5 across various resolution scales indicates higher detection precision than other leading models such as YOLOv7.
In microscopic images with small spores and complex backgrounds, the improved YOLOv5s model effectively detects cucumber downy mildew spores, offering valuable insights and technical support for the prevention and control of cucumber downy mildew.
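
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision (P) and recall (R) at an IoU threshold of 0.5 are conventionally computed for one image. It is not the evaluation code used in this study: the function names, the greedy matching strategy, and the example boxes are illustrative assumptions. mAP@.5 additionally averages precision over recall levels, which is not shown here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thr=0.5):
    """Greedy matching (an assumed, common convention).

    preds:  list of (box, confidence) pairs
    truths: list of ground-truth boxes
    Each ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    tp = 0
    # Visit predictions from most to least confident.
    for box, _conf in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, truth in enumerate(truths):
            if j in matched:
                continue
            value = iou(box, truth)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching spore
    fn = len(truths) - tp  # spores that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


# Hypothetical boxes, for illustration only.
example_truths = [(10, 10, 20, 20), (40, 40, 50, 50)]
example_preds = [
    ((11, 11, 21, 21), 0.9),      # overlaps the first spore
    ((100, 100, 110, 110), 0.8),  # background, no spore here
    ((40, 40, 50, 50), 0.7),      # exact match for the second spore
]
p, r = precision_recall(example_preds, example_truths)
print(f"P = {p:.3f}, R = {r:.3f}")
```

In this hypothetical example, two of the three predictions match a spore and no spore is missed, so precision is 2/3 and recall is 1.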