Single-stage detectors suffer from limited accuracy and poor coverage of objects at different scales. YOLOF (You Only Look One-level Feature) has improved on this, but room for improvement remains. To enhance coverage of objects across scales, we propose an improved single-stage object detector, Dq-YOLOF. We design an output encoder built from a series of modules that combine deformable convolution with SimAM (Simple Attention Module); these modules replace the dilated convolutions in YOLOF and significantly improve the detector's ability to represent fine details. We also redesign the sample selection strategy: building on SimOTA, it dynamically allocates positive samples according to their quality, which reduces computational load and makes the detector better suited to small objects. Experiments on the COCO 2017 dataset verify the effectiveness of our method: Dq-YOLOF achieves 38.7 AP, 1.5 AP higher than YOLOF. To confirm the improvement on small objects and assess generalization, we further evaluate the method on urinary sediment and aerial drone datasets. Notably, these gains come with lower computational cost.
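To make the encoder idea concrete, the following is a minimal PyTorch-style sketch of one possible deformable-convolution + SimAM block of the kind described above. The block layout, channel counts, and the `DeformSimAMBlock` name are illustrative assumptions, not the paper's exact implementation; only SimAM follows its standard published form.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class SimAM(nn.Module):
    """Parameter-free SimAM attention: weights each activation by an
    energy-based saliency term (standard formulation)."""
    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x):
        b, c, h, w = x.shape
        n = h * w - 1
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
        v = d.sum(dim=(2, 3), keepdim=True) / n
        energy = d / (4 * (v + self.e_lambda)) + 0.5
        return x * torch.sigmoid(energy)


class DeformSimAMBlock(nn.Module):
    """Hypothetical residual encoder block: a 3x3 deformable convolution
    followed by SimAM attention, standing in for the dilated residual
    blocks of YOLOF's encoder. The exact design in Dq-YOLOF may differ."""
    def __init__(self, channels: int):
        super().__init__()
        # Offsets for the deformable convolution are predicted from the input.
        self.offset = nn.Conv2d(channels, 2 * 3 * 3, kernel_size=3, padding=1)
        self.dconv = DeformConv2d(channels, channels, kernel_size=3, padding=1)
        self.norm = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)
        self.attn = SimAM()

    def forward(self, x):
        out = self.dconv(x, self.offset(x))
        out = self.act(self.norm(out))
        # Residual connection, mirroring the structure of YOLOF's encoder blocks.
        return x + self.attn(out)


if __name__ == "__main__":
    feat = torch.randn(1, 256, 32, 32)  # single-level feature map
    print(DeformSimAMBlock(256)(feat).shape)  # torch.Size([1, 256, 32, 32])
```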