Abstract

• An attention-guided pyramidal prediction kernels (APPK) module is proposed to tackle misalignment with objects of different scales.
• An IoU-adaptive loss function helps networks deal with hard negative samples.
• A salient object regions recognition (SORR) module is devised to improve detection efficiency.
• An interleaved subsampling method enhances feature representations.

Single-scale prediction kernels and Region of Interest (RoI) pooling, as used in the prediction modules of modern object detectors, match objects of different scales poorly. State-of-the-art detectors that build a feature pyramid on feature maps of different resolutions help alleviate this problem. Even with this structure, however, single-scale prediction kernels and RoI pooling still struggle to detect small objects, and the former continue to suffer from misalignment on very large objects. In this paper, we propose the attention-guided pyramidal prediction kernels module, together with a customized IoU-adaptive loss function, to deal with the misalignment between the prediction module and objects of different scales. To mitigate the cost of the heavy detection head, we also introduce the salient object regions recognition module, which identifies the regions that have strong object cues. Additionally, interleaved subsampling, our proposed feature enhancement approach, is used to generate highly discriminative feature representations. We refer to the detection framework built from these methods as SnipeDet. Results show that SnipeDet achieves 41.1 AP at 15.4 FPS on the MS COCO test-dev set with 512 × 512 input images, outperforming state-of-the-art one-stage detectors and offering a better trade-off between speed and accuracy.
