Abstract

Object detection on fused visible and infrared images is of great importance for many applications, such as surveillance and rescue in low-light conditions. However, current detectors struggle with robust detection on fused images for two main reasons. First, objects appear in various shapes and sizes, so some hard samples cannot be localized accurately. Second, the same object category in fused images can have different appearances due to changing weather conditions, temperature, and intrinsic heat. Such a contradiction degrades the classification task of a detection network, since the network cannot merge commonalities and distinguish differences well. In this paper, we propose to reconstruct the detection pipeline of current detectors and enhance detection on difficult samples in fused images. Specifically, a Dilation Pyramid Network (DPN) is designed at the lateral connection to generate and aggregate features with various receptive fields, without increasing the number of pyramid layers. To strengthen classification, a Semantic Category Attention Module (SCAM) is proposed to capture the attention centers of semantics in fused images, rather than object centers. Extensive experiments on two fusion datasets show that the proposed method achieves satisfying performance, and that both modules can greatly improve current generic detectors on fused images.
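The core idea behind the DPN described above, aggregating features computed at several dilation rates so that receptive field grows without adding pyramid levels, can be illustrated with a minimal single-channel sketch. This is not the authors' implementation; the function names, the fixed set of rates, and the simple summation fusion are all assumptions made for illustration.

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """Valid-mode 2D cross-correlation with a dilated kernel (single channel).

    A k x k kernel with dilation d covers an effective extent of
    k + (k - 1) * (d - 1) pixels, enlarging the receptive field
    without adding parameters.
    """
    kh, kw = kernel.shape
    eh = kh + (kh - 1) * (dilation - 1)  # effective kernel height
    ew = kw + (kw - 1) * (dilation - 1)  # effective kernel width
    H, W = x.shape
    out = np.zeros((H - eh + 1, W - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample the input sparsely at the dilated tap positions.
            patch = x[i:i + eh:dilation, j:j + ew:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

def dilation_pyramid(x, kernel, rates=(1, 2, 3)):
    """Aggregate same-size responses from several dilation rates by summing.

    Each rate is zero-padded so all outputs share the input's spatial size,
    then fused (here by simple addition, a stand-in for a learned merge).
    """
    agg = None
    for d in rates:
        eff = kernel.shape[0] + (kernel.shape[0] - 1) * (d - 1)
        pad = eff // 2  # 'same' padding for this rate's effective extent
        y = dilated_conv2d(np.pad(x, pad), kernel, d)
        agg = y if agg is None else agg + y
    return agg
```

For example, on an 8x8 map of ones with a 3x3 kernel of ones, every interior location sums nine taps per rate, so three rates accumulate to 27 at the center while the output keeps the input's 8x8 size.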
