Abstract

Dual-modal image pairs provide complementary feature information and overcome the limitations of single-modal object detection algorithms, thereby improving detection performance. To fully exploit the features of the two modalities, this paper proposes a dual-branch fusion detection network that takes infrared and visible images as simultaneous inputs for object detection. The method is based on YOLOv5-s. First, two attention modules, tailored to the characteristics of each modality, are proposed to strengthen the feature representations of the infrared and visible images respectively. Second, a dual-modal fusion module is designed for cross-modal information complementarity by fusing features at corresponding scales of the two modalities. Finally, a feature enhancement module is proposed to improve multi-scale feature fusion. The algorithm is validated on the KAIST, FLIR, and GIR datasets and compared with classic single-modal and dual-modal detection algorithms. Experimental results show that, compared with the baseline YOLOv5-s detecting visible and infrared images separately, the proposed algorithm improves detection accuracy by 17.6% and 5.9% respectively on the KAIST dataset, and by 19.1% and 13.0% respectively on the FLIR dataset. The proposed algorithm also shows a clear accuracy advantage on the self-built GIR dataset.
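
To make the architecture concrete, the sketch below shows one plausible form of a same-scale dual-modal fusion step in PyTorch. The abstract does not specify the internal design of the attention or fusion modules, so everything here is an assumption: `ChannelAttention` stands in for the paper's modality-specific attention modules (here a squeeze-and-excitation style block), and `DualModalFusion` illustrates attention-weighted concatenation of infrared and visible features followed by a 1x1 convolution back to the original channel width, as would be fed into a YOLOv5-s neck.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Hypothetical stand-in for the paper's modality-specific attention:
    a squeeze-and-excitation style channel reweighting."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # global average pool: B x C x 1 x 1
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                  # per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(x)                              # reweight each channel


class DualModalFusion(nn.Module):
    """Illustrative fusion of same-scale infrared and visible feature maps:
    attend to each modality, concatenate, then project back to C channels."""

    def __init__(self, channels: int):
        super().__init__()
        self.ir_attn = ChannelAttention(channels)
        self.vis_attn = ChannelAttention(channels)
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.SiLU(inplace=True),                         # YOLOv5 uses SiLU activations
        )

    def forward(self, ir_feat: torch.Tensor, vis_feat: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.ir_attn(ir_feat), self.vis_attn(vis_feat)], dim=1)
        return self.fuse(fused)


if __name__ == "__main__":
    # One scale of a dual-branch backbone: 80x80 feature maps, 128 channels.
    ir = torch.randn(1, 128, 80, 80)
    vis = torch.randn(1, 128, 80, 80)
    out = DualModalFusion(128)(ir, vis)
    print(out.shape)  # torch.Size([1, 128, 80, 80]) -- same shape as either input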
