Abstract

To adapt to high-level vision tasks, several infrared and visible image fusion methods are cascaded with a downstream network to enhance the semantic information of the fusion results. However, due to feature-level heterogeneities between the fusion and downstream tasks, these methods suffer from loss of pixel-level information and incomplete reconstruction of semantic-level information. To further improve the performance of fused images in high-level vision tasks, we propose a semantic-driven coupled network for infrared and visible image fusion, termed SDCFusion. First, to address feature heterogeneity, we couple the segmentation and fusion networks into a joint framework in which both networks share multi-level cross-modality coupled features. Through joint optimization of the two tasks, a joint action between the fusion and downstream tasks is formed, which forces the cross-modality coupled features to be modeled in both the pixel domain and the semantic domain. Second, to guide the reconstruction of semantic information, we cascade the two networks to form a semantic-driven action, which continuously optimizes the fused image to strengthen its semantic representation capacity. In addition, we introduce an adaptive training strategy to reduce the complexity of dual-task training. Specifically, an mIoU-based semantic measurement weight is designed to balance the joint action and the driven action throughout the training process. We evaluate our method at both the pixel-information and semantic-information levels. Qualitative and quantitative experiments verify the superiority of SDCFusion in terms of visual effects and metrics. Object detection and semantic segmentation experiments demonstrate that SDCFusion achieves superior performance in high-level vision tasks. The source code is available at https://github.com/XiaoW-Liu/SDCFusion.
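The idea of an mIoU-based weight that balances the two training signals can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything below is an assumption made for illustration: the function names `mean_iou` and `balanced_loss`, the flat label lists, and in particular the linear blend `(1 - w) * joint + w * driven` are hypothetical and are not taken from the SDCFusion code.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are flat sequences of integer class labels of equal length.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both pred and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def balanced_loss(joint_loss, driven_loss, miou):
    """Blend two loss terms with an mIoU-derived weight.

    Hypothetical schedule (an assumption, not the paper's formula): while the
    segmentation branch is weak (low mIoU) the joint term dominates; as mIoU
    rises, more weight shifts to the semantic-driven term.
    """
    w = min(max(miou, 0.0), 1.0)  # clamp the weight to [0, 1]
    return (1.0 - w) * joint_loss + w * driven_loss


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    miou = mean_iou(pred, target, num_classes=2)
    print(miou, balanced_loss(2.0, 4.0, miou))
```

In a real training loop the mIoU would be measured on segmentation outputs at regular intervals and the weight updated accordingly; the released repository should be consulted for the schedule the authors actually use.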
