Abstract

Recently, deep convolutional neural network (CNN) based methods for multi-focus image fusion have achieved adequate performance. However, most of them cannot produce spatially continuous results, especially in smooth regions and along the boundaries between focused and defocused regions. In this paper, we propose a novel end-to-end method that combines the merits of Transformers and CNNs as a strong alternative for the multi-focus image fusion task. A Transformer has an advantage over a CNN in that it can extract global features, which helps make the fusion results spatially consistent. The proposed architecture consists of a CNN branch and a Transformer branch; the Transformer branch takes feature-map patches as input and uses the Transformer to propagate global context among the patches. Moreover, to improve feature representation, we introduce an online knowledge distillation learning (KDL) strategy, which achieves better interaction between global and local features. Specifically, we design a hard target and a soft target by simply yet effectively ensembling the outputs of the two branches, and use them to supervise the CNN and Transformer branches. Experiments demonstrate the superiority of the proposed architecture, which achieves results competitive with state-of-the-art methods.
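
The abstract does not specify how the outputs of the two branches are ensembled, so the following is only an illustrative sketch of one plausible reading. It assumes each branch emits a per-pixel focus probability map, that the soft target is the pixel-wise mean of the two maps, and that the hard target is that mean binarized at 0.5. The function names, the averaging rule, and the threshold are all hypothetical and are not taken from the paper.

```python
from typing import List

# Per-pixel focus probabilities in [0, 1], stored as nested lists.
Map = List[List[float]]


def soft_target(cnn_prob: Map, trans_prob: Map) -> Map:
    """Average the two branch outputs pixel by pixel.

    Assumption: a plain mean is used as the ensembling rule.
    """
    return [
        [(c + t) / 2.0 for c, t in zip(c_row, t_row)]
        for c_row, t_row in zip(cnn_prob, trans_prob)
    ]


def hard_target(soft: Map, threshold: float = 0.5) -> List[List[int]]:
    """Binarize the soft target into a 0/1 focus decision map.

    Assumption: a fixed threshold of 0.5.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in soft]


# Toy 2x2 outputs standing in for the CNN and Transformer branches.
cnn_prob = [[0.9, 0.6], [0.2, 0.4]]
trans_prob = [[0.7, 0.2], [0.1, 0.8]]

soft = soft_target(cnn_prob, trans_prob)
hard = hard_target(soft)
```

Under these assumptions, both targets would then serve as supervision signals for each branch during training, so that each branch is pulled toward the ensemble of local (CNN) and global (Transformer) predictions.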
