Abstract

In recent years, transformers have driven significant progress in computer vision, including notable advances in low-level vision tasks. Because transformers model long-range dependencies, they capture global features and semantic structure more effectively than Convolutional Neural Networks (CNNs). However, transformer-only frameworks have been observed to lack sensitivity to high-frequency image information, which produces blurry reconstructed regions. To address this limitation, this paper proposes a two-branch Dual Frequency Feature Fusion Network (DF3Net) for image inpainting, built on a hierarchical atrous transformer (HAT). Specifically, the head of the dual-frequency convolution (DFC) module decouples the feature maps into low-frequency and high-frequency components. The low-frequency component passes through the proposed HAT branch, while the high-frequency component is fed into the gated convolution branch, so that the network captures both global structural information and local texture details. Finally, the DFC tail fuses the high-frequency and low-frequency features to output the reconstructed image. Moreover, feature fusion between the two branches is performed layer by layer within the network, so that the branches learn from each other interactively and the semantic structure of the image stays coherent with its texture details. Experimental evaluations on Places2, Paris StreetView, and CelebA-HQ with different mask ratios demonstrate that the proposed method outperforms state-of-the-art methods in the structural accuracy of image inpainting and generates semantically reasonable images with fine texture details.
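The abstract does not say how the DFC head separates the two frequency bands, so the following is only a minimal sketch under an assumed scheme: the low-frequency component is taken to be a box-blurred copy of the feature map, the high-frequency component is the residual, and fusion is a plain sum. The names `box_blur`, `decouple`, and `fuse` are hypothetical and do not come from the paper; the HAT and gated convolution branches are not modeled at all.

```python
# Hypothetical illustration only: the paper's DFC module is learned, whereas
# this sketch uses a fixed box blur to show the low/high split and re-fusion.

def box_blur(feature, radius=1):
    """Average each value over its neighbourhood (window clipped at the edges)."""
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - radius), min(h, i + radius + 1))
            cols = range(max(0, j - radius), min(w, j + radius + 1))
            window = [feature[r][c] for r in rows for c in cols]
            out[i][j] = sum(window) / len(window)
    return out


def decouple(feature, radius=1):
    """Assumed stand-in for the DFC head: low = blur, high = feature - low."""
    low = box_blur(feature, radius)
    high = [[x - l for x, l in zip(f_row, l_row)]
            for f_row, l_row in zip(feature, low)]
    return low, high


def fuse(low, high):
    """Assumed stand-in for the DFC tail: element-wise sum of both bands."""
    return [[l + h for l, h in zip(l_row, h_row)]
            for l_row, h_row in zip(low, high)]


if __name__ == "__main__":
    # A single bright pixel: smooth energy goes to `low`, the spike to `high`.
    feature = [[0, 0, 0],
               [0, 9, 0],
               [0, 0, 0]]
    low, high = decouple(feature)
    print("low :", low)
    print("high:", high)
    print("fused:", fuse(low, high))
```

In this toy setting the sum of the two bands reproduces the input, which is the property that lets each band be processed by a separate branch before the bands are recombined. In the described network, each band would be transformed by its own branch before fusion, so the output would differ from the input.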
