Image restoration (IR) aims to recover missing or degraded image information and remains a significant challenge in visual reconstruction. U-Net-based diffusion models (DMs) currently achieve favorable results on IR tasks; however, the U-Net backbone limits their ability to capture global context. To address this issue, we propose DIRformer, a novel image restoration approach based on a U-shaped Transformer and diffusion models, which enhances the modeling of long-range dependencies within DMs. Specifically, DIRformer replaces the conventional U-Net downsampling with Patch Merging to improve detail preservation, and replaces upsampling with a Dual up-sample module designed to alleviate checkerboard artifacts. Moreover, as a lightweight and versatile Transformer-based solution for IR, DIRformer incorporates time and degradation embeddings into the Transformer design while preserving the fundamental U-shaped structure. We evaluate DIRformer in a multi-task IR setting across four datasets. Experimental results show that DIRformer achieves competitive performance on distortion metrics, including PSNR and SSIM. Remarkably, our approach is almost 25× smaller and 2× faster than existing methods while delivering comparably high performance.
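The abstract does not give implementations for the two sampling operators it names. Below is a minimal PyTorch sketch of plausible versions, assuming the Swin-Transformer-style 2×2 Patch Merging and a dual-branch (PixelShuffle plus bilinear) up-sample; the module names, channel widths, and fusion choices are illustrative assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchMerging(nn.Module):
    """Swin-style downsampling (assumed): fold each 2x2 spatial neighborhood
    into the channel dimension, then linearly project 4C -> 2C."""
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(4 * dim)
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)

    def forward(self, x):                      # x: (B, C, H, W), H and W even
        x = x.permute(0, 2, 3, 1)              # (B, H, W, C)
        x = torch.cat([x[:, 0::2, 0::2], x[:, 1::2, 0::2],
                       x[:, 0::2, 1::2], x[:, 1::2, 1::2]], dim=-1)
        x = self.reduction(self.norm(x))       # (B, H/2, W/2, 2C)
        return x.permute(0, 3, 1, 2)           # (B, 2C, H/2, W/2)

class DualUpsample(nn.Module):
    """Hypothetical dual up-sample: two parallel 2x branches (sub-pixel
    PixelShuffle and bilinear interpolation), concatenated and fused with a
    1x1 conv. Blending two estimates dampens the checkerboard pattern a
    single learned upsampler can produce."""
    def __init__(self, dim):
        super().__init__()
        self.subpixel = nn.Sequential(
            nn.Conv2d(dim, 2 * dim, 3, padding=1),
            nn.PixelShuffle(2))                            # -> dim/2 channels
        self.smooth = nn.Conv2d(dim, dim // 2, 3, padding=1)
        self.fuse = nn.Conv2d(dim, dim // 2, 1)

    def forward(self, x):                      # (B, C, H, W) -> (B, C/2, 2H, 2W)
        a = self.subpixel(x)
        b = self.smooth(F.interpolate(x, scale_factor=2.0, mode="bilinear",
                                      align_corners=False))
        return self.fuse(torch.cat([a, b], dim=1))

# Shape check: one encoder step down, one decoder step up.
x = torch.randn(1, 64, 32, 32)
y = PatchMerging(64)(x)         # (1, 128, 16, 16)
z = DualUpsample(128)(y)        # (1, 64, 32, 32)
```

Under these assumptions, the up-sample halves channels while doubling resolution, mirroring the channel doubling of Patch Merging so that encoder and decoder stages stay symmetric in the U-shaped layout.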