Single Image Super-Resolution (SISR) aims to reconstruct a high-resolution image from its corresponding low-resolution input. A common technique to enhance the reconstruction quality is Non-Local Attention (NLA), which leverages self-similar texture patterns in images. However, we have made a novel finding that challenges the prevailing wisdom. Our research reveals that NLA can be detrimental to SISR and even produce severely distorted textures. For example, when dealing with severely degrade textures, NLA may generate unrealistic results due to the inconsistency of non-local texture patterns. This problem is overlooked by existing works, which only measure the average reconstruction quality of the whole image, without considering the potential risks of using NLA. To address this issue, we propose a new perspective for evaluating the reconstruction quality of NLA, by focusing on the sub-pixel level that matches the pixel-wise fusion manner of NLA. From this perspective, we provide the approximate reconstruction performance upper bound of NLA, which guides us to design a concise yet effective Texture-Fidelity Strategy (TFS) to mitigate the degradation caused by NLA. Moreover, the proposed TFS can be conveniently integrated into existing NLA-based SISR models as a general building block. Based on the TFS, we develop a Deep Texture-Fidelity Network (DTFN), which achieves state-of-the-art performance for SISR. Our code and a pre-trained DTFN are available on GitHub† for verification.
Read full abstract