Contrastive learning has achieved remarkable success on various high-level tasks, but few contrastive learning-based methods have been proposed for low-level tasks. It is challenging to directly adopt vanilla contrastive learning technologies designed for high-level visual tasks to low-level image restoration problems, because the acquired high-level global visual representations are insufficient for low-level tasks that require rich texture and context information. In this article, we investigate contrastive learning-based single-image super-resolution (SISR) from two perspectives: positive and negative sample construction and feature embedding. Existing methods take naive sample construction approaches (e.g., treating the low-quality input as a negative sample and the ground truth as a positive sample) and adopt a prior model (e.g., a pretrained very deep convolutional network proposed by the Visual Geometry Group (VGG)) to obtain the feature embedding. To this end, we propose a practical contrastive learning framework for SISR (PCL-SR). We generate many informative positive samples and hard negative samples in frequency space. Instead of utilizing an additional pretrained network, we design a simple but effective embedding network inherited from the discriminator network, which is more task-friendly. We retrain existing benchmark methods with our proposed PCL-SR framework and achieve superior performance compared with their original results. Extensive experiments and thorough ablation studies demonstrate the effectiveness and technical contributions of the proposed PCL-SR. The code and resulting models will be released via https://github.com/Aitical/PCL-SISR.
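To make the two ingredients named above concrete, the following is a minimal PyTorch sketch, not the paper's implementation: it assumes (hypothetically) that hard negatives are built by low-pass filtering the ground truth in the Fourier domain and that the discriminator-inherited embedding network produces feature vectors consumed by an InfoNCE-style loss; all function names, the `cutoff` parameter, and the masking scheme are illustrative assumptions rather than details taken from the abstract.

```python
# Illustrative sketch only: frequency-space hard negatives and a contrastive
# loss over embedded features. Names and the low-pass construction are
# assumptions for exposition, not the authors' exact method.
import torch
import torch.nn.functional as F


def frequency_negatives(hr, num_neg=4, cutoff=0.1):
    """Build hard negatives by low-pass filtering the HR image in the
    Fourier domain (hypothetical construction for illustration)."""
    _, _, h, w = hr.shape
    spec = torch.fft.fftshift(torch.fft.fft2(hr), dim=(-2, -1))
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    radius = torch.sqrt(xx ** 2 + yy ** 2).to(hr.device)
    negatives = []
    for k in range(1, num_neg + 1):
        # larger masks keep more high-frequency detail, so the negative
        # moves closer to the ground truth and becomes "harder"
        mask = (radius <= cutoff * k).float()
        neg_spec = spec * mask
        neg = torch.fft.ifft2(torch.fft.ifftshift(neg_spec, dim=(-2, -1))).real
        negatives.append(neg)
    return negatives


def info_nce(anchor_feat, pos_feat, neg_feats, tau=0.07):
    """InfoNCE-style contrastive loss over features from an embedding
    network (e.g., one sharing layers with the discriminator)."""
    a = F.normalize(anchor_feat.flatten(1), dim=1)
    p = F.normalize(pos_feat.flatten(1), dim=1)
    ns = [F.normalize(n.flatten(1), dim=1) for n in neg_feats]
    pos_logit = (a * p).sum(1, keepdim=True) / tau
    neg_logits = torch.stack([(a * n).sum(1) for n in ns], dim=1) / tau
    logits = torch.cat([pos_logit, neg_logits], dim=1)
    labels = torch.zeros(a.size(0), dtype=torch.long, device=a.device)
    return F.cross_entropy(logits, labels)
```

In use, the super-resolved output would serve as the anchor, the ground truth (or augmented variants of it) as the positive, and the frequency-filtered images as negatives, all passed through the shared embedding network before computing the loss; this is one plausible wiring consistent with the abstract, not a specification of the released code.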