Abstract

Deep learning has been widely applied to image super-resolution (SR) tasks and has achieved superior performance over traditional methods due to its excellent feature learning capabilities. However, most of these deep learning-based methods require training image sets to pre-train the SR network parameters. In this paper, we propose a new single image SR network that does not need any pre-training. The proposed network is optimized to achieve the SR reconstruction only from a low-resolution observation rather than from training image sets, and it focuses on improving the visual quality of reconstructed images. Specifically, we design an attention-based decoder-encoder network for predicting the SR reconstruction, in which a residual spatial attention (RSA) unit is deployed in each layer of the decoder to capture key information. Moreover, we adopt a perceptual metric consisting of the L1 metric and the multi-scale structural similarity (MSSSIM) metric to learn the network parameters. Unlike the conventional mean squared error (MSE) metric, the perceptual metric coincides well with the perceptual characteristics of the human visual system. Under the guidance of the perceptual metric, the RSA units are capable of predicting the visually sensitive areas at different scales. The proposed network can thus pay more attention to these areas to preserve visually informative structures at multiple scales. The experimental results on the Set5 and Set14 image sets demonstrate that the combination of the perceptual metric and RSA units can significantly improve the reconstruction quality. In terms of PSNR and structural similarity (SSIM) values, the proposed method achieves better reconstruction results than the related works, and it is even comparable to some pre-trained networks.
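To make the idea of a perceptual metric that mixes an L1 term with a multi-scale structural similarity term more concrete, the following is a minimal sketch in plain Python. It is not the authors' loss: the names `global_ssim`, `downsample2`, and `perceptual_loss`, the weight `alpha`, the number of scales, and the use of whole-image statistics with 2x2 average pooling (instead of the windowed, exponent-weighted MSSSIM formulation) are all assumptions made for illustration.

```python
def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM computed from whole-image statistics (assumption: no sliding
    Gaussian window). Images are lists of rows with values in [0, 1]."""
    fx = [v for row in x for v in row]
    fy = [v for row in y for v in row]
    n = len(fx)
    mx = sum(fx) / n
    my = sum(fy) / n
    vx = sum((a - mx) ** 2 for a in fx) / n
    vy = sum((b - my) ** 2 for b in fy) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(fx, fy)) / n
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx * mx + my * my + c1) * (vx + vy + c2)
    return num / den


def downsample2(img):
    """2x2 average pooling; a trailing odd row or column is dropped."""
    h = len(img) // 2
    w = len(img[0]) // 2
    return [
        [
            (img[2 * i][2 * j] + img[2 * i][2 * j + 1]
             + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def l1_distance(x, y):
    """Mean absolute difference between two images of equal size."""
    total = sum(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))
    return total / (len(x) * len(x[0]))


def perceptual_loss(pred, target, scales=3, alpha=0.84):
    """Hypothetical mix: alpha * (1 - mean SSIM over scales) + (1 - alpha) * L1.
    Both `alpha` and `scales` are placeholder values, not taken from the paper.
    The images must be large enough to be halved `scales - 1` times."""
    p, t = pred, target
    ssim_values = []
    for s in range(scales):
        if s > 0:
            # Move to the next coarser scale before measuring similarity.
            p, t = downsample2(p), downsample2(t)
        ssim_values.append(global_ssim(p, t))
    mean_ssim = sum(ssim_values) / len(ssim_values)
    return alpha * (1.0 - mean_ssim) + (1.0 - alpha) * l1_distance(pred, target)


if __name__ == "__main__":
    ramp = [[(i * 8 + j) / 63.0 for j in range(8)] for i in range(8)]
    flat = [[0.5 for _ in range(8)] for _ in range(8)]
    print(perceptual_loss(ramp, ramp))  # identical images give a loss near zero
    print(perceptual_loss(ramp, flat))  # lost structure is penalized
```

In this sketch the structural term penalizes lost contrast and structure at several resolutions, while the L1 term keeps absolute intensities close to the target; an MSE loss, by contrast, only measures per-pixel squared error.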

Highlights

  • Single image super-resolution (SISR) aims to generate a high-resolution (HR) image from a single low-resolution (LR) image and has been used in a variety of vision-related tasks, such as remote sensing and imaging [1], medical imaging [2], and image enhancement

  • In order to cope with these problems, we propose a perceptual metric guided deep attention network for predicting SR reconstruction

  • Some ablation studies are conducted to verify whether the attention-based network and perceptual metric are beneficial for SR reconstruction

Summary

Introduction

Single image super-resolution (SISR) aims to generate a high-resolution (HR) image from a single low-resolution (LR) image and has been used in a variety of vision-related tasks, such as remote sensing and imaging [1], medical imaging [2], and image enhancement. A variety of SISR methods have been proposed, including prediction-based methods [3], edge-based methods [4], statistical methods [5], patch-based methods [6], and sparse representation methods [7]. These methods rely primarily on pre-defined prior models to represent the underlying HR image and are therefore regarded as model-driven reconstruction methods. With the rapid development of deep learning technology, deep networks, especially convolutional neural networks (CNNs), have been widely used for image generation [8] and super-resolution (SR) reconstruction [9], owing to their superior performance over model-driven methods.

Related Work
Perceptual Metric Guided Deep Attention Network
Network Architecture
Loss Function
Experimental Results and Analysis
Parameters Analysis
Ablation Studies
Performance Comparison
Conclusions