Lightweight Recurrent Neural Network for Image Super-Resolution

  • Abstract
  • Literature Map
  • Similar Papers
Abstract

In recent years, significant progress has been made in image super-resolution through the use of large-scale models. However, the efficacy of these models comes at the cost of their substantial size, posing challenges when deploying them on resource-constrained devices. Despite their remarkable performance, the feasibility of employing such models on low-end devices has remained a contentious topic. In light of this, our research introduces a lightweight approach to image super-resolution, leveraging a simple recurrent neural network architecture built around a recurrent convolution block. Our proposed model uses fewer than 75k parameters, 10 times fewer than the state-of-the-art transformer-based super-resolution model. Despite its small size, the proposed model performs well in image super-resolution tasks both visually and quantitatively. Our work presents a promising direction for deploying efficient super-resolution models on resource-limited devices.
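
The abstract attributes the small parameter budget to weight sharing through recurrence. As an illustrative sketch (function names, channel width, and step count are assumptions, not the paper's actual configuration), the arithmetic below compares a recurrent block, where one shared convolution is applied repeatedly, against an unrolled feed-forward stack:

```python
def conv_params(in_ch, out_ch, k):
    """Weights + biases of a single k x k convolution layer."""
    return out_ch * in_ch * k * k + out_ch

def recurrent_block_params(channels, k=3):
    # A recurrent convolution block reuses ONE shared kernel at every
    # recurrence step, so its parameter count does not grow with depth.
    return conv_params(channels, channels, k)

def unrolled_stack_params(channels, steps, k=3):
    # A plain feed-forward stack pays for a distinct kernel per layer.
    return steps * conv_params(channels, channels, k)

# With a hypothetical 48-channel block recurred 8 times:
shared = recurrent_block_params(48)      # 20,784 parameters
stacked = unrolled_stack_params(48, 8)   # 166,272 parameters
```

The effective depth grows with the number of recurrence steps while the parameter count stays fixed, which is how such a model can stay under a 75k budget.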

Similar Papers
  • Conference Article
  • Cited by 29
  • 10.1109/wacv.2018.00160
CT-SRCNN: Cascade Trained and Trimmed Deep Convolutional Neural Networks for Image Super Resolution
  • Mar 1, 2018
  • Haoyu Ren + 2 more

We propose methodologies to train highly accurate and efficient deep convolutional neural networks (CNNs) for image super resolution (SR). A cascade training approach to deep learning is proposed to improve the accuracy of the neural networks while gradually increasing the number of network layers. Next, we explore how to improve SR efficiency by making the network slimmer. Two methodologies, one-shot trimming and cascade trimming, are proposed. With cascade trimming, the network's size is gradually reduced layer by layer, without significant loss of its discriminative ability. Experiments on benchmark image datasets show that our proposed SR network achieves state-of-the-art super resolution accuracy while being more than 4 times faster than existing deep super resolution networks.

  • Research Article
  • Cited by 1
  • 10.21037/qims-2024-2962
Deep learning-based super-resolution method for projection image compression in radiotherapy
  • Aug 13, 2025
  • Quantitative Imaging in Medicine and Surgery
  • Zhixing Chang + 7 more

Background: Cone-beam computed tomography (CBCT) is a three-dimensional (3D) imaging method designed for routine target verification of cancer patients during radiotherapy. The images are reconstructed from a sequence of projection images obtained by the on-board imager attached to a radiotherapy machine. CBCT images are usually stored in a health information system, but the projection images are mostly abandoned due to their massive volume. To store them economically, this study investigated a deep learning (DL)-based super-resolution (SR) method for compressing the projection images.

Methods: In image compression, low-resolution (LR) images were down-sampled by a factor from the high-resolution (HR) projection images and then encoded to the video file. In image restoration, LR images were decoded from the video file and then up-sampled to HR projection images via the DL network. Three SR DL networks, convolutional neural network (CNN), residual network (ResNet), and generative adversarial network (GAN), were tested along with three video coding-decoding (CODEC) algorithms: Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), and AOMedia Video 1 (AV1). Based on two databases of natural and projection images, the performance of the SR networks and video codecs was evaluated with the compression ratio (CR), peak signal-to-noise ratio (PSNR), video quality metric (VQM), and structural similarity index measure (SSIM).

Results: The codec AV1 achieved the highest CR among the three codecs. The CRs of AV1 were 13.91, 42.08, 144.32, and 289.80 for down-sampling factors (DSFs) 0 (non-SR), 2, 4, and 6, respectively. The SR network ResNet achieved the best restoration accuracy among the three SR networks. Its PSNRs were 69.08, 41.60, 37.08, and 32.44 dB for the four DSFs, respectively; its VQMs were 0.06%, 3.65%, 6.95%, and 13.03%; and its SSIMs were 0.9984, 0.9878, 0.9798, and 0.9518. As the DSF increased, the CR increased proportionally with modest degradation of the restored images.

Conclusions: The application of the SR model can further improve the CR beyond the current result achieved by the video encoders. This compression method is not only effective for two-dimensional (2D) projection images but also applicable to the 3D images used in radiotherapy.
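
PSNR, the headline restoration metric in the results above, is straightforward to compute. A minimal illustrative sketch (assuming 8-bit images with a peak value of 255; not code from the paper):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    mse = np.mean((np.asarray(ref, dtype=np.float64)
                   - np.asarray(test, dtype=np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak * peak / mse)
```

For intuition: an error of exactly one grey level at every pixel gives MSE = 1, i.e. about 48.13 dB at peak 255, which puts the 69.08 dB figure above in context as a near-lossless restoration.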

  • Research Article
  • Cited by 22
  • 10.1109/tbc.2021.3126275
SRNMSM: A Deep Light-Weight Image Super Resolution Network Using Multi-Scale Spatial and Morphological Feature Generating Residual Blocks
  • Mar 1, 2022
  • IEEE Transactions on Broadcasting
  • Alireza Esmaeilzehi + 2 more

Generating features that represent the textures and structures of an image is a very important capability of a super resolution network. Morphological operations are nonlinear mathematical operations that process signals with a focus on their structures and textures. In this paper, we propose a novel residual block to generate and process morphological features and fuse them with conventional spatial features, in order to produce a very rich and highly representational set of residual feature maps. The proposed residual block is then used in a deep convolutional neural network for the task of image super resolution. It is shown that the capability of the proposed block to generate and use morphological features can significantly improve the super resolution performance of a deep network. The super resolution network employing the proposed residual block is shown to outperform state-of-the-art low-complexity image super resolution networks on various benchmark datasets.

  • Conference Article
  • Cited by 6
  • 10.1109/ccdc.2019.8833181
Tunnel Pedestrian Detection Based on Super Resolution and Convolutional Neural Network
  • Jun 1, 2019
  • Zhao Min + 2 more

Tunnel pedestrian detection is of great significance to traffic safety. However, because the camera used to collect road images in the tunnel video surveillance system is often far from the ground, the pedestrian target is small. What's more, in the special monitoring environment of the tunnel, the video image is blurred, noisy, and low in resolution, which makes pedestrian target detection difficult. To address these problems, this paper proposes a new target detection network that cascades super-resolution and target detection networks. First, because convolutional neural networks struggle to extract features from low-resolution images, super-resolution is performed before target detection, since the super-resolution network can increase image resolution and enrich image information. Then, based on the characteristics of the pedestrian detection task and the relatively fixed size and aspect ratio of pedestrian targets in tunnel video images, the original RPN is improved and a candidate box better suited to pedestrian detection is designed. The experimental results show that the proposed method achieves better detection results on the tunnel pedestrian detection problem.

  • Research Article
  • Cited by 26
  • 10.1609/aaai.v36i10.21372
SFSRNet: Super-resolution for Single-Channel Audio Source Separation
  • Jun 28, 2022
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Joel Rixen + 1 more

The problem of single-channel audio source separation is to recover (separate) multiple audio sources that are mixed in a single-channel audio signal (e.g. people talking over each other). Some of the best performing single-channel source separation methods utilize downsampling to either make the separation process faster or make the neural networks bigger and increase accuracy. The problem concerning downsampling is that it usually results in information loss. In this paper, we tackle this problem by introducing SFSRNet which contains a super-resolution (SR) network. The SR network is trained to reconstruct the missing information in the upper frequencies of the audio signal by operating on the spectrograms of the output audio source estimations and the input audio mixture. Any separation method where the length of the sequence is a bottleneck in speed and memory can be made faster or more accurate by using the SR network. Based on the WSJ0-2mix benchmark where estimations of the audio signal of two speakers need to be extracted from the mixture, in our experiments our proposed SFSRNet reaches a scale-invariant signal-to-noise-ratio improvement (SI-SNRi) of 24.0 dB outperforming the state-of-the-art solution SepFormer which reaches an SI-SNRi of 22.3 dB.
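
The SI-SNR metric the abstract reports can be sketched as follows. This is an illustrative implementation of the standard scale-invariant SNR definition, not code from the paper:

```python
import numpy as np

def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB (higher is better)."""
    est = np.asarray(estimate, dtype=np.float64)
    ref = np.asarray(reference, dtype=np.float64)
    est = est - est.mean()   # zero-mean both signals
    ref = ref - ref.mean()
    # Project the estimate onto the reference: the 'target' component.
    s_target = (np.dot(est, ref) / (np.dot(ref, ref) + eps)) * ref
    e_noise = est - s_target  # everything else counts as error
    return 10.0 * np.log10((np.dot(s_target, s_target) + eps)
                           / (np.dot(e_noise, e_noise) + eps))
```

Because the estimate is projected onto the reference, rescaling it by any positive constant leaves the score essentially unchanged, which is what makes the metric scale-invariant.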

  • Research Article
  • Cited by 7
  • 10.1109/access.2020.2971612
Part-Based Enhanced Super Resolution Network for Low-Resolution Person Re-Identification
  • Jan 1, 2020
  • IEEE Access
  • Yan Ha + 5 more

Person re-identification (REID) is an important task in video surveillance and forensics applications. Many previous works build models on the assumption that images share the same resolution across different camera views, which is divorced from reality. To increase the adaptability of person REID models, this paper focuses on the low-resolution person REID task, relaxing the impractical assumption that traditional low-resolution person REID models are under pixel-to-pixel supervision with paired low- and high-resolution pedestrian images. In addition, such models are easily influenced by the global background, illumination, or pose variations across camera views. Therefore, we propose a Part-based Enhanced Super Resolution (PESR) network that employs a part division strategy and an enhanced generative adversarial network to boost unpaired pedestrian image super resolution. Specifically, the part-based super resolution network transforms the low-resolution probe image into a high-resolution one without any pixel-to-pixel supervision, and the part-based synthetic feature extractor module learns discriminative pedestrian feature representations for the generated high-resolution images, employing a part feature connection loss as a constraint for re-identification matching. Furthermore, evaluations on four public person REID datasets demonstrate the advantages of our method over state-of-the-art ones.

  • Conference Article
  • Cited by 16
  • 10.1109/icip.2017.8296422
Image super-resolution via deep dilated convolutional networks
  • Sep 1, 2017
  • Zehao Huang + 3 more

Deep learning techniques have been successfully applied to single image super-resolution (SR). Recently, research has shown that increasing network depth can significantly improve SR performance, and very deep networks for SR have achieved large improvements over earlier methods. However, simply increasing depth introduces more parameters and leads to cumbersome computational cost. In this paper, we present a general and effective method to accelerate very deep networks for single image SR. Our method is based on the dilated convolution operation, which supports exponential expansion of the receptive field without increasing filter size. With the help of dilated convolution, shallow networks can achieve a large receptive field and exploit contextual information efficiently. Based on a very deep network, we propose a 12-layer dilated convolutional network for SR (DCNSR). While achieving a 2x speedup, our shallow network achieves better performance than the original deep networks and shows state-of-the-art reconstruction results.
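
The receptive-field arithmetic behind the dilation claim is easy to verify. An illustrative helper (not from the paper), for stride-1 stacks:

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (in pixels, along one dimension) of a stack of
    stride-1 convolutions with the given kernel sizes and dilations."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d  # each layer widens the field by (k-1)*dilation
    return rf

# Three 3x3 layers: plain vs. exponentially dilated.
plain = receptive_field([3, 3, 3], [1, 1, 1])    # 7 pixels
dilated = receptive_field([3, 3, 3], [1, 2, 4])  # 15 pixels
```

Doubling the dilation at each layer makes the receptive field grow exponentially in depth while the parameter count stays that of plain 3x3 convolutions.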

  • Book Chapter
  • Cited by 2
  • 10.1007/978-981-16-5940-9_25
Adaptive Densely Residual Network for Image Super-Resolution
  • Jan 1, 2021
  • Wen Zhao

Many networks are designed to stack a large number of residual blocks, deepening the network and improving performance through short residual connections, long residual connections, and dense connections. However, by not considering the different contributions of different depth features to the network, these designs fail to evaluate the importance of different depth features. To solve this problem, this paper proposes an adaptive densely residual network (ADRNet) for single image super resolution. ADRNet evaluates the distributions of different depth features and learns more representative features. An adaptive densely residual block (ADRB) was designed, combining 3 residual blocks (RBs) with added dense connections. It learns the attention score of each dense connection through adaptive dense connections, and the attention score reflects the importance of the features of each RB. To further enhance the performance of ADRB, a multi-direction attention block (MDAB) is introduced to obtain multi-directional context information. Comparative experiments prove that the proposed ADRNet is superior to existing methods, and ablation experiments prove that evaluating features of different depths helps to improve network performance.

  • Conference Article
  • Cited by 44
  • 10.1145/3503161.3547915
RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization
  • Oct 10, 2022
  • Xintao Wang + 2 more

This paper explores training efficient VGG-style super-resolution (SR) networks with the structural re-parameterization technique. The general pipeline of re-parameterization is to train networks with a multi-branch topology first, and then merge the branches into standard 3x3 convolutions for efficient inference. In this work, we revisit those primary designs and investigate the essential components for re-parameterizing SR networks. First of all, we find that batch normalization (BN) is important for bringing training non-linearity and improving final performance. However, BN is typically avoided in SR, as it usually degrades performance and introduces unpleasant artifacts. We carefully analyze the cause of the BN issue and then propose a straightforward yet effective solution: we first train SR networks with mini-batch statistics as usual, and then switch to using population statistics in the later training period. Having successfully re-introduced BN into SR, we further design a new re-parameterizable block tailored for SR, namely RepSR. It consists of a clean residual path and two expand-and-squeeze convolution paths with the modified BN. Extensive experiments demonstrate that our simple RepSR achieves superior performance to previous SR re-parameterization methods across different model sizes. In addition, RepSR achieves a better trade-off between performance and actual running time (throughput) than previous SR methods. Codes are available at https://github.com/TencentARC/RepSR.
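
The merge step in re-parameterization relies on the linearity of convolution: parallel linear branches can be summed kernel-wise into one 3x3 kernel. A single-channel numpy sketch of this idea (illustrative only; RepSR's actual block also folds BN statistics into the kernels, which is omitted here):

```python
import numpy as np

def conv_same(x, k):
    """Naive single-channel 'same' cross-correlation with zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=np.float64)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def embed_1x1(k1):
    """Place a scalar 1x1 kernel at the centre of a zeroed 3x3 kernel."""
    k = np.zeros((3, 3))
    k[1, 1] = k1
    return k

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
k3 = rng.standard_normal((3, 3))
k1 = rng.standard_normal()
identity = embed_1x1(1.0)  # skip connection as a 3x3 kernel

# Training-time multi-branch output: 3x3 branch + 1x1 branch + skip.
branches = conv_same(x, k3) + k1 * x + x
# Inference-time equivalent: one merged 3x3 kernel.
merged = conv_same(x, k3 + embed_1x1(k1) + identity)
assert np.allclose(branches, merged)
```

Because all three branches are linear in the input, the merged single-kernel network is exactly equivalent to the multi-branch one at inference time, which is what makes the trained model cheap to deploy.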

  • Book Chapter
  • Cited by 34
  • 10.1007/978-3-030-31726-3_23
ADSRNet: Attention-Based Densely Connected Network for Image Super-Resolution
  • Jan 1, 2019
  • Weiqi Li + 4 more

Densely connected networks for image super-resolution (SR) have achieved much better results than most other methods owing to their dense connection architecture, which provides more and deeper features for image super-resolution. However, since each dense block accepts the outputs of all previous blocks, it receives a lot of redundant and conflicting information, which results in longer training time and poor super-resolution reconstruction. To solve this problem, we introduce an attention module into a densely connected network and propose an attention-based densely connected network (ADSRNet) for image super-resolution. With the attention module, ADSRNet can select the more important information from a large number of feature maps by importance ordering and discard the redundant rest, which speeds up network training. Extensive experiments are performed on the datasets Set5, Set14 and BSD100; both the qualitative and quantitative results of our proposed ADSRNet are better than those of several state-of-the-art methods.

  • Research Article
  • 10.3390/s23010419
The Best of Both Worlds: A Framework for Combining Degradation Prediction with High Performance Super-Resolution Networks.
  • Dec 30, 2022
  • Sensors (Basel, Switzerland)
  • Matthew Aquilina + 5 more

To date, the best-performing blind super-resolution (SR) techniques follow one of two paradigms: (A) train standard SR networks on synthetic low-resolution-high-resolution (LR-HR) pairs or (B) predict the degradations of an LR image and then use these to inform a customised SR network. Despite significant progress, subscribers to the former miss out on useful degradation information and followers of the latter rely on weaker SR networks, which are significantly outperformed by the latest architectural advancements. In this work, we present a framework for combining any blind SR prediction mechanism with any deep SR network. We show that a single lightweight metadata insertion block together with a degradation prediction mechanism can allow non-blind SR architectures to rival or outperform state-of-the-art dedicated blind SR networks. We implement various contrastive and iterative degradation prediction schemes and show they are readily compatible with high-performance SR networks such as RCAN and HAN within our framework. Furthermore, we demonstrate our framework's robustness by successfully performing blind SR on images degraded with blurring, noise and compression. This represents the first explicit combined blind prediction and SR of images degraded with such a complex pipeline, acting as a baseline for further advancements.

  • Research Article
  • Cited by 33
  • 10.1016/j.neucom.2017.08.041
A two-channel convolutional neural network for image super-resolution
  • Sep 13, 2017
  • Neurocomputing
  • Sumei Li + 4 more


  • Conference Article
  • Cited by 1
  • 10.1109/apsar46974.2019.9048251
Automatic Target Recognition for Low-Resolution SAR Images Based on Super-Resolution Network
  • Nov 1, 2019
  • Shuang Yang + 2 more

Synthetic aperture radar (SAR) automatic target recognition (ATR) is one of the hottest issues in current research because of its wide application value. However, low-resolution SAR images degrade target recognition accuracy due to their obscure characteristics, and it is difficult to acquire a large number of high-resolution SAR images from which clear characteristics can be extracted. To solve these problems, this paper proposes an ATR method for low-resolution SAR images based on a super-resolution network. A super-resolution generative adversarial network (SRGAN) and a deep convolutional neural network (DCNN) are utilized for image enhancement and classification, respectively. The segmented low-resolution SAR images are enhanced through SRGAN to improve the visual resolution and the feature characterization ability of targets in SAR images; the enhanced SAR images are then classified automatically by the DCNN. Finally, the effectiveness and efficiency of the method are verified on the open Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset.

  • Book Chapter
  • Cited by 1
  • 10.1007/978-981-19-2266-4_30
Efficient Wavelet Channel Attention Module with a Fusion Network for Image Super-Resolution
  • Jan 1, 2022
  • Xiyu Han + 4 more

In recent years, deep convolutional neural networks (CNNs) have been widely used in image super-resolution (SR) and have made great progress. Nevertheless, existing CNN-based SR methods cannot fully exploit background information in the feature extraction step. Moreover, later networks pursue ever greater depth and weight, losing sight of the efficiency desired in practical SR. To solve this problem, we propose a learning wavelets and channel attention network (LWCAN) for image SR. The network mainly comprises three branches. The first part extracts low-level features from the input image through two convolution layers and an Efficient Channel Attention (ECA) block. The second part calculates the second-level low-frequency wavelet coefficients. The third part forecasts the residual frequency bands of the wavelet coefficients. Finally, the inverse wavelet transform reconstructs the SR image from these coefficients. Experiments on commonly used datasets prove the effectiveness of our proposed LWCAN in terms of visual effects and quantitative metrics.

Keywords: CNNs; Image super-resolution; Efficient channel attention; Wavelet transform
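
As an illustrative sketch of the kind of wavelet decomposition such a network builds on (a one-level 2D Haar transform with exact reconstruction; assumed for illustration, not the paper's code):

```python
import numpy as np

def haar2d(x):
    """One-level 2D Haar transform: returns (LL, LH, HL, HH) sub-bands."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0  # low-low: coarse approximation
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0  # horizontal detail
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0  # vertical detail
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0  # diagonal detail
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Exact inverse of haar2d."""
    a = np.empty((LL.shape[0], LL.shape[1] * 2))
    a[:, 0::2], a[:, 1::2] = LL + LH, LL - LH
    d = np.empty_like(a)
    d[:, 0::2], d[:, 1::2] = HL + HH, HL - HH
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x
```

A network that predicts the detail bands and then applies the inverse transform, as described above, only needs to synthesize the missing high-frequency content: the LL band already carries the coarse image.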

  • Conference Article
  • Cited by 3
  • 10.1109/icassp39728.2021.9414527
Lightweight Non-Local Network for Image Super-Resolution
  • Jun 6, 2021
  • Risheng Wang + 5 more

Popular deep convolutional networks for image super-resolution (SR) reconstruction often increase network depth and employ attention mechanisms to improve reconstruction quality. However, these networks suffer from two problems. First, deeper networks incur higher computational cost and more GPU memory usage. Second, traditional attention mechanisms often miss the spatial information of images, leading to a loss of image detail. To address these issues, we propose a lightweight non-local network (LNLN) for image super resolution. The proposed network makes two contributions. First, we use a non-local module instead of a normal attention module to obtain a larger receptive field and extract more comprehensive feature information, which helps improve image SR reconstruction. Second, we use depthwise separable convolution (DSC) instead of vanilla convolution to build the residual block, which greatly reduces the number of parameters and the computational cost. The proposed LNLN and comparative networks are evaluated on five commonly used public datasets, and experiments demonstrate that LNLN is superior to state-of-the-art networks in terms of reconstruction performance, number of parameters, and storage space.
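
The parameter saving from swapping a vanilla convolution for a depthwise separable one is simple to quantify. An illustrative count (bias terms ignored; function names and the 64-channel example are assumptions, not the paper's configuration):

```python
def vanilla_conv_params(c_in, c_out, k):
    """A standard convolution learns c_out full-depth k x k kernels."""
    return c_in * c_out * k * k

def dsc_params(c_in, c_out, k):
    """Depthwise separable: one k x k filter per input channel,
    followed by a 1x1 pointwise convolution to mix channels."""
    return c_in * k * k + c_in * c_out

# A typical 64-channel residual-block convolution:
v = vanilla_conv_params(64, 64, 3)  # 36,864 parameters
s = dsc_params(64, 64, 3)           #  4,672 parameters (~7.9x fewer)
```

The same factoring also reduces multiply-accumulate operations by roughly the same ratio, which is where the storage and compute savings the abstract claims come from.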
