A high-frequency information guiding attention network for super-lightweight image super-resolution


Similar Papers
  • Research Article
  • 10.21037/qims-2024-2962
Deep learning-based super-resolution method for projection image compression in radiotherapy
  • Aug 13, 2025
  • Quantitative Imaging in Medicine and Surgery
  • Zhixing Chang + 7 more

Background: Cone-beam computed tomography (CBCT) is a three-dimensional (3D) imaging method designed for routine target verification of cancer patients during radiotherapy. The images are reconstructed from a sequence of projection images obtained by the on-board imager attached to a radiotherapy machine. CBCT images are usually stored in a health information system, but the projection images are mostly discarded due to their massive volume. To store them economically, this study investigated a deep learning (DL)-based super-resolution (SR) method for compressing the projection images.
Methods: For compression, low-resolution (LR) images were down-sampled by a factor from the high-resolution (HR) projection images and then encoded to a video file. For restoration, LR images were decoded from the video file and then up-sampled to HR projection images via the DL network. Three SR DL networks, a convolutional neural network (CNN), a residual network (ResNet), and a generative adversarial network (GAN), were tested along with three video coding-decoding (CODEC) algorithms: Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), and AOMedia Video 1 (AV1). Based on two databases of natural and projection images, the performance of the SR networks and video codecs was evaluated with the compression ratio (CR), peak signal-to-noise ratio (PSNR), video quality metric (VQM), and structural similarity index measure (SSIM).
Results: The codec AV1 achieved the highest CR among the three codecs: 13.91, 42.08, 144.32, and 289.80 for down-sampling factor (DSF) values of 0 (non-SR), 2, 4, and 6, respectively. Among the three SR networks, ResNet achieved the best restoration accuracy. Its PSNRs were 69.08, 41.60, 37.08, and 32.44 dB for the four DSFs, respectively; its VQMs were 0.06%, 3.65%, 6.95%, and 13.03%; and its SSIMs were 0.9984, 0.9878, 0.9798, and 0.9518. As the DSF increased, the CR increased proportionally with only modest degradation of the restored images.
Conclusions: Applying the SR model can further improve the CR beyond what the video encoders alone achieve. This compression method is not only effective for two-dimensional (2D) projection images but also applicable to the 3D images used in radiotherapy.
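
The down-sample/encode then decode/up-sample round trip described in the Methods, together with the PSNR metric used to score restoration, can be sketched as follows. This is a minimal illustration only: average pooling stands in for the paper's down-sampling and nearest-neighbour upsampling stands in for the trained DL network, both of which are assumptions for the sake of a runnable example.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool the image by `factor` (stand-in for the paper's down-sampling)."""
    h, w = img.shape
    return img[:h - h % factor, :w - w % factor].reshape(
        h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(img, factor):
    """Nearest-neighbour upsampling as a placeholder for the SR network."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB, as used to score the restored images."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

hr = np.random.default_rng(0).integers(0, 256, (64, 64)).astype(np.float64)
lr = downsample(hr, 2)        # this is what would be video-encoded
restored = upsample(lr, 2)    # this is what the SR network would reconstruct
print(round(psnr(hr, restored), 2))
```

A real pipeline would insert the video codec between `downsample` and `upsample`; the PSNR of the round trip then measures the combined codec-plus-SR loss.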

  • Conference Article
  • Citations: 5
  • 10.1109/ccdc.2019.8833181
Tunnel Pedestrian Detection Based on Super Resolution and Convolutional Neural Network
  • Jun 1, 2019
  • Zhao Min + 2 more

Tunnel pedestrian detection is of great significance to traffic safety. However, because the camera used to collect road images in a tunnel video surveillance system is often far from the ground, pedestrian targets appear small. Moreover, in the special monitoring environment of a tunnel, the video image is blurred, noisy, and of low resolution, which makes pedestrian target detection difficult. To address these problems, this paper proposes a new target detection network that cascades super-resolution and target detection networks. First, because convolutional neural networks struggle to extract features from low-resolution images, super-resolution is performed before target detection, since an SR network can increase image resolution and enrich image information. Then, exploiting the characteristics of the pedestrian detection task and the relatively fixed size and aspect ratio of pedestrians in tunnel video images, the original RPN network is improved and a candidate box better suited to pedestrian detection is designed. Experimental results show that the proposed method achieves better detection results on the tunnel pedestrian detection problem.

  • Research Article
  • Citations: 6
  • 10.1109/access.2020.2971612
Part-Based Enhanced Super Resolution Network for Low-Resolution Person Re-Identification
  • Jan 1, 2020
  • IEEE Access
  • Yan Ha + 5 more

Person re-identification (REID) is an important task in video surveillance and forensics applications. Many previous works build models on the assumption that images have the same resolution across different camera views, an assumption divorced from reality. To increase the adaptability of person REID models, this paper focuses on the low-resolution person REID task and relaxes the impractical requirement of traditional low-resolution REID models for pixel-to-pixel supervision between low- and high-resolution pedestrian image pairs. Such models are also easily influenced by the global background, illumination, or pose variations across camera views. We therefore propose a Part-based Enhanced Super Resolution (PESR) network that employs a part-division strategy and an enhanced generative adversarial network to boost the unpaired pedestrian image super-resolution process. Specifically, the part-based super-resolution network transforms a low-resolution probe image into a high-resolution one without any pixel-to-pixel supervision, and the part-based synthetic feature extractor module learns discriminative pedestrian feature representations for the generated high-resolution images, employing a part feature connection loss as a constraint to conduct matching for person re-identification. Evaluations on four public person REID datasets demonstrate the advantages of our method over state-of-the-art ones.

  • Conference Article
  • Citations: 10
  • 10.1109/icip.2017.8296422
Image super-resolution via deep dilated convolutional networks
  • Sep 1, 2017
  • Zehao Huang + 3 more

Deep learning techniques have been successfully applied to single image super-resolution (SR). Recent research has shown that increasing network depth can significantly improve SR performance, and very deep SR networks achieve a large improvement over earlier methods. However, simply increasing depth introduces more parameters and leads to a cumbersome computational cost. In this paper, we present a general and effective method to accelerate very deep networks for single image SR. Our method is based on the dilated convolution operation, which supports exponential expansion of the receptive field without increasing filter size. With the help of dilated convolution, shallow networks can achieve a large receptive field and exploit contextual information efficiently. Based on a very deep network, we propose a 12-layer dilated convolutional network for SR (DCNSR). While running roughly 2x faster, our shallower network achieves better performance than the original deep networks and shows state-of-the-art reconstruction results.
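
The receptive-field claim above is simple arithmetic: for stride-1 convolutions, each layer adds (kernel_size - 1) * dilation to the receptive field, so exponentially increasing dilations grow it exponentially with depth while plain convolutions grow it linearly. A small sketch (illustrative only, not the DCNSR architecture itself):

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in pixels) of a stack of stride-1 convolutions,
    one entry in `dilations` per layer."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Four plain 3x3 convs: receptive field grows linearly with depth.
plain = receptive_field(3, [1, 1, 1, 1])      # 9
# Same depth with dilations 1,2,4,8: far more context at no extra parameters.
dilated = receptive_field(3, [1, 2, 4, 8])    # 31
print(plain, dilated)
```

This is why a shallow dilated network can match the context coverage of a much deeper plain one, which is the acceleration argument the abstract makes.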

  • Research Article
  • Citations: 19
  • 10.1609/aaai.v36i10.21372
SFSRNet: Super-resolution for Single-Channel Audio Source Separation
  • Jun 28, 2022
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Joel Rixen + 1 more

The problem of single-channel audio source separation is to recover (separate) multiple audio sources that are mixed in a single-channel audio signal (e.g. people talking over each other). Some of the best performing single-channel source separation methods utilize downsampling to either make the separation process faster or make the neural networks bigger and increase accuracy. The problem concerning downsampling is that it usually results in information loss. In this paper, we tackle this problem by introducing SFSRNet which contains a super-resolution (SR) network. The SR network is trained to reconstruct the missing information in the upper frequencies of the audio signal by operating on the spectrograms of the output audio source estimations and the input audio mixture. Any separation method where the length of the sequence is a bottleneck in speed and memory can be made faster or more accurate by using the SR network. Based on the WSJ0-2mix benchmark where estimations of the audio signal of two speakers need to be extracted from the mixture, in our experiments our proposed SFSRNet reaches a scale-invariant signal-to-noise-ratio improvement (SI-SNRi) of 24.0 dB outperforming the state-of-the-art solution SepFormer which reaches an SI-SNRi of 22.3 dB.
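
The SI-SNR metric behind the reported SI-SNRi numbers has a standard definition: project the estimate onto the reference, then compare the energy of that projection with the energy of the residual. A minimal numpy sketch (the metric only, not SFSRNet itself):

```python
import numpy as np

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB."""
    est = est - est.mean()
    ref = ref - ref.mean()
    # Project the estimate onto the reference signal direction.
    s_target = np.dot(est, ref) / (np.dot(ref, ref) + eps) * ref
    e_noise = est - s_target
    return 10 * np.log10((np.dot(s_target, s_target) + eps) /
                         (np.dot(e_noise, e_noise) + eps))

rng = np.random.default_rng(1)
ref = rng.standard_normal(16000)
# Rescaling the estimate leaves the score unchanged: the "scale-invariant" part.
print(round(si_snr(2.0 * ref, ref), 1))
```

SI-SNRi is then simply the SI-SNR of the separated output minus the SI-SNR of the unprocessed mixture against the same reference.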

  • Conference Article
  • Citations: 217
  • 10.1109/cvpr46437.2021.00908
Interpreting Super-Resolution Networks with Local Attribution Maps
  • Jun 1, 2021
  • Jinjin Gu + 1 more

Image super-resolution (SR) techniques have been developing rapidly, benefiting from the invention of deep networks and their successive breakthroughs. However, it is acknowledged that deep learning and deep neural networks are difficult to interpret; SR networks inherit this mysterious nature, and few works attempt to understand them. In this paper, we perform attribution analysis of SR networks, which aims at finding the input pixels that strongly influence the SR results. We propose a novel attribution approach called local attribution map (LAM), which inherits the integrated gradients method yet with two unique features: one is to use the blurred image as the baseline input, and the other is to adopt a progressive blurring function as the path function. Based on LAM, we show that: (1) SR networks that involve a wider range of input pixels achieve better performance. (2) Attention networks and non-local networks extract features from a wider range of input pixels. (3) Compared with the range that actually contributes, the receptive field is large enough for most deep networks. (4) For SR networks, textures with regular stripes or grids are more likely to be noticed, while complex semantics are difficult to utilize. Our work opens new directions for designing SR networks and interpreting low-level vision deep models.
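
The integrated gradients machinery that LAM builds on accumulates gradients along a path from a baseline to the input and scales by the input-baseline difference; LAM's twist is choosing a blurred image as that baseline. A toy numeric sketch on a scalar function (numeric differentiation stands in for backprop; this is the generic method, not the authors' implementation):

```python
import numpy as np

def integrated_gradients(f, x, baseline, steps=100, h=1e-5):
    """Integrated gradients of scalar f at x, with a caller-chosen baseline."""
    alphas = np.linspace(0.0, 1.0, steps)
    grads = np.zeros_like(x, dtype=np.float64)
    for a in alphas:
        p = baseline + a * (x - baseline)
        # Central-difference gradient of f at the interpolated point.
        for i in range(x.size):
            e = np.zeros_like(p)
            e.flat[i] = h
            grads.flat[i] += (f(p + e) - f(p - e)) / (2 * h)
    grads /= steps
    return (x - baseline) * grads

# Toy "network": attributions should recover each input's contribution.
f = lambda v: 3.0 * v[0] + v[1] ** 2
x = np.array([1.0, 2.0])
baseline = np.zeros(2)
attr = integrated_gradients(f, x, baseline)
print(np.round(attr, 2))
```

The completeness property holds here: the attributions sum to f(x) - f(baseline), which is what makes the resulting maps interpretable as contributions.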

  • Conference Article
  • Citations: 38
  • 10.1145/3503161.3547915
RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization
  • Oct 10, 2022
  • Xintao Wang + 2 more

This paper explores training efficient VGG-style super-resolution (SR) networks with the structural re-parameterization technique. The general pipeline of re-parameterization is to train networks with a multi-branch topology first, and then merge the branches into standard 3x3 convolutions for efficient inference. In this work, we revisit those primary designs and investigate the essential components for re-parameterizing SR networks. First of all, we find that batch normalization (BN) is important to bring training non-linearity and improve the final performance. However, BN is typically avoided in SR, as it usually degrades performance and introduces unpleasant artifacts. We carefully analyze the cause of the BN issue and then propose a straightforward yet effective solution: first train SR networks with mini-batch statistics as usual, then switch to using population statistics in the later training period. Having successfully re-introduced BN into SR, we further design a new re-parameterizable block tailored for SR, namely RepSR. It consists of a clean residual path and two expand-and-squeeze convolution paths with the modified BN. Extensive experiments demonstrate that our simple RepSR achieves superior performance to previous SR re-parameterization methods across different model sizes. In addition, RepSR achieves a better trade-off between performance and actual running time (throughput) than previous SR methods. Codes are available at https://github.com/TencentARC/RepSR.
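
The core mechanical step of re-parameterization, folding a BN layer (with population statistics) into the preceding convolution so inference needs only a single conv, is standard and can be sketched directly. This shows the folding step only, not RepSR's full multi-branch block:

```python
import numpy as np

def fuse_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm (population statistics) into the preceding convolution.
    w: (out, in, kh, kw) conv weight; b: (out,) conv bias;
    gamma/beta/mean/var: per-output-channel BN parameters and statistics."""
    scale = gamma / np.sqrt(var + eps)
    w_fused = w * scale[:, None, None, None]
    b_fused = (b - mean) * scale + beta
    return w_fused, b_fused

# Check equivalence on a 1x1-spatial example: conv + BN == fused conv.
rng = np.random.default_rng(0)
out_c, in_c = 4, 3
w = rng.standard_normal((out_c, in_c, 1, 1))
b = rng.standard_normal(out_c)
gamma, beta = rng.standard_normal(out_c), rng.standard_normal(out_c)
mean, var = rng.standard_normal(out_c), rng.random(out_c) + 0.1

x = rng.standard_normal(in_c)
conv = w[:, :, 0, 0] @ x + b
bn_out = gamma * (conv - mean) / np.sqrt(var + 1e-5) + beta

w_f, b_f = fuse_conv_bn(w, b, gamma, beta, mean, var)
fused_out = w_f[:, :, 0, 0] @ x + b_f
print(np.allclose(bn_out, fused_out))
```

The paper's switch to population statistics late in training is what makes this fold exact at inference time, since the fold assumes fixed per-channel mean and variance.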

  • Research Article
  • 10.3390/s23010419
The Best of Both Worlds: A Framework for Combining Degradation Prediction with High Performance Super-Resolution Networks.
  • Dec 30, 2022
  • Sensors (Basel, Switzerland)
  • Matthew Aquilina + 5 more

To date, the best-performing blind super-resolution (SR) techniques follow one of two paradigms: (A) train standard SR networks on synthetic low-resolution-high-resolution (LR-HR) pairs or (B) predict the degradations of an LR image and then use these to inform a customised SR network. Despite significant progress, subscribers to the former miss out on useful degradation information and followers of the latter rely on weaker SR networks, which are significantly outperformed by the latest architectural advancements. In this work, we present a framework for combining any blind SR prediction mechanism with any deep SR network. We show that a single lightweight metadata insertion block together with a degradation prediction mechanism can allow non-blind SR architectures to rival or outperform state-of-the-art dedicated blind SR networks. We implement various contrastive and iterative degradation prediction schemes and show they are readily compatible with high-performance SR networks such as RCAN and HAN within our framework. Furthermore, we demonstrate our framework's robustness by successfully performing blind SR on images degraded with blurring, noise and compression. This represents the first explicit combined blind prediction and SR of images degraded with such a complex pipeline, acting as a baseline for further advancements.

  • Research Article
  • 10.3233/jcm-226653
Efficient image compression method using image super-resolution residual learning network
  • May 30, 2023
  • Journal of Computational Methods in Sciences and Engineering
  • Jianhua Hu + 7 more

With the rapid growth of Internet video and image traffic, image data contain a large amount of redundancy. Image compression transfers an image, or the information it contains, using a smaller data stream: its purpose is to reduce redundancy so that images can be stored at a low bit rate in less space. General image compression methods adopt a hybrid coding framework in which each stage uses a fixed, specifically designed algorithm without global optimization; compression is divided mainly into prediction, transformation, quantization, and entropy coding steps. Meanwhile, there has been much research on deep learning-based super-resolution networks, whose main function is to reconstruct high-resolution images from low-resolution ones, replacing magnification methods such as linear interpolation; these networks bring large performance gains in resolution, noise reduction, deblurring, and so on, but they have not yet been applied effectively to improving the quality of compressed, reconstructed images. This paper proposes a new method that uses an image super-resolution residual learning network to improve the quality of compressed images. In our method, the reduced image is encoded into a content stream and the corresponding transmission parameters are encoded into a model stream. First, the original image is scaled down to 1/2 the size of the source image, and the small image is encoded into the content stream with an existing codec. Second, a residual learning super-resolution (SR) network filters and scales up the decoded image, boosting the quality of the extracted edge features. Our results show a significant performance improvement over H.265 on low-resolution reconstructed images (bits per pixel less than 0.1).

  • Conference Article
  • Citations: 156
  • 10.1109/cvpr46437.2021.01184
ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
  • Jun 1, 2021
  • Xiangtao Kong + 3 more

We aim at accelerating super-resolution (SR) networks on large images (2K-8K). Large images are usually decomposed into small sub-images in practical usage. Based on this processing, we found that different image regions have different restoration difficulties and can be processed by networks with different capacities: intuitively, smooth areas are easier to super-resolve than complex textures. To exploit this property, we can adopt appropriate SR networks to process different sub-images after the decomposition. On this basis, we propose a new solution pipeline, ClassSR, that combines classification and SR in a unified framework. In particular, it first uses a Class-Module to classify the sub-images into different classes according to restoration difficulty, then applies an SR-Module to perform SR for each class. The Class-Module is a conventional classification network, while the SR-Module is a network container that consists of the to-be-accelerated SR network and its simplified versions. We further introduce a new classification method with two losses, Class-Loss and Average-Loss, to produce the classification results. After joint training, a majority of sub-images pass through smaller networks, so the computational cost can be significantly reduced. Experiments show that ClassSR can help most existing methods (e.g., FSRCNN, CARN, SRResNet, RCAN) save up to 50% of FLOPs on the DIV8K dataset. This general framework can also be applied to other low-level vision tasks.
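
The routing idea, score each sub-image's restoration difficulty and send easy ones to cheaper branches, can be sketched with a hand-crafted difficulty proxy. Note the real ClassSR learns this classifier jointly with the SR branches; the gradient-magnitude proxy and the thresholds below are illustrative assumptions:

```python
import numpy as np

def difficulty(patch):
    """Toy difficulty proxy: mean gradient magnitude. Smooth patches score
    near zero, textured patches high (stand-in for the learned Class-Module)."""
    gy, gx = np.gradient(patch.astype(np.float64))
    return float(np.hypot(gy, gx).mean())

def route(patches, thresholds=(2.0, 8.0)):
    """Assign each sub-image index to the smallest adequate SR branch."""
    branches = {"simple": [], "medium": [], "hard": []}
    for i, p in enumerate(patches):
        d = difficulty(p)
        if d < thresholds[0]:
            branches["simple"].append(i)
        elif d < thresholds[1]:
            branches["medium"].append(i)
        else:
            branches["hard"].append(i)
    return branches

flat = np.full((32, 32), 128.0)                              # smooth region
noisy = np.random.default_rng(0).integers(0, 256, (32, 32))  # texture-like
print(route([flat, noisy]))
```

Compute savings come from the routing statistics: if most sub-images land in "simple", most FLOPs are spent in the smallest branch.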

  • Book Chapter
  • Citations: 6
  • 10.1007/978-3-030-66415-2_11
Robust Super-Resolution of Real Faces Using Smooth Features
  • Jan 1, 2020
  • Saurabh Goswami + 2 more

Real low-resolution (LR) face images contain degradations which are too varied and complex to be captured by known downsampling kernels and signal-independent noise. So, in order to successfully super-resolve real faces, a method needs to be robust to a wide range of noise, blur, compression artifacts, etc. Some recent works attempt to model these degradations from a dataset of real images using a Generative Adversarial Network (GAN): they generate synthetically degraded LR images and use them with the corresponding real high-resolution (HR) images to train a super-resolution (SR) network using a combination of a pixel-wise loss and an adversarial loss. In this paper, we propose a two-module super-resolution network in which a feature extractor module extracts robust features from the LR image, and an SR module generates an HR estimate using only these robust features. We train a degradation GAN to convert bicubically downsampled clean images to real degraded images, and interpolate between the obtained degraded LR image and its clean LR counterpart. This interpolated LR image is then used along with its corresponding HR counterpart to train the super-resolution network end to end. Entropy Regularized Wasserstein Divergence is used to force the encoded features learnt from the clean and degraded images to closely resemble those extracted from the interpolated image, ensuring robustness.

  • Research Article
  • Citations: 14
  • 10.1016/j.cmpb.2020.105615
CT kernel conversions using convolutional neural net for super-resolution with simplified squeeze-and-excitation blocks and progressive learning among smooth and sharp kernels
  • Jun 20, 2020
  • Computer Methods and Programs in Biomedicine
  • Da-In Eun + 5 more


  • Conference Article
  • 10.1109/cac53003.2021.9727717
Detection and Mapping of an Uncooperative Spinning Target under Low-light Illumination Condition
  • Oct 22, 2021
  • Jinzhen Mu + 4 more

This paper investigates the simultaneous detection and mapping problem for inspecting an unknown and uncooperative target that is spinning in space. First, we apply a new unsupervised generative adversarial network (GAN) to enhance the low-contrast, poor-visibility images captured under low-light space illumination conditions. Second, because the captured low-resolution (LR) images of the target contain small key components, we propose a new small-object detection network that combines a GAN-based super-resolution (SR) network and an FRCNN-based detection network to locate these objects; the SR network reconstructs super-resolved images from the original LR images. Third, we utilize a SLAM-based algorithm to map and estimate the pose of the spinning target based on the previous image enhancement. In summary, the integrated architecture has three components: a low-light enhancement GAN, a small-object detection network, and a real-time SLAM system. Experimental results show that the integrated architecture achieves better visual quality and improves awareness of an uncooperative spinning target.

  • Book Chapter
  • Citations: 21
  • 10.1007/978-3-030-66415-2_31
W2S: Microscopy Data with Joint Denoising and Super-Resolution for Widefield to SIM Mapping
  • Jan 1, 2020
  • Ruofan Zhou + 5 more

In fluorescence microscopy live-cell imaging, there is a critical trade-off between the signal-to-noise ratio and spatial resolution on one side, and the integrity of the biological sample on the other side. To obtain clean high-resolution (HR) images, one can either use microscopy techniques, such as structured-illumination microscopy (SIM), or apply denoising and super-resolution (SR) algorithms. However, the former option requires multiple shots that can damage the samples, and although efficient deep learning based algorithms exist for the latter option, no benchmark exists to evaluate these algorithms on the joint denoising and SR (JDSR) task. To study JDSR on microscopy data, we propose such a novel JDSR dataset, Widefield2SIM (W2S), acquired using conventional widefield fluorescence and SIM imaging. W2S includes 144,000 real fluorescence microscopy images, resulting in a total of 360 sets of images. A set comprises noisy low-resolution (LR) widefield images with different noise levels, a noise-free LR image, and a corresponding high-quality HR SIM image. W2S allows us to benchmark the combinations of 6 denoising methods and 6 SR methods. We show that state-of-the-art SR networks perform very poorly on noisy inputs. Our evaluation also reveals that applying the best denoiser in terms of reconstruction error followed by the best SR method does not necessarily yield the best final result. Both quantitative and qualitative results show that SR networks are sensitive to noise and that the sequential application of denoising and SR algorithms is sub-optimal. Lastly, we demonstrate that SR networks retrained end-to-end for JDSR outperform any combination of state-of-the-art deep denoising and SR networks.

  • Conference Article
  • Citations: 8
  • 10.1109/iscas45731.2020.9180822
EFFRBNet: A Deep Super Resolution Network using Edge-Assisted Feature Fusion Residual Blocks
  • Oct 1, 2020
  • Alireza Esmaeilzehi + 2 more

Deep convolutional networks provide very high quality super resolution images through a learning process by a nonlinear end-to-end mapping between low and high resolution images. Many of the state-of-the-art super resolution networks employ residual blocks in their network architectures, where in each residual block the high frequency residual signals are added to the feature maps input to the block. In this paper, a new residual block is proposed for the problem of image super resolution. The proposed residual block consists of three modules, namely, feature transformation module, nonlinear edge extraction module and feature fusion module. The feature transformation module produces high frequency residual signals and the nonlinear edge extraction module extracts the edges of the features input to the block. These generated high frequency features are then fused using the feature fusion module in order to produce a very rich set of high frequency residual features. The performance of the super resolution network using the proposed residual block is compared with that of the state-of-the-art light-weight super resolution schemes on four benchmark datasets. It is shown that the proposed super resolution scheme outperforms the state-of-the-art light-weight super resolution networks, when both the performance and number of parameters of the network are simultaneously taken into consideration.
