Structure-aware Meta-fusion for Image Super-resolution
There are two main categories of image super-resolution algorithms: distortion-oriented and perception-oriented. Recent evidence shows that reconstruction accuracy and perceptual quality are typically in disagreement with each other. In this article, we present a new image super-resolution framework that is capable of striking a balance between distortion and perception. The core of our framework is a deep fusion network that generates a final high-resolution image by fusing a pair of deterministic and stochastic images using spatially varying weights. To make a single fusion model produce images with varying degrees of stochasticity, we further incorporate meta-learning into our fusion network. Once equipped with the kernel produced by a kernel prediction module, our meta-fusion network is able to produce final images at any desired level of stochasticity. Experimental results indicate that our meta-fusion network outperforms existing state-of-the-art single-image super-resolution (SISR) algorithms on widely used datasets, including PIRM-val, DIV2K-val, Set5, Set14, Urban100, Manga109, and B100. In addition, it is capable of producing high-resolution images that achieve low distortion and high perceptual quality simultaneously.
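The spatially varying fusion at the core of this framework can be sketched as a per-pixel convex combination of the two predictions (a minimal illustration, not the authors' implementation; in the paper the weights are produced by the fusion network rather than given, and `fuse` is a hypothetical name):

```python
import numpy as np

def fuse(deterministic, stochastic, weights):
    """Blend a deterministic (low-distortion) and a stochastic (high
    perceptual quality) SR prediction with per-pixel weights in [0, 1]."""
    assert deterministic.shape == stochastic.shape == weights.shape
    return weights * deterministic + (1.0 - weights) * stochastic

# Toy 2x2 single-channel example: weight 1 keeps the deterministic pixel,
# weight 0 keeps the stochastic pixel, 0.5 averages them.
det = np.ones((2, 2))
sto = np.zeros((2, 2))
w = np.array([[1.0, 0.5], [0.5, 0.0]])
fused = fuse(det, sto, w)  # [[1.0, 0.5], [0.5, 0.0]]
```

Sliding the whole weight map between all-ones and all-zeros is what "any desired level of stochasticity" corresponds to in this sketch.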
- Book Chapter
- 10.1007/978-3-030-87592-3_3
- Jan 1, 2021
This paper proposes a super-resolution (SR) method for performing SR on a poorly aligned dataset. Super-resolution methods commonly need aligned low-resolution (LR) and high-resolution (HR) images for training. To obtain paired LR and HR images in medical imaging, low- and high-resolution data must be aligned using image registration technology. However, because aligning LR and HR images is difficult, the resulting LR-HR dataset is typically of low quality. Conventional SR methods often fail to train on poorly aligned datasets, since they require high-quality LR-HR pairs. To tackle this problem, we propose a two-step framework for SR using poorly aligned datasets. In the first step, we decompose the image representation into two parts: a content code that captures the image content, and a style code that captures the image style and the anatomical differences between LR and HR images. To perform SR of a given LR image, we input the content code and a latent variable simultaneously into the SR network to obtain an SR result. In the second step, using the trained SR network and an LR image, we search for the content code and style code that generate the most appropriate SR image; this is done by latent space exploration. We conducted experiments using a poorly aligned clinical-micro CT lung specimen dataset. Experimental results show that the proposed method outperformed conventional SR methods, increasing SSIM from 0.309 to 0.312, and produced much more convincing perceptual quality than conventional SR methods.
- Research Article
5
- 10.1109/tai.2024.3397292
- Sep 1, 2024
- IEEE Transactions on Artificial Intelligence
High-resolution (HR) magnetic resonance imaging is essential in aiding doctors in their diagnoses and image-guided treatments. However, acquiring HR images can be time-consuming and costly. Consequently, deep learning-based super-resolution reconstruction (SRR) has emerged as a promising solution for generating super-resolution (SR) images from low-resolution (LR) images. Unfortunately, training such neural networks requires aligned authentic HR and LR image pairs, which are challenging to obtain due to patient movements during and between image acquisitions. While rigid movements of hard tissues can be corrected with image registration, aligning deformed soft tissues is complex, making it impractical to train neural networks with authentic HR and LR image pairs. Previous studies have focused on SRR using authentic HR images and downsampled synthetic LR images. However, the difference in degradation representations between synthetic and authentic LR images suppresses the quality of SR images reconstructed from authentic LR images. To address this issue, we propose a novel Unsupervised Degradation Adaptation Network (UDEAN). Our network consists of a degradation learning network and an SRR network. The degradation learning network downsamples HR images using the degradation representation learned from misaligned or unpaired LR images. The SRR network then learns to map the downsampled HR images to the original ones. Experimental results show that our method outperforms state-of-the-art networks with an improvement of up to 0.051/3.52 dB in SSIM/PSNR on two public datasets, and is thus a promising solution to the challenges in clinical settings.
Impact Statement: Acquiring precisely aligned authentic high-resolution (HR) and low-resolution (LR) image pairs is considerably challenging, making supervised super-resolution (SR) reconstruction unfeasible in clinical settings. Therefore, unsupervised learning has emerged as a promising solution to this issue. This paper introduces an unsupervised network designed to be trained using unpaired or misaligned HR and LR images and enable the reconstruction of high-quality SR images. Additionally,
- Research Article
13
- 10.1016/j.aej.2024.02.007
- Feb 27, 2024
- Alexandria Engineering Journal
CN-BSRIQA: Cascaded network - blind super-resolution image quality assessment
- Conference Article
- 10.1109/iceeccot46775.2019.9114573
- Dec 1, 2019
In this paper, a single-image super-resolution framework is presented by means of sparse representation. Patch-based super-resolution is a technique where spatial features from low-resolution (LR) patches are used as references for the reconstruction of high-resolution (HR) image patches. A sparse representation is extracted for each patch, and these coefficients are used to recover the super-resolution patch. Two dictionaries are jointly trained for the LR and HR image patches; this establishes a correspondence between the sparse representations of LR and HR patch pairs. Hence, the sparse representation of an LR image patch can be applied to the HR image patch dictionary to obtain an HR image patch. The dictionary thus learnt is a compact one. The experimental results demonstrate the effectiveness of the proposed algorithm.
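The coupled-dictionary pipeline described above can be sketched as follows (a dependency-free illustration with hypothetical names; real methods solve a sparsity-constrained problem with OMP or Lasso rather than plain least squares, and the dictionaries are learned jointly rather than drawn at random):

```python
import numpy as np

def reconstruct_hr_patch(y_lr, D_lr, D_hr):
    """Code the LR patch against the LR dictionary, then decode the same
    coefficients with the HR dictionary (least squares stands in for a
    sparse solver to keep the sketch self-contained)."""
    alpha, *_ = np.linalg.lstsq(D_lr, y_lr, rcond=None)
    return D_hr @ alpha

rng = np.random.default_rng(0)
n_atoms = 32
D_hr = rng.standard_normal((64, n_atoms))  # atoms for 8x8 HR patches
D_lr = rng.standard_normal((48, n_atoms))  # atoms for LR patch features

alpha_true = np.zeros(n_atoms)
alpha_true[[3, 17]] = [1.5, -0.7]          # a sparse code
y_lr = D_lr @ alpha_true                   # synthetic LR observation
x_hr = reconstruct_hr_patch(y_lr, D_lr, D_hr)
# With a full-column-rank D_lr and noiseless y_lr, the code is recovered
# exactly, so x_hr equals D_hr @ alpha_true.
```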
- Conference Article
9
- 10.1109/ijcnn.2017.7965926
- May 1, 2017
Our paper is motivated by the advancement of deep learning algorithms for various computer vision problems. We propose a novel end-to-end deep learning-based framework for image super-resolution. This framework simultaneously computes the convolutional features of low-resolution (LR) and high-resolution (HR) image patches and learns the non-linear function that maps the convolutional features of LR image patches to those of their corresponding HR image patches. The proposed deep learning-based image super-resolution architecture is termed a coupled deep convolutional auto-encoder (CDCA), and it provides state-of-the-art results. Super-resolution of a noisy/distorted LR image results in a noisy/distorted HR image, as the super-resolution process introduces spatial correlation into the noise, which can then no longer be removed successfully. Traditional noise-resilient image super-resolution methods apply a de-noising algorithm prior to super-resolution, but the de-noising process causes loss of some high-frequency information (edge and texture details), and super-resolution of the resultant image yields an HR image with missing edge and texture information. We also propose a novel end-to-end deep learning-based framework for noise-resilient image super-resolution that performs image de-noising and super-resolution simultaneously while preserving textural details. First, a stacked sparse de-noising auto-encoder (SSDA) was learned for LR image de-noising and the proposed CDCA was learned for image super-resolution. Then, the image de-noising and super-resolution networks were cascaded. This cascaded deep learning network was employed as one integral network, with the pre-trained weights serving as initial weights. The integral network was end-to-end trained (fine-tuned) on a database with noisy LR images as input and HR images as targets.
In fine-tuning, all layers of the combined end-to-end network were jointly optimized to perform image de-noising and super-resolution simultaneously. Experimental results show that the proposed noise-resilient super-resolution framework outperforms conventional and state-of-the-art approaches in terms of PSNR and SSIM metrics.
- Research Article
20
- 10.1007/s00371-020-01986-3
- Oct 14, 2020
- The Visual Computer
As an important display modality for underwater environments, sonar images have limited resolution, which often leads to low-resolution representations of underwater objects. An image super-resolution algorithm is therefore needed to transform the images from low resolution to high resolution. This can improve the visual effect and contribute to subsequent processing such as 3D reconstruction and object recognition. This paper proposes a method for sonar image super-resolution based on generative adversarial networks (GAN). After comparing the super-resolution effects of various interpolation and convolutional neural network algorithms on sonar images, a Residual-in-Residual Dense Block network is employed as the generator of the GAN, since it offers low distortion and high perceptual quality. Because the sonar image training set does not contain enough data, the generator uses transfer learning on the sonar images to produce an optimized network model that is better suited to super-resolution of sonar images. The VGG19 network is employed as the discriminator. In addition, a perceptual loss is introduced into the loss function of S²RGAN to further improve the perceptual quality of super-resolution images. The experimental results indicate that the proposed S²RGAN shows excellent performance. The super-resolution images generated by S²RGAN have the remarkable advantage of both lower distortion and higher perceptual quality compared with other methods. Because S²RGAN focuses more on the realism and overall visual effect of super-resolution sonar images, it is suitable for various underwater situations.
- Research Article
13
- 10.3390/rs15133309
- Jun 28, 2023
- Remote Sensing
Image super-resolution (SR) is a significant technique in image processing as it enhances the spatial resolution of images, enabling various downstream applications. Based on recent achievements in SR studies in computer vision, deep-learning-based SR methods have been widely investigated for remote sensing images. In this study, we proposed a two-stage approach called bicubic-downsampled low-resolution (LR) image-guided generative adversarial network (BLG-GAN) for remote sensing image super-resolution. The proposed BLG-GAN method divides the image super-resolution procedure into two stages: LR image transfer and super-resolution. In the LR image transfer stage, real-world LR images are restored to less blurry and noisy bicubic-like LR images using guidance from synthetic LR images obtained through bicubic downsampling. Subsequently, the generated bicubic-like LR images are used as inputs to the SR network, which learns the mapping between the bicubic-like LR image and the corresponding high-resolution (HR) image. By approaching the SR problem as finding optimal solutions for subproblems, the BLG-GAN achieves superior results compared to state-of-the-art models, even with a smaller overall capacity of the SR network. As the BLG-GAN utilizes a synthetic LR image as a bridge between real-world LR and HR images, the proposed method shows improved image quality compared to the SR models trained to learn the direct mapping from a real-world LR image to an HR image. Experimental results on HR satellite image datasets demonstrate the effectiveness of the proposed method in improving perceptual quality and preserving image fidelity.
- Research Article
2
- 10.7498/aps.64.114208
- Jan 1, 2015
- Acta Physica Sinica
Multi-frame super-resolution reconstruction is a technology for obtaining a high-resolution image from a set of blurred and aliased low-resolution images. The most popular and widely used super-resolution methods are motion based. However, the estimation of motion information (registration) is difficult, computationally expensive, and inaccurate, especially for aerial images; sub-pixel registration error restricts the performance of the subsequent super-resolution. Instead of trying to parameterize the motion estimation model, this paper proposes an image super-resolution framework based on a polyphase-components reconstruction algorithm and an improved steering kernel regression algorithm. Given an image observation model, a reversible 2D polyphase decomposition, which breaks a high-resolution image down into polyphase components, is obtained. Through the assumption of diversity sampling, this paper adopts a fundamentally different approach, in which the low-resolution frames are used as the basis and the reference frame as the reference sub-polyphase component of the high-resolution image for recovering the polyphase components of the high-resolution image. The polyphase components, which fuse the low-resolution frames with complementary details, can be obtained by computing their expansion coefficients in terms of this basis using the available sub-polyphase components and then inversely transforming them into a high-resolution image. This is accomplished by formulating the problem as maximum likelihood estimation, which guarantees a close-to-perfect solution. Furthermore, this paper proposes an improved steering kernel regression algorithm to help restore the fused image under mild blur and random noise, adaptively refining the steering kernel regression function according to the local region context and structures.
Thus, this new algorithm not only effectively combines denoising and deblurring, but also preserves edge information. Our framework develops an efficient and stable algorithm to tackle the huge size and ill-posedness of the super-resolution problem, and improves computational efficiency by avoiding registration and iterative computation. Several experimental results on synthetic data illustrate that our method outperforms state-of-the-art methods in quantitative and qualitative comparisons. The proposed super-resolution algorithm can indeed reconstruct high-frequency information that is otherwise unavailable in a single LR image. It can effectively suppress blur and noise, and produce visually pleasing resolution enhancement in aerial images.
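The reversible 2D polyphase decomposition that this framework builds on can be illustrated directly (function names are mine; the paper additionally estimates missing components by maximum likelihood, which this sketch omits):

```python
import numpy as np

def polyphase_decompose(x, r):
    """Split an image into r*r polyphase components, one per sub-sampling
    phase (i, j) of the factor-r grid."""
    return [x[i::r, j::r] for i in range(r) for j in range(r)]

def polyphase_reconstruct(components, r):
    """Inverse: interleave the components back onto the full-resolution grid."""
    h, w = components[0].shape
    x = np.empty((h * r, w * r), dtype=components[0].dtype)
    for k, c in enumerate(components):
        i, j = divmod(k, r)
        x[i::r, j::r] = c
    return x

img = np.arange(16, dtype=float).reshape(4, 4)
comps = polyphase_decompose(img, 2)          # four 2x2 components
restored = polyphase_reconstruct(comps, 2)   # exact round trip
```

Each component is itself a low-resolution view of the scene, which is why the LR frames can serve as a basis for recovering the full set.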
- Research Article
1
- 10.1142/s0219467822500462
- Nov 3, 2021
- International Journal of Image and Graphics
Diverse image super-resolution (SR) techniques have been implemented to reconstruct high-resolution (HR) images from input images with lower spatial resolution. However, evaluating the perceptual quality of SR images remains an important and complex research problem. This paper proposes a new image SR model with the intention of attaining maximum Peak Signal-to-Noise Ratio (PSNR). Low-resolution (LR) images are derived from the HR images by bicubic interpolation-based downsampling and upsampling. Then, the four sub-bands of the LR and HR images are generated by a novel Adaptive Wavelet Lifting approach, in which the filter modes are optimized using the proposed self-adaptive colliding bodies optimization (SA-CBO). This technique forms LR wavelet sub-bands (LRSB) for the LR images and HR wavelet sub-bands (HRSB) for the HR images. With the help of the LRSB and HRSB images, residual images are formed by adopting an optimized activation function and optimized hidden neurons in a deep convolutional neural network (CNN). Both the adaptive wavelet lifting approach and the deep CNN are improved by SA-CBO. Finally, the inverse adaptive wavelet lifting approach is used to produce the final SR image. Experimental results on publicly available SR image quality databases confirm the effectiveness and generalization ability of the proposed method compared with traditional image quality assessment algorithms.
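A single lifting step of the kind that an adaptive wavelet lifting scheme generalizes can be sketched with fixed Haar predict/update filters (the paper optimizes these filters with SA-CBO; this fixed version only shows the split-predict-update structure and its exact invertibility, and the function names are mine):

```python
import numpy as np

def haar_lifting_forward(x):
    """One 1D Haar lifting step: split into even/odd samples, predict the
    odd samples from the even ones, then update to preserve the mean.
    Returns (approximation, detail)."""
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    detail = odd - even            # predict step
    approx = even + detail / 2.0   # update step
    return approx, detail

def haar_lifting_inverse(approx, detail):
    """Undo the update and predict steps, then re-interleave."""
    even = approx - detail / 2.0
    odd = detail + even
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

sig = np.array([4.0, 6.0, 10.0, 12.0])
a, d = haar_lifting_forward(sig)   # a = [5., 11.], d = [2., 2.]
```

Applying such a step along rows and then along columns of an image yields the four sub-bands the abstract refers to.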
- Conference Article
6
- 10.1109/icassp.2015.7178159
- Apr 1, 2015
Image super-resolution (SR) reconstruction techniques based on sparse representation have attracted ever-increasing attention in recent years, where the choice of the over-complete dictionary is of prime importance for reconstruction quality. However, most sparse-representation-based image SR methods fail to consider the discrimination and redundancy of the dictionaries, which leads to obvious SR reconstruction artifacts. In this paper, we propose a novel image SR framework using coupled Fisher discrimination dictionary learning (CFDDL). With CFDDL, a pair of discriminative dictionaries is first learned for the same class of high-resolution (HR) image patches and corresponding low-resolution (LR) image patches, respectively. Then, we use an identical sparse representation for the same class of HR and LR image patches, which not only discovers the inherent relationship between the HR and LR image patches but also enhances computational efficiency. Extensive experiments compared with several other SR methods demonstrate the superiority of the proposed method in terms of both subjective and objective evaluation.
- Abstract
- 10.1016/j.ijrobp.2023.06.2277
- Sep 29, 2023
- International Journal of Radiation Oncology*Biology*Physics
Accelerating Volumetric CT and MRI Imaging by Reference-Free Deep Learning Transformation from Low-Resolution to High-Resolution
- Conference Article
1
- 10.2991/isca-13.2013.55
- Jan 1, 2013
To solve the problem of super-resolution reconstruction of two-dimensional barcode images, this paper applies the technique of super-resolution reconstruction based on sparse representation to this area. Given the characteristics of the two-dimensional barcode image, this paper presents a new approach that selects the orientation gradient and the gradient texture as reconstruction features for recovery. Analysis of the edge characteristics of this kind of image shows that its directional derivatives are distinct, so the Kirsch operator is adopted to obtain the edge gradient of the image as one reconstruction feature. Besides, the edge-direction texture is regarded as another reconstruction feature because of the distinct texture directivity of the target image. Therefore, both the geometric and the textural information of the image is taken into consideration for reconstruction. The experimental results show that the proposed algorithm can effectively reconstruct an input low-resolution barcode image into the corresponding high-resolution one. Moreover, compared with other similar super-resolution algorithms, the proposed algorithm improves the quality of the recovered image to a certain extent.
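The Kirsch-operator edge gradient used as the first reconstruction feature can be sketched as the per-pixel maximum response over eight directional 3x3 masks (a plain-NumPy illustration; the function names are mine, and a practical version would vectorize the convolution):

```python
import numpy as np

def kirsch_kernels():
    """The 8 directional Kirsch masks, generated by rotating the three 5s
    around the 3x3 border in 45-degree steps (center weight stays 0)."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    base = [5, 5, 5, -3, -3, -3, -3, -3]
    kernels = []
    for s in range(8):
        k = np.zeros((3, 3))
        for idx, (i, j) in enumerate(ring):
            k[i, j] = base[(idx - s) % 8]
        kernels.append(k)
    return kernels

def kirsch_edge_gradient(img):
    """Maximum response over the 8 masks at each pixel (valid region only).
    Each mask sums to zero, so flat regions give a response of exactly 0."""
    img = img.astype(float)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for k in kirsch_kernels():
        for i in range(h - 2):
            for j in range(w - 2):
                out[i, j] = max(out[i, j], np.sum(img[i:i + 3, j:j + 3] * k))
    return out

edge = np.tile([0.0, 0.0, 10.0, 10.0], (4, 1))  # vertical step edge
g = kirsch_edge_gradient(edge)                  # strongest response at the step
```

The direction of the winning mask at each pixel is what supplies the edge-direction texture feature mentioned above.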
- Book Chapter
4
- 10.1007/978-3-030-87589-3_27
- Jan 1, 2021
Medical images routinely acquired in clinical facilities are mostly low resolution (LR), in consideration of acquisition time and efficiency. This makes the clinical diagnosis of hippocampal sclerosis challenging, as additional sequences for the hippocampus need to be acquired. In contrast, high-resolution (HR) images provide more detailed information for disease investigation. Recently, image super-resolution (SR) methods were proposed to reconstruct HR images from LR inputs. However, current SR methods generally use simulated LR images and intensity constraints, which limits their applications in clinical practice. To solve this problem, we utilized real paired LR and HR images and trained a Structure-Constrained Super-Resolution (SCSR) network. First, we proposed a single-image super-resolution framework in which mixed loss functions were introduced to enhance the reconstruction of brain tissue boundaries in addition to intensity constraints. Second, since the hippocampus is a relatively small structure, we further proposed a weight map to enhance the reconstruction of subcortical regions. Experimental results using 642 real paired cases showed that the proposed method outperformed state-of-the-art methods in terms of image quality, with a PSNR of 27.0405 and an SSIM of 0.9958. Also, experiments using radiomics features extracted from the hippocampus on SR images obtained through the proposed method achieved the best accuracy of 95% for differentiating subjects with left and right hippocampal sclerosis from normal controls. The proposed method shows its potential for disease screening using clinical routine images.
- Research Article
22
- 10.1109/jstars.2022.3143464
- Jan 1, 2022
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Due to technology and cost limitations, it is challenging to obtain high temporal and spatial resolution images from a single satellite spectrometer, which significantly limits the application of such remote sensing images in earth science. To address the problem that existing algorithms cannot effectively balance spatial detail preservation and spectral change reconstruction, a pseudo-Siamese deep convolutional neural network (PDCNN) for spatiotemporal fusion is proposed in this article. The framework has two independent and equal feature extraction streams whose weights are not shared. The two streams process the image information at the earlier and later moments, respectively, and reconstruct the fine image at the corresponding time, fully extracting the image information at different times. In the feature extraction stream, a multiscale mechanism and dilated convolutions with flexible receptive fields are designed, which can flexibly capture feature information and improve the reconstruction accuracy of the model. In addition, an attention mechanism is introduced to increase the weight of information crucial to the remote sensing images. A residual connection enhances the reuse of initial feature information in shallow layers and reduces the loss of feature information in deep layers. Finally, the fine images obtained from the two feature extraction streams are weighted and fused to obtain the final predicted image. Subjective and objective results demonstrate that the PDCNN can effectively reconstruct fusion images of higher quality.
- Research Article
24
- 10.1109/tmm.2022.3216115
- Jan 1, 2024
- IEEE Transactions on Multimedia
Convolutional Neural Network (CNN)-based image super-resolution (SR) has exhibited impressive success on low-resolution (LR) images with known degradation. However, this type of approach struggles to maintain its performance in practical scenarios where the degradation process is unknown. Although existing blind SR methods attempt to solve this problem using blur kernel estimation, their perceptual quality and reconstruction accuracy are still unsatisfactory. In this paper, we analyze the degradation of a high-resolution (HR) image in terms of its intrinsic image components, according to a degradation-based formulation model. We propose a components decomposition and co-optimization network (CDCN) for blind SR. Firstly, CDCN decomposes the input LR image into structure and detail components in feature space. Then, a mutual collaboration block (MCB) is presented to exploit the relationship between the two components: the detail component provides informative features to enrich the structural context, and the structure component carries structural context for better detail recovery, in a mutually complementary manner. After that, we present a degradation-driven learning strategy to jointly supervise the restoration of HR image detail and structure. Finally, a multi-scale fusion module followed by an upsampling layer is designed to fuse the structure and detail features and perform SR reconstruction. Empowered by such degradation-based component decomposition, collaboration, and mutual optimization, we can bridge the correlation between component learning and degradation modelling for blind SR, thereby producing SR results with more accurate textures. Extensive experiments on both synthetic SR datasets and real-world images show that the proposed method achieves state-of-the-art performance compared with existing methods.