Re-Boosting Self-Collaboration Parallel Prompt GAN for Unsupervised Image Restoration.
Deep learning methods have demonstrated state-of-the-art performance in image restoration, especially when trained on large-scale paired datasets. However, acquiring paired data in real-world scenarios poses a significant challenge. Unsupervised restoration approaches based on generative adversarial networks (GANs) offer a promising solution without requiring paired datasets. Yet, it is difficult for such approaches to surpass the performance of conventional unsupervised GAN-based frameworks without significantly modifying model structures or increasing the computational complexity. To address these issues, we propose a self-collaboration (SC) strategy for existing restoration models. This strategy utilizes information from the previous stage as feedback to guide subsequent stages, achieving significant performance improvement without increasing the framework's inference complexity. The SC strategy comprises a prompt learning (PL) module and a restorer ($Res$). It iteratively replaces the previous, less powerful fixed restorer $\overline{Res}$ in the PL module with a more powerful $Res$. The enhanced PL module then generates better pseudo-degraded/clean image pairs, leading to a more powerful $Res$ for the next iteration. Our SC strategy can improve the performance of $Res$ by over 1.5 dB without adding extra parameters or computational complexity during inference. Meanwhile, the existing self-ensemble (SE) strategy and our SC strategy enhance the performance of pre-trained restorers from different perspectives. Because SE increases computational complexity during inference, we propose adding a re-boosting module to the SC (Reb-SC) to further improve the SC strategy by incorporating SE into SC without increasing inference time. This further enhances the restorer's performance by approximately 0.3 dB.
Additionally, we present a baseline framework that includes parallel generative adversarial branches with complementary "self-synthesis" and "unpaired-synthesis" constraints, ensuring the effectiveness of the training framework. Extensive experimental results on restoration tasks demonstrate that the proposed model performs favorably against existing state-of-the-art unsupervised restoration methods.
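The iterative replacement described in the abstract can be sketched as toy Python; the trainer callback and dictionary "restorers" below are hypothetical stand-ins for the paper's PL module and restorer, not its actual implementation:

```python
import copy

def self_collaboration(train_restorer, num_iters):
    """Sketch of the self-collaboration (SC) loop: the PL module holds a
    frozen copy of the previous restorer, which is used to synthesize
    pseudo-degraded/clean pairs for training a stronger restorer."""
    frozen = None            # \overline{Res}: fixed restorer inside the PL module
    restorer = {"stage": 0}  # Res: the trainable restorer (toy stand-in)
    history = []
    for _ in range(num_iters):
        # Train Res on pairs generated with the help of the frozen restorer.
        restorer = train_restorer(restorer, frozen)
        # Replace the weaker fixed restorer with the newly trained one.
        frozen = copy.deepcopy(restorer)
        history.append(frozen["stage"])
    return restorer, history

# Toy trainer: each iteration yields a strictly "stronger" restorer stage.
res, hist = self_collaboration(lambda r, f: {"stage": r["stage"] + 1}, 3)
```

Note that the loop only swaps which copy is frozen; no parameters are added, which mirrors the abstract's claim of zero extra inference cost.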
- Conference Article
2
- 10.1109/iscas.1996.539896
- May 12, 1996
The influence of the multilayer perceptron's output dimension on its performance in digital image restoration is studied. We present experimental results for images degraded with Gaussian noise, mixed (Gaussian and impulsive) noise, and pure impulsive noise, together with preliminary conclusions concerning improvement of multilayer perceptron's performance in the image restoration task.
- Research Article
14
- 10.3390/cancers14010040
- Dec 23, 2021
- Cancers
Simple Summary: MRI-only simulation in radiation therapy (RT) planning has received attention because it allows the CT scan to be omitted. For MRI-only simulation, synthetic CT (sCT) is necessary for dose calculation. Various methodologies have been suggested for generating sCT and, recently, deep learning approaches have been actively investigated. GAN and cycle-consistent GAN (CycGAN) have mainly been tested; however, very few studies have compared the quality of the sCTs generated by these methods or suggested other models for sCT generation. We compared GAN, CycGAN, and reference-guided GAN (RgGAN), a new deep learning model. We found that HU conservation for soft tissue was poorest for GAN. All methods could generate sCTs feasible for VMAT planning, with the trend that the sCT generated by RgGAN showed better dosimetric conservation of D98% and D95% than the sCTs from the other methodologies. We aimed to evaluate and compare the quality of synthetic computed tomography (sCT) generated by various deep-learning methods in volumetric modulated arc therapy (VMAT) planning for prostate cancer. Simulation computed tomography (CT) and T2-weighted simulation magnetic resonance images from 113 patients were used for sCT generation by three deep-learning approaches: generative adversarial network (GAN), cycle-consistent GAN (CycGAN), and reference-guided CycGAN (RgGAN), a new model which further adjusts the sCTs generated by CycGAN using available paired images. VMAT plans on the original simulation CT images were recalculated on the sCTs, and the dosimetric differences were evaluated. For soft tissue, a significant difference in the mean Hounsfield units (HUs) between the original CT images and the sCTs was observed only for GAN (p = 0.03). The mean relative dose differences for planning target volumes or organs at risk were within 2% among the sCTs from the three deep-learning approaches.
The differences in the dosimetric parameters D98% and D95% from the original CT were lowest for the sCT from RgGAN. In conclusion, HU conservation for soft tissue was poorest for GAN, and there was a trend that the sCT generated by RgGAN showed better dosimetric conservation of D98% and D95% than the sCTs from the other methodologies.
- Research Article
5
- 10.1007/s00500-021-06049-w
- Aug 3, 2021
- Soft Computing
Image dehazing has always been a challenging topic in image processing. The development of deep learning methods, especially generative adversarial networks (GANs), provides a new way to approach image dehazing, and in recent years many GAN-based deep learning methods have been applied to it. However, GANs have two problems in image dehazing. First, haze not only reduces the quality of an image but also blurs its details, and it is difficult for the generator to restore the details of the whole image while removing the haze. Second, the GAN model is defined as a minimax problem, which weakens the loss function and makes it difficult to tell whether the GAN is making progress during training. Therefore, we propose a guided generative adversarial dehazing network (GGADN). Unlike other generative adversarial networks, GGADN adds a guided module to the generator. The guided module verifies the network at each layer of the generator and, at the same time, strengthens the details of the map generated by each layer. Network training is based on a pre-trained VGG feature model and an L1-regularized gradient prior, combined through new loss-function parameters. Dehazing results on both synthetic and real images show that the proposed method outperforms state-of-the-art dehazing methods.
- Research Article
- 10.11834/jig.211139
- Jan 1, 2023
- Journal of Image and Graphics
Deep learning has been developing intensively in the big-data era. However, its capability still depends heavily on network structure and parameter settings, so it is essential to improve model performance while keeping model complexity under control; the key lies in model optimization. In terms of learning methods, machine learning can be divided into five categories: 1) supervised learning, 2) unsupervised learning, 3) semi-supervised learning, 4) deep learning, and 5) reinforcement learning. To describe the optimization problem more concisely, we take supervised deep learning as a starting point and summarize and analyze the optimization methods that improve its fitting and generalization ability. First, the basic formulation of optimization is given and its core elements are illustrated. Then, from the perspective of fitting ability, the optimization problem is decomposed into three directions: 1) convergence, 2) convergence speed, and 3) global quality. We summarize and analyze the specific methods and research results in each of these three directions. Among them, convergence refers to running the algorithm and converging to a solution such as a stationary point. The gradient exploding/vanishing problem shows that small changes in a multi-layer network may be amplified, or may decay and disappear, from layer to layer.
The speed of convergence refers to the ability to help the model converge faster. Once the model can converge, optimization algorithms that accelerate convergence should be considered to improve performance. The global-quality problem is to ensure that the model converges to a better solution (the global minimum). The first two problems are local while the last is global, and the boundaries between the three are fuzzy; for example, some optimization methods that improve convergence also accelerate the convergence speed of the model. After fitting optimization, the large number of parameters in a deep learning model must also be considered, since overfitting can cause poor generalization. Regularization is an effective method for improving generalization. To this end, current regularization methods are categorized from two aspects: 1) data processing and 2) model-parameter constraints. Data processing refers to processing data during model training, such as dataset augmentation, noise injection, and adversarial training; these methods can effectively improve the generalization ability of the model. Model-parameter constraints restrict the parameters of the network, which can also improve generalization. We take the generative adversarial network (GAN), a commonly used deep learning architecture, as the application background and review the development of its variant models, analyzing the application of the relevant optimization methods in the GAN domain from the two aspects of fitting and generalization ability.
Taking WGAN with gradient penalty (WGAN-GP) as the basic model, we design an experiment on the MNIST-10 dataset to study the applicability of six algorithms (stochastic gradient descent (SGD), momentum SGD, Adagrad, Adadelta, root mean square propagation (RMSProp), and Adam) in the context of deep-learning-based GANs. The optimization effects are compared and analyzed against the experimental results of multiple optimization methods on GAN variants, and several optimization strategies that work well in the GAN domain are given. At present, various optimization methods are widely used in deep learning models: methods that improve fitting ability can improve model performance, and regularization methods help alleviate overfitting and improve model robustness. However, there is still a lack of mature, systematic theory to guide the use of optimization methods, and several optimization problems remain open. The Lipschitz condition on global gradients cannot be guaranteed in deep neural networks due to the gap between theory and practice. In the field of GANs, there is still no theoretical breakthrough for finding a stable global optimal solution, that is, the optimal Nash equilibrium. Moreover, some existing optimization methods are empirical, and their interpretability lacks rigorous theoretical proof. Optimization methods in deep learning are numerous and complex, and their use should focus on the combined effect of multiple optimizations. Our critical analysis can provide a reference for selecting optimization methods in the design of deep neural networks.
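As a companion to the optimizer comparison described above, the update rules of three of the six algorithms (SGD, momentum SGD, and Adam) can be sketched in plain NumPy on a one-dimensional quadratic; this is an illustrative sketch of the update formulas, not the paper's WGAN-GP experiment:

```python
import numpy as np

def sgd(w, g, state, lr=0.1):
    # Plain gradient step: w <- w - lr * g
    return w - lr * g, state

def momentum(w, g, state, lr=0.1, beta=0.9):
    # Heavy-ball style update: accumulate a velocity, then step along it.
    v = beta * state.get("v", 0.0) + g
    state["v"] = v
    return w - lr * v, state

def adam(w, g, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Bias-corrected first/second moment estimates scale the step.
    t = state.get("t", 0) + 1
    m = b1 * state.get("m", 0.0) + (1 - b1) * g
    v = b2 * state.get("v", 0.0) + (1 - b2) * g ** 2
    state.update(t=t, m=m, v=v)
    m_hat, v_hat = m / (1 - b1 ** t), v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), state

def minimize(update, steps=200):
    # Minimize f(w) = w^2 (gradient 2w) starting from w = 5.0.
    w, state = 5.0, {}
    for _ in range(steps):
        w, state = update(w, 2 * w, state)
    return w
```

All three updates drive `w` toward the minimum at 0, but with different trajectories, which is exactly the kind of behavioral difference the survey's experiment measures on a real GAN objective.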
- Research Article
16
- 10.2196/37365
- Jun 29, 2022
- JMIR medical informatics
Background: Research on the diagnosis of COVID-19 using lung images is limited by the scarcity of imaging data. Generative adversarial networks (GANs) are popular for synthesis and data augmentation, and they have been explored for data augmentation to enhance the performance of artificial intelligence (AI) methods for the diagnosis of COVID-19 in lung computed tomography (CT) and X-ray images. However, the role of GANs in overcoming data scarcity for COVID-19 is not well understood. Objective: This review presents a comprehensive study of the role of GANs in addressing the challenges related to COVID-19 data scarcity and diagnosis. It is the first review to summarize the different GAN methods and lung imaging data sets for COVID-19, and it attempts to answer questions about the applications of GANs, popular GAN architectures, frequently used image modalities, and the availability of source code. Methods: A search was conducted on 5 databases, namely PubMed, IEEE Xplore, the Association for Computing Machinery (ACM) Digital Library, Scopus, and Google Scholar. The search was conducted from October 11-13, 2021, using intervention keywords, such as “generative adversarial networks” and “GANs,” and application keywords, such as “COVID-19” and “coronavirus.” The review was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines for systematic and scoping reviews. Only studies that reported GAN-based methods for analyzing chest X-ray, chest CT, or chest ultrasound images were included; studies that used deep learning methods but not GANs were excluded. No restrictions were imposed on the country of publication, study design, or outcomes. Only studies in English published from 2020 to 2022 were included.
Results: This review included 57 full-text studies that reported the use of GANs for different applications involving COVID-19 lung imaging data. Most of the studies (n=42, 74%) used GANs for data augmentation to enhance the performance of AI techniques for COVID-19 diagnosis. Other popular applications of GANs were segmentation of lungs and superresolution of lung images. The cycleGAN and the conditional GAN were the most commonly used architectures, each used in 9 studies. In addition, 29 (51%) studies used chest X-ray images, while 21 (37%) used CT images for training GANs. For the majority of the studies (n=47, 82%), the experiments were conducted and results reported using publicly available data. A secondary evaluation of the results by radiologists/clinicians was reported by only 2 (4%) studies. Conclusions: Studies have shown that GANs have great potential to address the data-scarcity challenge for lung images in COVID-19. Data synthesized with GANs have helped improve the training of convolutional neural network (CNN) models for the diagnosis of COVID-19, and GANs have also contributed to enhancing CNN performance through superresolution and segmentation. This review also identified key limitations for the potential translation of GAN-based methods into clinical applications.
- Preprint Article
- 10.5194/egusphere-egu22-1650
- Mar 27, 2022
Tropical Cyclones (TCs) are deadly but rare events that cause considerable loss of life and property damage every year. Traditional TC forecasting and tracking methods focus on numerical forecasting models, synoptic forecasting, and statistical methods. However, in recent years several studies have investigated applications of Deep Learning (DL) methods for weather forecasting, with encouraging results. We aim to test the efficacy of several DL methods for TC nowcasting, particularly focusing on Generative Adversarial Networks (GANs) and Recurrent Neural Networks (RNNs). The strengths of these network types align well with the given problem: GANs are particularly apt at learning the form of a dataset, such as the typical shape and intensity of a TC, while RNNs are useful for learning time-series data, enabling a prediction to be made from the past several timesteps. The goal is to produce a DL-based pipeline to predict the future state of a developing cyclone with accuracy that measures up to current methods. We demonstrate our approach by learning from high-resolution numerical simulations of TCs from the Indian and Pacific oceans, and we discuss the challenges and advantages of applying these DL approaches to large high-resolution numerical weather data.
- Research Article
- 10.3390/jmse13020231
- Jan 25, 2025
- Journal of Marine Science and Engineering
The field of underwater image processing has gained significant attention recently, offering great potential for enhanced exploration of underwater environments, including applications such as underwater terrain scanning and autonomous underwater vehicles. However, underwater images frequently suffer from light attenuation, color distortion, and noise introduced by artificial light sources. These degradations not only reduce image quality but also hinder the effectiveness of downstream application tasks. To address these issues, this paper presents a novel deep network model for single underwater image restoration. Our model does not rely on paired training images and incorporates two cycle-consistent generative adversarial network (CycleGAN) structures, forming a dual-CycleGAN architecture. This enables the simultaneous conversion of an underwater image to its in-air (atmospheric) counterpart while learning a light-field image that guides the underwater image towards its in-air version. Experimental results indicate that the proposed method provides superior (or at least comparable) restoration performance, in terms of both quantitative measures and visual quality, compared with existing state-of-the-art techniques. Our model also significantly reduces computational complexity, yielding faster processing times and lower memory usage while maintaining strong restoration capability, which makes it well suited to real-world applications.
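The cycle-consistency constraint at the heart of a CycleGAN-style (and hence dual-CycleGAN) framework can be sketched in a few lines of NumPy; the toy "generators" below are hypothetical invertible maps standing in for trained networks:

```python
import numpy as np

def cycle_consistency_loss(g_ab, g_ba, batch_a, batch_b, lam=10.0):
    """L1 cycle loss used by CycleGAN-style frameworks: mapping a sample
    to the other domain and back should reproduce the original sample."""
    recon_a = g_ba(g_ab(batch_a))  # A -> B -> A
    recon_b = g_ab(g_ba(batch_b))  # B -> A -> B
    return lam * (np.abs(recon_a - batch_a).mean()
                  + np.abs(recon_b - batch_b).mean())

# Toy "generators" that exactly invert each other, so the loss vanishes.
a = np.random.rand(4, 8)
b = np.random.rand(4, 8)
loss = cycle_consistency_loss(lambda x: x + 1.0, lambda x: x - 1.0, a, b)
```

In an unpaired setting this loss is what allows training without matched underwater/in-air image pairs: each domain supervises itself through the round trip.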
- Conference Article
1
- 10.1109/icccnt51525.2021.9579725
- Jul 6, 2021
Imperfections or defects inevitably occur in images due to inexperienced photographers, inadequate preservation methods, or even deliberate tampering. Image restoration or completion has historically been performed manually, whether drawn by artists based on their creativity or by removing noise and blur with software like Photoshop. At scale, manual image completion is infeasible and has many limitations. Modern advances in computer vision and deep learning have made it possible to automate such tasks with high efficiency. Manual restoration usually relies on prior experience with the subject, and sometimes on creativity, to reconstruct the image from the artist's imagination, whereas deep learning produces excellent results given enough training data, improvises and generalizes better, and hence outperforms traditional manual methods. In this project, image completion is performed using two deep learning models: Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs). Adversarial networks have proven very effective in image-to-image translation and image reconstruction tasks, so they are explored widely here. Both Deep Convolutional GANs and Conditional GANs are used, and their respective performances on this task are compared.
- Research Article
4
- 10.3390/s23177338
- Aug 23, 2023
- Sensors (Basel, Switzerland)
As one of the representative models in the field of image generation, generative adversarial networks (GANs) face a significant challenge: how to make the best trade-off between the quality of generated images and training stability. The U-Net based GAN (U-Net GAN), a recently developed approach, can generate high-quality synthetic images by using a U-Net architecture for the discriminator. However, this model may suffer from severe mode collapse. In this study, a stable U-Net GAN (SUGAN) is proposed to mainly solve this problem. First, a gradient normalization module is introduced to the discriminator of U-Net GAN. This module effectively reduces gradient magnitudes, thereby greatly alleviating the problems of gradient instability and overfitting. As a result, the training stability of the GAN model is improved. Additionally, in order to solve the problem of blurred edges of the generated images, a modified residual network is used in the generator. This modification enhances its ability to capture image details, leading to higher-definition generated images. Extensive experiments conducted on several datasets show that the proposed SUGAN significantly improves over the Inception Score (IS) and Fréchet Inception Distance (FID) metrics compared with several state-of-the-art and classic GANs. The training process of our SUGAN is stable, and the quality and diversity of the generated samples are higher. This clearly demonstrates the effectiveness of our approach for image generation tasks. The source code and trained model of our SUGAN have been publicly released.
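The abstract above does not spell out the gradient normalization module, but a common formulation from the gradient-normalization GAN literature rescales the discriminator output by its input-gradient norm so that the normalized function has a bounded gradient. A minimal sketch of that arithmetic, under that assumption (not necessarily SUGAN's exact module):

```python
import numpy as np

def gradient_normalize(f_x, grad_f_x, eps=1e-12):
    """Normalize a discriminator output f(x) by the norm of its input
    gradient plus |f(x)|, yielding an approximately 1-Lipschitz function
    and taming large gradient magnitudes during training."""
    return f_x / (np.linalg.norm(grad_f_x) + np.abs(f_x) + eps)

# Example: output 2.0 with input gradient (3, 4) -> 2 / (5 + 2) = 2/7.
out = gradient_normalize(2.0, np.array([3.0, 4.0]))
```

In a real framework the gradient would come from automatic differentiation of the discriminator with respect to its input; here it is supplied directly for illustration.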
- Preprint Article
1
- 10.5194/egusphere-egu23-15285
- May 15, 2023
Recent technological advances allow geoscientists to generate high-resolution (HR) imagery using a variety of beam-forming mechanisms (e.g. visible light, X-rays, or charged particles such as electrons and ions). One of the main limitations in producing HR data is the acquisition time required at high magnifications. For example, back-scattered electron (BSE) mapping of a standard petrographic thin section at a resolution of 50 nm/pixel takes approximately 60 days and requires storage on the order of 700 GB. Deep-learning methods have proven effective for resolution enhancement in regular photographic images, and in this work we present an integrated image registration and upscaling workflow to enhance image resolution, using real-world BSE datasets. The proposed workflow requires the acquisition of one or more HR regions within a region that is imaged at low resolution (LR). Next, close-to-pixel-accurate image registration is performed in two successive steps: i) the precise location of the HR region within the LR region is determined using a Fast Fourier Transform algorithm (Lewis, 2005), and ii) final registration is achieved by iteratively calculating a deformation matrix that, using Newton's method of optimization, minimizes an error function describing the differences between the two images (Tudisco et al., 2017). Subsequently, matching HR and LR image pairs are fed into a Generative Adversarial Network (GAN) that learns to produce HR images from their LR counterparts. A GAN consists of two neural networks, a generator and a discriminator: the generator produces synthetic HR data based on LR input, and the discriminator attempts to classify the data as either real HR or synthetic HR.
The two networks are trained together in an adversarial process, the goal being for the generator to produce synthetic data that the discriminator cannot distinguish from real data. We demonstrate our method on a variety of large real-world datasets and show that it effectively increases the resolution of full-size BSE maps by up to a factor of four while resolving important features. Upscaling BSE data by a factor of four corresponds to a 90% reduction in beamtime and a factor-16 reduction in storage requirements. Image registration, preprocessing, and model training on a high-performance workstation take 12-24 hours; once a model is trained, inference can be done on a regular laptop. [1] Lewis, J. P. "Fast normalized cross-correlation, Industrial Light and Magic." unpublished (2005). [2] Tudisco, Erika, et al. "An extension of digital volume correlation for multimodality image registration." Measurement Science and Technology 28.9 (2017): 0954
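The FFT-based localization step (Lewis's fast cross-correlation) can be sketched in NumPy. For brevity this version omits the local-normalization term of true normalized cross-correlation, so it is an illustrative sketch of the idea rather than the cited algorithm in full:

```python
import numpy as np

def locate_patch(image, patch):
    """Find the top-left offset of `patch` inside `image` by computing
    cross-correlation in the frequency domain (FFT), which turns an
    O(N^2 M^2) sliding-window search into FFT-sized work."""
    h, w = image.shape
    ph, pw = patch.shape
    # Zero-mean the template so uniformly bright regions do not dominate.
    kernel = patch - patch.mean()
    # Cross-correlation theorem: corr = IFFT(FFT(image) * conj(FFT(kernel))).
    score = np.fft.irfft2(np.fft.rfft2(image) *
                          np.conj(np.fft.rfft2(kernel, s=image.shape)),
                          s=image.shape)
    # Keep only valid top-left positions (no circular wraparound) and
    # return the best-matching offset.
    score = score[: h - ph + 1, : w - pw + 1]
    return np.unravel_index(np.argmax(score), score.shape)

rng = np.random.default_rng(0)
big = rng.random((64, 64))
row, col = locate_patch(big, big[10:26, 20:36])  # true offset is (10, 20)
```

In the workflow above, this coarse FFT localization would be followed by the iterative deformation-matrix refinement for sub-pixel registration.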
- Front Matter
7
- 10.1136/bjophthalmol-2020-316300
- Aug 29, 2020
- British Journal of Ophthalmology
Generative adversarial networks (GANs)1 are deep learning (DL) methods, which are in turn a type of machine learning. In recent years, DL methods have been applied extensively in medicine and...
- Book Chapter
142
- 10.1007/978-3-030-58595-2_43
- Jan 1, 2020
We propose a novel Generative Adversarial Network (XingGAN or CrossingGAN) for person image generation tasks, i.e., translating the pose of a given person to a desired one. The proposed Xing generator consists of two generation branches that model the person’s appearance and shape information, respectively. Moreover, we propose two novel blocks to effectively transfer and update the person’s shape and appearance embeddings in a crossing way to mutually improve each other, which has not been considered by any other existing GAN-based image generation work. Extensive experiments on two challenging datasets, i.e., Market-1501 and DeepFashion, demonstrate that the proposed XingGAN advances the state-of-the-art performance both in terms of objective quantitative scores and subjective visual realness. The source code and trained models are available at https://github.com/Ha0Tang/XingGAN.
- Abstract
2
- 10.1016/j.ejmp.2019.09.126
- Dec 1, 2019
- Physica Medica
45 A comparison of pseudo-CT generation methods for prostate MRI-based dose planning: deep learning, patch-based, atlas-based and bulk-density methods
- Research Article
- 10.1155/2024/7498160
- Jan 1, 2024
- Journal of Sensors
Image restoration is a critical task in computer vision that involves recovering an original image from corrupted or damaged versions of it. Traditional image restoration relies on interpolation and completion techniques, such as Navier-Stokes equations and the fast multipole boundary element method. However, these methods often fail to capture the high-level semantic information of an image, leading to inaccurate restoration results. In recent years, generative adversarial networks (GANs) have emerged as a practical approach for image restoration in computer vision, as they can address restoration issues related to damaged or missing image information. GANs are a deep-learning network structure with a generator and a discriminator: in image restoration, the generator restores damaged images while the discriminator assesses the authenticity of the repair results, and the competition between them improves the quality of the repairs. This study proposes a novel generative image restoration method that strengthens GANs with contextual and perceptual semantic-information mechanisms. Our approach demonstrates the effectiveness of GANs in restoring images by learning to fuse the missing or damaged parts of an image with the surrounding undamaged parts, producing visually appealing restoration results.
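The generator/discriminator competition described above is typically trained with the standard (non-saturating) GAN losses; below is a minimal sketch of those losses evaluated on precomputed discriminator probabilities, as an illustration rather than this paper's exact objective:

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-12):
    """Standard (non-saturating) GAN losses on discriminator probabilities:
    the discriminator pushes real scores toward 1 and restored (fake)
    scores toward 0, while the generator pushes fake scores toward 1."""
    d_loss = -(np.log(d_real + eps) + np.log(1.0 - d_fake + eps)).mean()
    g_loss = -np.log(d_fake + eps).mean()
    return d_loss, g_loss

# Example: the discriminator is fairly confident (real=0.9, fake=0.1),
# so its loss is small while the generator's loss is large.
d_loss, g_loss = gan_losses(np.array([0.9]), np.array([0.1]))
```

In the restoration setting described, `d_fake` would be the discriminator's score on a repaired image region, and a reconstruction/perceptual term would normally be added to the generator objective alongside `g_loss`.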
- Research Article
3
- 10.1148/ryai.2021210125
- Sep 1, 2021
- Radiology: Artificial Intelligence
Radiology Alchemy: GAN We Do It?