Infrared and visible image fusion based on edge-preserving and attention generative adversarial network


Similar Papers
  • Conference Article
  • Cited by: 4
  • 10.1109/icsai53574.2021.9664094
Vehicle Trajectory Prediction Based on Attention Mechanism and GAN
  • Nov 13, 2021
  • Yi Wang + 3 more

To address the problem that the Social Generative Adversarial Network (SGAN) cannot fully extract the hidden state of vehicle movement and does not capture sufficient interaction information between vehicles, a vehicle trajectory prediction model, the Attentive Generative Adversarial Network (AGAN), based on the attention mechanism and the generative adversarial network, is proposed. The historical attention mechanism computes the focus of the vehicle over its historical hidden states, and the social attention mechanism computes the weight of the influence of surrounding vehicles on the target vehicle. Combining the historical and social attention mechanisms yields vehicle movement information that captures both temporal and spatial influencing factors. With the generative adversarial network providing global joint training, the model can generate future trajectories that conform to physical constraints and social norms. Experiments show that, compared with the SGAN model, AGAN reduces the ADE and FDE indices by 4.4% and 3.8%, respectively.
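As a rough illustration (hypothetical code, not the authors' implementation), combining a historical attention context over the target vehicle's past hidden states with a social attention context over neighboring vehicles' states might look like:

```python
import math

def softmax(scores):
    # Normalize raw attention scores into weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combined_context(hidden_states, neighbor_states, h_scores, s_scores):
    """Blend a historical context (weighted sum over the target's past
    hidden states) with a social context (weighted sum over surrounding
    vehicles' states), so the result carries both temporal and spatial
    influencing factors."""
    hw = softmax(h_scores)   # historical attention weights
    sw = softmax(s_scores)   # social attention weights
    dim = len(hidden_states[0])
    hist = [sum(w * h[i] for w, h in zip(hw, hidden_states)) for i in range(dim)]
    soc = [sum(w * n[i] for w, n in zip(sw, neighbor_states)) for i in range(dim)]
    return [a + b for a, b in zip(hist, soc)]
```

Here `h_scores` and `s_scores` stand in for learned attention scores; the real model would produce them from recurrent hidden states rather than take them as inputs.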

  • Research Article
  • Cited by: 19
  • 10.3390/plants12173105
High-Accuracy Maize Disease Detection Based on Attention Generative Adversarial Network and Few-Shot Learning.
  • Aug 29, 2023
  • Plants
  • Yihong Song + 7 more

This study addresses the problem of maize disease detection in agricultural production, proposing a high-accuracy detection method based on Attention Generative Adversarial Network (Attention-GAN) and few-shot learning. The method introduces an attention mechanism, enabling the model to focus more on the significant parts of the image, thereby enhancing model performance. Concurrently, data augmentation is performed through Generative Adversarial Network (GAN) to generate more training samples, overcoming the difficulties of few-shot learning. Experimental results demonstrate that this method surpasses other baseline models in accuracy, recall, and mean average precision (mAP), achieving 0.97, 0.92, and 0.95, respectively. These results validate the high accuracy and stability of the method in handling maize disease detection tasks. This research provides a new approach to solving the problem of few samples in practical applications and offers valuable references for subsequent research, contributing to the advancement of agricultural informatization and intelligence.

  • Research Article
  • Cited by: 23
  • 10.1186/s40494-023-00882-y
EA-GAN: restoration of text in ancient Chinese books based on an example attention generative adversarial network
  • Mar 1, 2023
  • Heritage Science
  • Zheng Wenjun + 4 more

Ancient Chinese books are of great significance to historical research and cultural inheritance. Unfortunately, many of these books have been damaged and corroded during long-term transmission, and restoration through digital preservation is a new method of conserving them. Traditional character restoration methods ensure the visual consistency of character images through character features and the pixels around the damaged area. However, reconstructing characters often causes errors, especially when large damage occurs in critical locations. Inspired by humans' imitative writing behavior, a two-branch character restoration network, EA-GAN (Example Attention Generative Adversarial Network), is proposed, which is based on a generative adversarial network and fuses reference examples. By referring to the features of an example character, a damaged character can be restored accurately even when the damaged area is large. EA-GAN first uses two branches to extract the features of the damaged and example characters. Then, the damaged character is restored according to neighborhood information and the features of the example character at different scales during the up-sampling stage. To solve the problems that arise when the example and damaged character features are not aligned and the convolution receptive field is too small, an Example Attention block is proposed to assist restoration. Qualitative and quantitative experiments are carried out on a self-built dataset, MSACCSD, and on real scene pictures. Compared with current inpainting networks, EA-GAN recovers the correct text structure through the guidance of the additional example in the Example Attention block. The peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) values increased by 9.82% and 1.82%, respectively, and the learned perceptual image patch similarity (LPIPS) values computed with the Visual Geometry Group (VGG) network and AlexNet decreased by 35.04% and 16.36%, respectively. Our method obtained better results than current inpainting methods and also restores untrained characters well, which is helpful for the digital preservation of ancient Chinese books.
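For reference, the PSNR metric reported above compares a restored image with its ground truth; a minimal sketch (grayscale images as flat pixel lists, not the paper's code):

```python
import math

def psnr(img_a, img_b, peak=255.0):
    # Peak signal-to-noise ratio between two equal-sized grayscale images:
    # 10 * log10(peak^2 / MSE). Higher is better; identical images give inf.
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)
```

A maximally wrong 8-bit image (every pixel off by 255) scores exactly 0 dB, which is why typical restoration results land in the 20–40 dB range.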

  • Research Article
  • Cited by: 10
  • 10.1007/s41651-024-00187-z
AU3-GAN: A Method for Extracting Roads from Historical Maps Based on an Attention Generative Adversarial Network
  • Jul 16, 2024
  • Journal of Geovisualization and Spatial Analysis
  • Yao Zhao + 4 more

In recent years, the integration of deep learning technology based on convolutional neural networks (CNNs) with historical maps has made it possible to automatically extract roads from these maps, which is highly important for studying the evolution of transportation networks. However, the similarity between roads and other features (such as contours, water systems, and administrative boundaries) poses a significant challenge to the feature extraction capabilities of CNNs. Additionally, CNNs require a large quantity of labelled data for training, which can be a complex issue for historical maps. To address these limitations, we propose a method for extracting roads from historical maps based on an attention generative adversarial network. This approach leverages the unique architecture and training methodology of the generative adversarial network to augment datasets by generating data that closely resembles real samples. Meanwhile, we introduce an attention mechanism to enhance UNet3+ and obtain accurate road segmentation images from historical maps. We validate our method on the Third Military Mapping Survey of Austria-Hungary and compare it with a typical U-shaped network. The experimental results show that our proposed method outperforms the direct use of the U-shaped network, achieving at least an 18.26% increase in F1 and a 7.62% increase in MIoU. This demonstrates its strong ability to extract roads from historical maps and provides a valuable reference for road extraction from other types of historical maps.

  • Research Article
  • 10.1088/1742-6596/1575/1/012079
Single Image De-raining Based on a Novel Enhanced Attentive Generative Adversarial Network
  • Jun 1, 2020
  • Journal of Physics: Conference Series
  • Haochen Zhou + 1 more

With the rapid development of deep learning in artificial intelligence and computer vision, generative adversarial networks play an important role in single-image de-raining. The attentive generative adversarial network (AttGAN) suffers from a complicated network structure and distortion of background color. Considering the relatively complex backgrounds of real images with raindrops, a new single-image de-raining algorithm based on an enhanced attentive generative adversarial network (EAttGAN) is proposed to retain the original background of the blurred image. To restore a more complete background and accelerate network training, a generator enhanced by residual scaling and a Markovian discriminator are effectively fused in the network. Compared with AttGAN, experimental results indicate that EAttGAN not only achieves higher sharpness of a single image but also takes less time in the training process.

  • Research Article
  • Cited by: 12
  • 10.3390/rs15133306
Enhancing Spatial Variability Representation of Radar Nowcasting with Generative Adversarial Networks
  • Jun 28, 2023
  • Remote Sensing
  • Aofan Gong + 5 more

Weather radar plays an important role in accurate weather monitoring and modern weather forecasting, as it can provide timely and refined weather forecasts for the public and for decision makers. Deep learning has been applied in radar nowcasting tasks and has exhibited a better performance than traditional radar echo extrapolation methods. However, current deep learning-based radar nowcasting models are found to suffer from a spatial “blurry” effect that can be attributed to a deficiency in spatial variability representation. This study proposes a Spatial Variability Representation Enhancement (SVRE) loss function and an effective nowcasting model, named the Attentional Generative Adversarial Network (AGAN), to alleviate this blurry effect by enhancing the spatial variability representation of radar nowcasting. An ablation experiment and a comparison experiment were implemented to assess the effect of the generative adversarial (GA) training strategy and the SVRE loss, as well as to compare the performance of the AGAN and SVRE loss function with the current advanced radar nowcasting models. The performances of the models were validated on the whole test set and inspected in two storm cases. The results showed that both the GA strategy and SVRE loss function could alleviate the blurry effect by enhancing the spatial variability representation, which helps the AGAN to achieve better nowcasting performance than the other competitor models. Our study provides a feasible solution for high-precision radar nowcasting applications.

  • Research Article
  • 10.3390/app15084560
DAGANFuse: Infrared and Visible Image Fusion Based on Differential Features Attention Generative Adversarial Networks
  • Apr 21, 2025
  • Applied Sciences
  • Yuxin Wen + 1 more

The purpose of multi-modal visual information fusion is to integrate the data of multiple sensors to generate an image with higher quality, more information, and greater clarity, so that it contains more complementary information and fewer redundant features. Infrared sensors detect the thermal radiation emitted by objects, which is related to their temperature, whereas visible-light sensors generate images by capturing the light that interacts with objects, including reflection, diffusion, and transmission. Because of these different imaging principles, the generated infrared and visible images differ greatly in appearance, which makes it difficult to extract complementary information. Existing methods generally fuse features at the fusion layer by simple concatenation or addition, without considering the intrinsic features of the different modal images or the interaction of features across scales; moreover, they consider only correlation, whereas the image fusion task needs to pay more attention to complementarity. For this reason, we introduce a cross-scale differential features attention generative adversarial fusion network, namely DAGANFuse. In the generator, we designed a cross-modal differential features attention module to fuse the intrinsic content of the different modal images, computing attention weights along parallel paths for differential features and fusion features and performing parallel spatial and channel attention weight calculations on the two paths. In the discriminator, a dual discriminator is used to maintain the information balance between modalities and avoid common problems such as information blurring and loss of texture details. Experimental results show that DAGANFuse achieves state-of-the-art (SOTA) performance and is superior to existing methods in fusion performance.
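As a loose, hypothetical sketch of the idea of weighting fusion by differential features (a toy per-pixel version, not the DAGANFuse architecture, which computes spatial and channel attention over learned feature maps):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def differential_fusion(ir, vis):
    """Toy cross-modal fusion over flat pixel lists: where the two
    modalities disagree most (large absolute difference, i.e. a strong
    differential feature), the fused pixel leans toward the infrared
    response; where they agree, the two are blended evenly."""
    fused = []
    for a, b in zip(ir, vis):
        w = sigmoid(abs(a - b))  # attention weight from the differential feature
        fused.append(w * a + (1.0 - w) * b)
    return fused
```

When both modalities agree the gate sits at 0.5 and the inputs blend evenly; the real module would learn such gating jointly with spatial and channel attention.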

  • Book Chapter
  • 10.1007/978-3-030-89698-0_81
AttnGAN++: Enhencing the Edge of Images on AttnGAN
  • Jan 1, 2022
  • Pingan Qiao + 2 more

Text-to-image synthesis has made great progress, but the extraction of key semantic information and the restoration of edge details in generated images are still imperfect. In this paper, we propose an Attentional Generative Adversarial Network++ (AttnGAN++) model based on AttnGAN that effectively addresses the problems of missing edge information in generated images and insufficient extraction of text features. First, we introduce a Bi-directional Gated Recurrent Unit (BiGRU) model, which ensures sufficient extraction of contextual information when processing long texts; combined with the attention mechanism, it assigns higher weights to the important words of the text. Then, we propose an edge enhancement network consisting of four modules: edge extraction, Residual Dense Block (RDB), edge enhancement fusion, and up-sampling. Edge extraction obtains the edge of the final generated image; the RDB forms a continuous memory mechanism that adaptively learns from previously effective features to enhance feature propagation; and the edge enhancement fusion and up-sampling modules fuse the edge information with global information to generate high-resolution images with clearer edges. Thorough experiments on the CUB dataset demonstrate that the AttnGAN++ model significantly outperforms AttnGAN, boosting the best reported inception score by 3.78% and R-precision by 9.71%, generating clearer image edges and improving image quality.

Keywords: Text-to-image; Attentional generative adversarial network; Edge enhancement network; Bi-directional gated recurrent unit; Residual dense block

  • Research Article
  • Cited by: 6
  • 10.3390/rs14153509
Inverse Synthetic Aperture Radar Imaging Using an Attention Generative Adversarial Network
  • Jul 22, 2022
  • Remote Sensing
  • Yanxin Yuan + 3 more

Traditional inverse synthetic aperture radar (ISAR) imaging uses matched filtering and pulse accumulation. When improving resolution and real-time performance, problems such as a high sampling rate and a large amount of data arise. Although the compressed sensing (CS) method can achieve high-resolution imaging from a small amount of sampled data, the sparse reconstruction algorithm has high computational complexity and is time-consuming, and the imaging result is limited by the model and the sparsity hypothesis. We propose a novel CS-ISAR imaging method using an attention generative adversarial network (AGAN). The generator of AGAN is a modified U-net with both spatial and channel-wise attention. The trained generator learns the imaging operation from down-sampled data to high-resolution ISAR images. Simulations and measured-data experiments validate the advantage of the proposed method.
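A channel-wise attention block of the kind mentioned above can be sketched in a squeeze-and-excitation style (a toy version on plain lists, under the assumption that per-channel gating is what is meant; the paper's module is a learned network):

```python
import math

def channel_attention(feature_maps):
    """Squeeze-and-excitation-style channel attention: each channel
    (a flat list of activations) is rescaled by a sigmoid gate derived
    from its own global average activation, so informative channels
    are emphasized and weak ones suppressed."""
    means = [sum(ch) / len(ch) for ch in feature_maps]          # squeeze
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]         # excite
    return [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]
```

A real block would insert small learned fully connected layers between the squeeze and the gate; spatial attention works analogously but gates positions instead of channels.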

  • Research Article
  • Cited by: 5
  • 10.3390/app11157034
Restoring Raindrops Using Attentive Generative Adversarial Networks
  • Jul 30, 2021
  • Applied Sciences
  • Suhan Goo + 1 more

Artificial intelligence technologies and vision systems are used in various devices, such as automotive navigation systems, object-tracking systems, and intelligent closed-circuit televisions. In particular, outdoor vision systems have been applied across numerous fields of analysis. Despite their widespread use, current systems work well only under good weather conditions; they cannot cope with inclement conditions such as rain, fog, mist, and snow. Images captured under inclement conditions degrade the performance of vision systems, so these systems need to detect, recognize, and remove the noise caused by rain, snow, and mist to boost the performance of their image-processing algorithms. Several studies have targeted the removal of such noise. We focused on eliminating the effects of raindrops on images captured by outdoor vision systems whose cameras were exposed to rain. An attentive generative adversarial network (ATTGAN) was used to remove raindrops from the images. This network is composed of two parts: an attentive-recurrent network and a contextual autoencoder. The ATTGAN generates an attention map to detect rain droplets, from which a de-rained image is produced. We increased the number of attentive-recurrent network layers to prevent gradient sparsity, making generation more stable without preventing the network from converging. The experimental results confirmed that the extended ATTGAN could effectively remove various types of raindrops from images.

  • Research Article
  • Cited by: 4
  • 10.1155/2020/1037021
A Novel Attentive Generative Adversarial Network for Waterdrop Detection and Removal of Rubber Conveyor Belt Image
  • Feb 22, 2020
  • Mathematical Problems in Engineering
  • Xianguo Li + 5 more

The lens used to monitor a rubber conveyor belt easily accumulates a large number of water droplets, which seriously affect image quality and, in turn, the effectiveness of fault monitoring. In this paper, a new method for detecting and removing water droplets on rubber conveyor belts based on an attentive generative adversarial network is proposed to solve this problem. First, the water-droplet image of the rubber conveyor belt is input into a generative network composed of a cyclic visual attentive network and an autoencoder with skip connections, which produces an image with the water droplets removed and an attention map locating them. Then, the generated droplet-free image is evaluated by the attentive discriminant network to assess the local consistency of the recovered area. To better learn the water-droplet regions and their surrounding structures during training, image morphology is applied to refine the precise water-droplet regions. A dewatered rubber-conveyor-belt image is generated by increasing the number of cyclic visual attention network layers and the number of skip-connection layers of the autoencoder. Finally, a large number of comparative experiments prove the effectiveness of the proposed water-droplet removal algorithm, which outperforms Convolutional Neural Network (CNN), Discriminative Sparse Coding (DSC), Layer Prior (LP), and Attention Generative Adversarial Network (ATTGAN) methods.

  • Research Article
  • 10.1007/s42452-025-07306-5
Research on deep learning framework for multi scale information graph generation and visualization enhancement based on self attention generative Adversarial Network
  • Jun 22, 2025
  • Discover Applied Sciences
  • Qian Zhou

With the widespread adoption of Generative Adversarial Networks (GANs) in image generation and processing, enhancing their generation quality and visualization capabilities has become a prominent research focus. This study introduces a deep learning framework that integrates multi-scale information chart generation with visualization enhancement to improve the performance of GAN-based image generation models across various domains. Based on the Self-Attention Generative Adversarial Network (SAGAN), which leverages self-attention mechanisms to capture long-range dependencies in images, the proposed approach significantly enhances image quality and detail representation. The framework incorporates a multi-scale feature extraction method to optimize the feature maps at each layer of the generative network. Experimental results demonstrate that SAGAN outperforms traditional GAN models in terms of image clarity, detail preservation, and visual effects. The proposed model achieves notable improvements in diversity and generalization, with a mutual information content of 0.91, clustering uniformity of 0.89, and inter-cluster dissimilarity of 0.92 on the CelebA dataset. Furthermore, in terms of image quality, SAGAN attains a Structural Similarity Index Measure (SSIM) of 0.94 and a Peak Signal-to-Noise Ratio (PSNR) of 30.1, surpassing traditional GANs by a significant margin.
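The self-attention mechanism that SAGAN builds on can be sketched for a short sequence of feature vectors (a minimal, dependency-free version; the real module applies learned query/key/value projections to flattened convolutional feature maps):

```python
import math

def self_attention(feats):
    """Scaled dot-product self-attention over a list of feature vectors:
    every position attends to every other position, which is what lets
    the model capture long-range dependencies in an image."""
    d = len(feats[0])
    out = []
    for q in feats:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in feats]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        tot = sum(exps)
        w = [e / tot for e in exps]
        # Attention-weighted sum of the value vectors.
        out.append([sum(wj * v[i] for wj, v in zip(w, feats)) for i in range(d)])
    return out
```

With identical inputs the attention weights are uniform and each output equals the input vector; learned projections are what make the weighting selective in practice.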

  • Research Article
  • 10.1038/s41598-024-74229-3
Specular highlight removal by federated generative adversarial network with attention mechanism
  • Oct 8, 2024
  • Scientific Reports
  • Yuanfeng Zheng + 1 more

Specular highlight removal ensures the acquisition of high-quality images and finds important applications in stereo matching, text recognition, and image segmentation. To prevent the leakage of images containing personal information, such as identification card (ID) photos, clients often train specular-highlight-removal models on local data, resulting in a lack of precision and generalization in the trained model. To address this challenge, this paper introduces a new method for removing highlights from images using federated learning (FL) and an attention generative adversarial network (AttGAN). Specifically, the former builds a global model on the central server and updates it by aggregating the model parameters of the clients; this process does not involve the transmission of image data, which enhances client privacy. The latter combines attention mechanisms with a generative adversarial network to improve the quality of highlight removal by focusing on key image regions, yielding more realistic and visually pleasing results. The proposed FL-AttGAN method is numerically evaluated on the SD1, SD2, and RD datasets. The results show that FL-AttGAN outperforms existing methods.
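The server-side aggregation step described above can be sketched in the FedAvg style, weighting each client by its local data size (a hypothetical helper, not the paper's code):

```python
def fed_avg(client_params, client_sizes):
    """Weighted federated averaging: the server combines each client's
    model parameters (flat lists of floats) in proportion to that
    client's local dataset size. Only parameters travel to the server;
    no image data ever leaves a client."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [sum(n * p[i] for n, p in zip(client_sizes, client_params)) / total
            for i in range(dim)]
```

The aggregated vector then replaces the global model and is broadcast back to clients for the next local training round.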

  • Research Article
  • Cited by: 3
  • 10.1109/mmul.2020.3048939
Class-Balanced Text to Image Synthesis With Attentive Generative Adversarial Network
  • Jan 7, 2021
  • IEEE MultiMedia
  • Min Wang + 5 more

Although the text-to-image synthesis task has shown significant progress, it still remains a challenge in generating high-quality images. In this article, we first propose an attention-driven, cycle-refinement generative adversarial network, AGAN-v1, to bridge the domain gap between visual contents and semantic concepts by constructing spatial configurations of objects. The generation of image contours is the core component, in which an attention mechanism is developed to refine local details of images by focusing on the objects that complement one subregion. Second, an advanced class-balanced generative adversarial network, AGAN-v2, is proposed to address the problem of long-tailed data distribution. Importantly, it is the first method to solve this problem in the text-to-image synthesis task. Our AGAN-v2 introduces a reweighting scheme, which adopts the effective number of samples for each class to rebalance the generative loss. Extensive quantitative and qualitative experiments on CUB and MS-COCO datasets demonstrate that the proposed AGAN-v2 significantly outperforms the state-of-the-art methods.
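The reweighting scheme based on the effective number of samples can be sketched as follows (assuming the common formulation E_n = (1 − β^n)/(1 − β) with per-class loss weight proportional to 1/E_n; the paper's exact variant may differ):

```python
def class_balanced_weights(counts, beta=0.999):
    """Class-balanced loss weights from per-class sample counts:
    the effective number of samples E_n = (1 - beta**n) / (1 - beta)
    grows sub-linearly in n, so rare classes get larger weights.
    Weights are normalized to sum to the number of classes."""
    eff = [(1.0 - beta ** n) / (1.0 - beta) for n in counts]
    inv = [1.0 / e for e in eff]
    s = sum(inv)
    k = len(counts)
    return [k * w / s for w in inv]
```

Multiplying each class's generative (or classification) loss term by its weight rebalances training toward the tail of a long-tailed distribution.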

  • Book Chapter
  • Cited by: 4
  • 10.1007/978-3-030-98388-8_35
AttnGAN: Realistic Text-to-Image Synthesis with Attentional Generative Adversarial Networks
  • Jan 1, 2022
  • Shubham Mathesul + 2 more

In this paper, we propose a prototype design for manifold refinement of fine-grained text-to-image generation using an Attentional Generative Adversarial Network (AttnGAN). We concentrate on creating realistic images from text descriptions. We use a collection of Attentional Generative Adversarial Network layers that can correctly select the modal meaning at the word level and the sentence level. Generative Adversarial Networks (GANs) prove to be a fundamental structure for many design applications, from game design and art to science and modelling. We use GANs for contrastive learning and as an information-maximisation approach, and we conduct extensive research to find further advancements in image generation. Our prototype is easy to implement and practical: it chooses the most relevant word vectors and uses them to generate related image sub-regions. In its current state, the prototype generates image designs only for bird species, to demonstrate its image-generation ability. With due consideration to the findings of usability testing, the development team hopes to improve the generated image resolution in future iterations of the application and plans to offer a choice among a variety of created images, with further improvements to the image generation algorithm.

Keywords: GAN; Text-to-image synthesis; Artificial intelligence; Artificial neural networks; DAMSM; Attentional Generative Adversarial Networks
