Abstract

Image synthesis from natural-language descriptions has become a research hotspot at the intersection of edge computing and artificial intelligence. With the help of generative adversarial networks (GANs), the field has made great strides in high-resolution image synthesis. However, synthesized single-target images still suffer from authenticity defects: for example, synthesized bird images may exhibit abnormalities such as “multiple heads” and “multiple mouths.” To address such problems, SA-AttnGAN, a single-target text-to-image model based on a self-attention mechanism, is proposed. SA-AttnGAN, built on AttnGAN (Attentional Generative Adversarial Network), refines text features into word features and sentence features to improve the semantic alignment of text and images; in the initialization stage of AttnGAN, a self-attention mechanism is used to improve the stability of the text-to-image model; and multiple stacked GAN stages are used to progressively synthesize high-resolution images. Experimental results show that SA-AttnGAN outperforms comparable models in terms of Inception Score and Fréchet Inception Distance. Analysis of the synthesized images shows that the model learns background and colour information and correctly captures the structure of birds’ heads, mouths, and other body parts, whereas the original AttnGAN generates incorrect images with “multiple heads” and “multiple mouths.” Furthermore, SA-AttnGAN is successfully applied to description-based clothing image synthesis, demonstrating good generalization ability.
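The self-attention mechanism the abstract refers to is, in broad strokes, the SAGAN-style formulation in which every spatial location of a generator feature map attends to every other location, helping the network keep globally consistent structure (e.g. one head, one mouth). The sketch below is a minimal NumPy illustration under stated assumptions: the 1×1 convolutions that produce queries, keys, and values are replaced by plain weight matrices, there is no batch dimension, and the function name, weight shapes, and the `gamma` residual scaling are illustrative choices, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wf, Wg, Wh, gamma=0.1):
    """SAGAN-style self-attention over one feature map (a sketch).

    x:  (C, H, W) feature map.
    Wf, Wg: (C // 8, C) query/key projections (stand-ins for 1x1 convs).
    Wh: (C, C) value projection.
    gamma: learnable residual weight in the real model; a constant here.
    Returns a feature map of the same shape in which each spatial
    location is a weighted sum over all locations.
    """
    C, H, W = x.shape
    flat = x.reshape(C, H * W)              # (C, N) with N = H*W locations
    f = Wf @ flat                           # queries (C//8, N)
    g = Wg @ flat                           # keys    (C//8, N)
    h = Wh @ flat                           # values  (C, N)
    energy = g.T @ f                        # (N, N): energy[j, i] = g_j . f_i
    attn = softmax(energy, axis=-1)         # row j: weights over all i
    out = h @ attn.T                        # out[:, j] = sum_i attn[j, i] * h[:, i]
    return (gamma * out + flat).reshape(C, H, W)
```

With `gamma = 0` the block reduces to the identity, which is why the residual weight can start near zero and let the network learn how much global context to mix in.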

