<p><span>Face synthesis and editing have attracted growing attention with the advancement of generative adversarial networks (GANs). The proposed attentional GAN with deep attentional multimodal similarity model (AttnGAN-DAMSM) focuses on generating high-resolution, realistic images from textual descriptions by removing discriminator components. The attention module builds an attention map over the image and automatically retrieves the relevant features to synthesize its different sub-regions. The DAMSM provides a fine-grained image-text matching loss to the generative networks. In this study, the user first supplies a textual description, and the model generates a photorealistic high-resolution face image whose features match the description with high accuracy. Next, the model fine-tunes selected facial features under user control. The results show that the proposed AttnGAN-DAMSM model is evaluated with the structural similarity index measure (SSIM), feature similarity index measure (FSIM) and Fréchet inception distance (FID) on the CelebFaces Attributes (CelebA) and CUHK face sketch (CUFS) datasets. On the CelebA dataset the SSIM reaches 78.82%, and on the CUFS dataset it reaches 81.45%, confirming more accurate face synthesis and editing compared with existing methods such as GAN, SuperstarGAN and identity-sensitive GAN (IsGAN).</span></p>
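<p><span>To make the evaluation protocol concrete, the sketch below shows one way the SSIM and FID metrics named above can be computed for batches of real and generated face images, using the off-the-shelf scikit-image and torchmetrics libraries. This is an illustrative assumption about the evaluation setup, not the authors' released code; the random tensors are stand-ins for actual CelebA or CUFS batches.</span></p>
<pre><code># Minimal sketch: scoring generated faces with SSIM and FID.
# Assumes scikit-image and torchmetrics (which needs torch-fidelity) are installed;
# this illustrates the metrics from the abstract, not the paper's own pipeline.
import numpy as np
import torch
from skimage.metrics import structural_similarity
from torchmetrics.image.fid import FrechetInceptionDistance


def mean_ssim(real_imgs: np.ndarray, fake_imgs: np.ndarray) -> float:
    """Average SSIM over paired (N, H, W, 3) uint8 image batches."""
    scores = [
        structural_similarity(r, f, channel_axis=-1)
        for r, f in zip(real_imgs, fake_imgs)
    ]
    return float(np.mean(scores))


def compute_fid(real_imgs: torch.Tensor, fake_imgs: torch.Tensor) -> float:
    """FID between (N, 3, H, W) uint8 tensors of real and generated faces."""
    fid = FrechetInceptionDistance(feature=2048)
    fid.update(real_imgs, real=True)
    fid.update(fake_imgs, real=False)
    return float(fid.compute())


# Example with random placeholder data (replace with CelebA / CUFS batches).
real = torch.randint(0, 256, (16, 3, 128, 128), dtype=torch.uint8)
fake = torch.randint(0, 256, (16, 3, 128, 128), dtype=torch.uint8)
print("SSIM:", mean_ssim(real.permute(0, 2, 3, 1).numpy(),
                         fake.permute(0, 2, 3, 1).numpy()))
print("FID: ", compute_fid(real, fake))
</code></pre>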