Abstract

Current AI systems have shown impressive results on the task of automatically synthesizing realistic images from text descriptions. Generative Adversarial Networks (GANs) are widely used for text-to-image generation: the generator synthesizes realistic images from noise and sentence vectors, while the discriminator estimates the probability that a synthesized image is real. In this paper, in order to generate images from Arabic text, we combine DF-GAN, a simple and efficient text-to-image generation framework, with the AraBERT architecture. To this end, we first create new datasets suited to the Arabic text-to-image generation task by translating the text descriptions of the original datasets from English to Arabic with DeepL Translator. Second, we leverage AraBERT, which is trained on billions of Arabic words, to produce strong sentence embeddings, and we reduce the embedding dimension to match the shape expected by DF-GAN. Third, we inject the reduced sentence embedding into the UPBlocks of DF-GAN and train the proposed architecture on two challenging datasets. Following previous work, we use CUB and Oxford-102 Flowers as the original datasets, and we evaluate our framework with the Fréchet Inception Distance (FID) and the Inception Score (IS). To the best of our knowledge, our framework is the first to achieve substantial success in generating high-resolution, realistic, text-matching images conditioned on Arabic text.
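To make the embedding step concrete, the following is a minimal sketch of how an Arabic caption could be encoded with AraBERT and projected down to a DF-GAN-compatible sentence vector. The checkpoint name aubmindlab/bert-base-arabertv2, the use of the [CLS] hidden state as the sentence embedding, and the 256-dimensional target size (DF-GAN's default sentence-embedding width) are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

# Assumed AraBERT checkpoint; AraBERT models are published under the
# aubmindlab namespace on the HuggingFace Hub.
MODEL_NAME = "aubmindlab/bert-base-arabertv2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
arabert = AutoModel.from_pretrained(MODEL_NAME)
arabert.eval()

# Assumed target size: DF-GAN conditions its UPBlocks on a 256-dim sentence
# vector, so we project AraBERT's 768-dim hidden state down to 256.
reduce_dim = nn.Linear(arabert.config.hidden_size, 256)

def embed_sentence(arabic_caption: str) -> torch.Tensor:
    """Encode an Arabic caption into a reduced sentence embedding."""
    inputs = tokenizer(arabic_caption, return_tensors="pt",
                       truncation=True, max_length=64)
    with torch.no_grad():
        outputs = arabert(**inputs)
    # Take the [CLS] token state as the sentence embedding (one common
    # choice; mean pooling over tokens is an alternative).
    cls_state = outputs.last_hidden_state[:, 0, :]  # shape: (1, 768)
    return reduce_dim(cls_state)                    # shape: (1, 256)

sent_emb = embed_sentence("طائر صغير بريش أزرق وأجنحة قصيرة")
print(sent_emb.shape)  # torch.Size([1, 256])
```

In training, the projection layer would be optimized jointly with the generator so that the reduced embedding adapts to DF-GAN's conditioning blocks, while the pretrained AraBERT weights could be kept frozen or fine-tuned.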
