Generative Adversarial Networks (GANs) have recently become one of the most promising directions in deep learning for generative tasks. Using architectures such as convolutional neural networks (CNNs), GANs can learn relationships in input data without human intervention and generate new examples that resemble the original data set. This paper examines what GANs are, how they work, and the main types of GAN models, with a special focus on their applications in image synthesis, style transfer, and text-to-image synthesis. The GAN framework consists of two neural networks: the generator and the discriminator. The generator's goal is to produce data realistic enough that the discriminator cannot tell it apart from real samples, while the discriminator's goal is to distinguish real data from generated data. Variants such as the vanilla GAN, the conditional GAN (CGAN), the deep convolutional GAN (DCGAN), the Laplacian pyramid GAN (LAPGAN), and the super-resolution GAN (SRGAN) have all brought significant improvements in producing convincing images. The structures of the generator model, which produces data from noise, and the discriminator model, which judges whether the generated data is valid, are also considered. The study focuses on the loss functions used in training GANs, including the generator loss, the discriminator loss, and the minimax loss. Preprocessing the data for GAN training is a crucial step that requires careful consideration as well as data cleaning. When content and adversarial loss functions are used together, it becomes easier to balance the photorealism of generated images against their structural similarity to the source material. With careful, iterative training, GANs achieve high performance in sample generation, advancing the state of the art in generative modeling. This overview will benefit researchers and practitioners by providing an understanding of this area of artificial intelligence and generative modeling.
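For reference, the minimax loss mentioned above is commonly expressed through the following value function; this is the standard formulation from the original GAN framework, and the notation here (D for the discriminator, G for the generator, p_data and p_z for the data and noise distributions) is ours rather than taken from the paper:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]

The discriminator is trained to maximize V(D, G), while the generator is trained to minimize it, which corresponds to the adversarial roles of the two networks described in the abstract.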