Abstract

The paper presents the architectural features, training process, and application scope of generative deep learning models. The main tasks of such models include data generation (images, music, text, video), style transfer between data samples, data quality enhancement, data clustering, and anomaly detection. It is noted that the outputs of generative models are commonly used for entertainment purposes. In addition, they can serve as training data for other machine learning models, as sources of new ideas for creative professions, and as tools for anonymizing sensitive data. The article analyzes the advantages and disadvantages of basic generative models such as autoencoders, variational autoencoders, generative adversarial networks (GAN), Wasserstein GAN (WGAN), StyleGAN, StyleGAN2, and BigGAN. The paper also describes a step-by-step study of implementing a generative model, using WGAN as an example, which covers the basic architecture and more complex elements. Examples of such elements are conditional generation, which adds the ability to select the desired class, and the bilinear sampling algorithm, which addresses the so-called ‘checkerboard effect’. The final model, created as a result of the study and named CWGAN-GP_128, is capable of generating realistic images of dandelions and marigolds at a resolution of 128x128 pixels. The model was trained on the authors' dataset, which consists of 900 photos (450 per class). During training, the images are augmented with affine transformations such as rotations and inversions. It is emphasized that although the results of generative models are often easy to evaluate visually, the rapid progress of GANs makes the problem of automating quality assessment of generated data increasingly pressing. The final model is publicly available, and the results are accessible on the authors' website thisflowerdoesnotexist.herokuapp.com.
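The ‘checkerboard effect’ mentioned above can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation. It assumes the effect comes from the uneven overlap of a strided transposed convolution, and it assumes that bilinear sampling refers to interpolation-based upsampling, reduced here to its 1-D linear analogue. The function names `transposed_conv_overlap` and `linear_upsample_1d` are hypothetical and introduced only for illustration.

```python
def transposed_conv_overlap(n_in, kernel, stride):
    """Count how many input positions write into each output position
    of a 1-D transposed convolution without padding (illustrative helper)."""
    n_out = (n_in - 1) * stride + kernel
    counts = [0] * n_out
    for i in range(n_in):
        for k in range(kernel):
            counts[i * stride + k] += 1
    return counts


def linear_upsample_1d(values, factor=2):
    """Upsample by linear interpolation, the 1-D analogue of bilinear
    upsampling; the end points of the input are preserved."""
    n_out = (len(values) - 1) * factor + 1
    out = []
    for j in range(n_out):
        pos = j / factor                 # position in input coordinates
        lo = int(pos)                    # left neighbour
        hi = min(lo + 1, len(values) - 1)  # right neighbour, clamped
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


# Kernel size 3 with stride 2: the overlap alternates between 1 and 2,
# which is the 1-D counterpart of a checkerboard pattern.
overlap = transposed_conv_overlap(n_in=4, kernel=3, stride=2)

# Interpolation weights always sum to 1, so a constant signal stays constant.
flat = linear_upsample_1d([1, 1, 1], factor=2)
ramp = linear_upsample_1d([0, 2, 4], factor=2)
```

In this sketch the transposed convolution covers output positions unevenly whenever the kernel size is not divisible by the stride, whereas interpolation treats every output position uniformly. A common remedy is therefore to upsample by interpolation first and then apply an ordinary convolution.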
