Abstract

Deep generative models have significantly advanced image generation, enabling generation of visually pleasing images with realistic texture. Apart from the texture, it is the shape geometry of objects that strongly dictates their appearance. However, currently available generative models do not incorporate geometric information into the image generation process. This often yields visual objects of degenerated quality. In this work, we propose a regularized Geometry-Aware Generative Adversarial Network (GAGAN) which disentangles appearance and shape in the latent space. This regularized GAGAN enables the generation of images with both realistic texture and shape. Specifically, we condition the generator on a statistical shape prior. The prior is enforced through mapping the generated images onto a canonical coordinate frame using a differentiable geometric transformation. In addition to incorporating geometric information, this constrains the search space and increases the model’s robustness. We show that our approach is versatile, able to generalise across domains (faces, sketches, hands and cats) and sample sizes (from as little as ∼200-30,000 to more than 200,000). We demonstrate superior performance through extensive quantitative and qualitative experiments in a variety of tasks and settings. Finally, we leverage our model to automatically and accurately detect errors or drifting in facial landmarks detection and tracking in-the-wild.

Highlights

  • While a surge of computational, data-driven methods that rely on variational inference (Kingma and Welling 2014; Rezende et al 2014) and autoregressive modelling have recently been proposed for image generation, it is the introduction of Generative Adversarial Networks (GANs) (Goodfellow et al 2014) that significantly advanced image generation, enabling the creation of imagery with realistic visual texture.

  • Current methods for realistic image generation mainly rely on three types of deep generative models, namely Variational Autoencoders (VAEs), autoregressive models, and Generative Adversarial Networks (GANs).

  • The quantitative results focus in particular on the discriminative ability of the Geometry-Aware Generative Adversarial Network (GAGAN) to verify landmark detections.

Summary

Introduction

International Journal of Computer Vision (2019) 127:824–844.

The generation of realistic images is a longstanding problem in computer vision and graphics with numerous applications, including photo-editing, computer-aided design, image stylisation (Johnson et al 2016; Zhu et al 2017), and image de-noising. While a surge of computational, data-driven methods that rely on variational inference (Kingma and Welling 2014; Rezende et al 2014) and autoregressive modelling (van den Oord et al 2016; Salimans et al 2017) have recently been proposed for image generation, it is the introduction of Generative Adversarial Networks (GANs) (Goodfellow et al 2014) that significantly advanced image generation, enabling the creation of imagery with realistic visual texture. Visual texture (e.g., skin texture of faces, lighting) as well as pose and deformations (e.g., facial expressions, view angle) affect the appearance of a visual object. The interactions of these texture and geometric factors emulate the entangled variability. By sampling from the statistical shape model we can generate faces with arbitrary facial attributes such as facial expression, pose and morphology.
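To make the idea of sampling from a statistical shape model concrete, the following is a minimal sketch. It assumes a linear, PCA-based model fitted to landmark sets that have already been aligned; this excerpt does not specify how the shape model is built, so the construction, the function names `build_shape_model` and `sample_shape`, and the random toy data are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def build_shape_model(shapes, n_components):
    """Fit a linear (PCA) shape model to landmark sets.

    shapes: array of shape (n_samples, n_landmarks, 2), assumed to be
    already aligned (alignment is not performed here).
    Returns the mean shape vector, the principal components (as rows),
    and the variance captured by each component.
    """
    n_samples = shapes.shape[0]
    X = shapes.reshape(n_samples, -1)           # flatten each shape to a vector
    mean = X.mean(axis=0)
    # SVD of the centred data yields the principal directions in Vt
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]
    eigenvalues = (S[:n_components] ** 2) / (n_samples - 1)
    return mean, components, eigenvalues


def sample_shape(mean, components, eigenvalues, rng):
    """Draw shape parameters from a Gaussian prior and decode them to landmarks."""
    params = rng.standard_normal(len(eigenvalues)) * np.sqrt(eigenvalues)
    shape = mean + params @ components          # linear decoding
    return shape.reshape(-1, 2), params


# Toy usage with random "landmarks" (hypothetical data, for illustration only)
rng = np.random.default_rng(0)
toy_shapes = rng.standard_normal((50, 5, 2))
mean, components, eigenvalues = build_shape_model(toy_shapes, n_components=3)
new_shape, params = sample_shape(mean, components, eigenvalues, rng)
print(new_shape.shape, params.shape)
```

In such a model, each sampled parameter vector corresponds to one plausible shape, so varying the parameters varies attributes such as pose or expression while staying within the span of the learned components.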

Generative Models for Image Generation
Statistical Shape Models
Geometry-Aware GAN
Building the Shape Model
Enforcing the Geometric Prior
Local Appearance Preservation
Data Augmentation with Perturbations
Experimental Setting
Generality of GAGAN
Limitations
Qualitative Results
Quantitative Results
Improvement Through Regularisation
Ablation Study
Conclusion
