Abstract

Over the past few years, Generative Adversarial Networks (GANs) have garnered increased interest among researchers in Computer Vision, with applications including, but not limited to, image generation, translation, imputation, and super-resolution. Nevertheless, no GAN-based method has been proposed in the literature that can successfully represent, generate, or translate 3D facial shapes (meshes). This can be primarily attributed to two facts, namely that (a) publicly available 3D face databases are scarce as well as limited in terms of sample size and variability (e.g., few subjects, little diversity in race and gender), and (b) mesh convolutions for deep networks present several challenges that are not entirely tackled in the literature, leading to operator approximations and model instability, often failing to preserve high-frequency components of the distribution. As a result, linear methods such as Principal Component Analysis (PCA) have been mainly utilized for 3D shape analysis, despite being unable to capture non-linearities and high-frequency details of the 3D face, such as eyelid and lip variations. In this work, we present 3DFaceGAN, the first GAN tailored towards modeling the distribution of 3D facial surfaces, while retaining the high-frequency details of 3D face shapes. We conduct an extensive series of both qualitative and quantitative experiments, where the merits of 3DFaceGAN are clearly demonstrated against other, state-of-the-art methods in tasks such as 3D shape representation, generation, and translation.
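To make the linear baseline concrete, the following is a minimal sketch, assuming registered meshes with a fixed vertex count, of how PCA is typically applied to 3D shapes: each mesh is flattened into one long coordinate vector and PCA fits a linear subspace. This is not the paper's code; all names, shapes, and component counts are illustrative assumptions.

```python
# Minimal sketch of a linear PCA shape model on 3D meshes (illustrative only).
import numpy as np
from sklearn.decomposition import PCA

n_meshes, n_vertices = 1000, 5000                   # hypothetical dataset size
meshes = np.random.randn(n_meshes, n_vertices, 3)   # stand-in for registered 3D faces

X = meshes.reshape(n_meshes, -1)       # flatten each mesh to a 15000-dim vector
pca = PCA(n_components=50)             # keep 50 linear shape components
codes = pca.fit_transform(X)           # low-dimensional shape representations
recon = pca.inverse_transform(codes)   # linear reconstruction of each mesh

# Being a linear projection, the reconstruction smooths away high-frequency
# detail (e.g., eyelid and lip variations), which is the limitation that
# motivates a non-linear, GAN-based model.
print("mean per-coordinate reconstruction error:", np.abs(recon - X).mean())
```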

Highlights

  • Generative Adversarial Networks (GANs) are a promising unsupervised machine learning methodology implemented by a system of two deep neural networks competing against each other in a zero-sum game framework (Goodfellow et al 2014)

  • Despite the great success GANs have had in 2D image/video generation, representation, and translation, no GAN method tailored towards tackling the aforementioned tasks in 3D shapes has been introduced in the literature

  • We study the task of representation, generation, and translation of 3D facial surfaces using GANs

Summary

Introduction

GANs are a promising unsupervised machine learning methodology implemented by a system of two deep neural networks competing against each other in a zero-sum game framework (Goodfellow et al 2014). Despite the great success GANs have had in 2D image/video generation, representation, and translation, no GAN method tailored towards tackling the aforementioned tasks in 3D shapes has been introduced in the literature. This is primarily attributed to the lack of appropriate decoder networks for meshes that are able to retain high-frequency details (Dosovitskiy and Brox 2016; Jackson et al 2017). Geometric deep learning methods are not yet suitable for the problem we study in this paper. Another way to work with 3D meshes is to concatenate the coordinates of the 3D points into a 1D vector and utilize fully connected layers to correctly decode the structure of the point cloud (Fan et al 2017; Qi et al 2017).
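As a concrete illustration of both ideas above, the sketch below, assuming a PyTorch setup, pairs a zero-sum GAN objective with a generator that decodes a latent code into flattened vertex coordinates through fully connected layers. The layer sizes, latent dimension, and vertex count are illustrative assumptions, not the 3DFaceGAN architecture.

```python
# Illustrative sketch: a GAN whose generator is a fully connected decoder that
# emits N x 3 mesh vertices as one flattened vector (not the paper's model).
import torch
import torch.nn as nn

N_VERTICES, LATENT = 5000, 128  # hypothetical mesh resolution and latent size

# Generator: fully connected decoder mapping a latent code to flattened vertices.
G = nn.Sequential(
    nn.Linear(LATENT, 1024), nn.ReLU(),
    nn.Linear(1024, N_VERTICES * 3),
)

# Discriminator: scores flattened meshes as real or generated.
D = nn.Sequential(
    nn.Linear(N_VERTICES * 3, 1024), nn.LeakyReLU(0.2),
    nn.Linear(1024, 1),
)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, N_VERTICES * 3)  # stand-in for a batch of real meshes
z = torch.randn(32, LATENT)

# Discriminator step: push D(real) toward 1 and D(fake) toward 0.
fake = G(z).detach()
loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: the zero-sum game, G tries to make D label its output real.
loss_g = bce(D(G(z)), torch.ones(32, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Naively decoding meshes this way is exactly what tends to lose high-frequency detail, which is the gap the paper's method is designed to address.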

Objective Function
Training Procedure
Experiments
The Hi-Lo database
Data Preprocessing
Training
Baseline Models
Error Metric
Ablation Study
Multi-label 3D Face Translation
Cross-Dataset Attribute Transfer
Multi-label 3D Face Generation
Findings
Conclusion