Abstract

The performance of recognition systems can be significantly degraded by several confounding factors, chief among them facial expression, head pose, and lighting changes. To minimize their impact, we propose a method that generates high-quality images tailored to the target domain. Our objective is to use disentangled representations to model the decomposition of data variations and to generate neutral-expression face images with frontal pose and adapted illumination. To achieve this, we incorporate 3D priors into the adversarial learning during training, simulating the generation of an analytical 3D face deformation as well as the rendering operation. Additionally, we employ contrastive learning to control the disentanglement of the generated faces while preserving the essential properties of facial features. This lets us learn an embedding space in which similar data samples are represented close together, while dissimilar samples are kept far apart. Furthermore, we analyze the learned latent space and introduce several additional properties that reinforce factor disentanglement, including an imitation learning algorithm that facilitates the acquisition of meaningful patterns and characteristics.
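The contrastive objective described above (similar samples pulled together, dissimilar samples pushed apart) can be illustrated with a minimal sketch. This is not the paper's exact loss; it is a generic pairwise contrastive loss in the style of Hadsell et al., with the function name, margin value, and embeddings chosen here purely for illustration.

```python
import numpy as np

def contrastive_pair_loss(z1, z2, same, margin=1.0):
    """Pairwise contrastive loss (illustrative, not the paper's objective).

    Similar pairs (same=True) are penalized by their squared distance,
    pulling their embeddings together; dissimilar pairs (same=False) are
    penalized only when they sit closer than `margin`, pushing them apart.
    """
    d = np.linalg.norm(z1 - z2)          # Euclidean distance in embedding space
    if same:                              # similar pair: minimize the distance
        return 0.5 * d ** 2
    # dissimilar pair: penalize only if closer than the margin
    return 0.5 * max(margin - d, 0.0) ** 2
```

With this formulation, a matching pair with identical embeddings incurs zero loss, and a non-matching pair already separated by more than the margin also incurs zero loss, so gradients act only on pairs that violate the desired geometry.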
