Abstract
Recent advances in generative models, driven by the sheer volume of available training data, allow high-quality generation of facial expressions without the laborious procedure of annotating them. However, these generative processes bear little relation to the psychologically conceptualised dimensional plane, i.e., the two-dimensional Arousal-Valence plane, and therefore produce facial expressions that are not psychologically interpretable. In this research, we present a novel generative model that learns psychologically compatible (low-dimensional) representations of facial expressions, permitting the generation of facial expressions along the psychologically conceptualised Arousal-Valence dimensions. To generate Arousal-Valence compatible facial expressions, we resort to a novel form of data-driven generative model, the encapsulated variational auto-encoders (EVAE), which consists of two connected variational auto-encoders. The two variational auto-encoders in our EVAE model are connected through a tuneable continuous hyper-parameter, which bounds the learning of EVAE. Since this tuneable hyper-parameter, along with the linearly sampled inputs, largely determines the process of generating facial expressions, we hypothesise a correspondence between the continuous scales of the hyper-parameter and the sampled inputs on the one hand, and the psychologically conceptualised Arousal-Valence dimensions on the other. For empirical validation, two publicly released facial expression datasets, namely the Frey faces and FERG-DB datasets, were employed to evaluate the dimensional generative performance of our proposed EVAE. Across the two datasets, the facial expressions generated along our two hypothesised continuous scales were observed to be consistent with the psychologically conceptualised Arousal-Valence dimensions.
Applying our proposed EVAE model to the Frey faces and FERG-DB facial expression datasets, we demonstrate the feasibility of generating facial expressions along the conceptualised Arousal-Valence dimensions. In conclusion, to generate facial expressions along the psychologically conceptualised Arousal-Valence dimensions, we propose a novel type of generative model, the encapsulated variational auto-encoders (EVAE), which allows the generation process to be disentangled into two tuneable continuous factors. Validating the model on two publicly available facial expression datasets, we demonstrate the association between these factors and the Arousal-Valence dimensions in facial expression generation, deriving a data-driven Arousal-Valence plane for affective computing. Although the work is at an embryonic stage, our research may shed light on the prospect of continuous, dimensional affective computing.
Highlights
Dimensional Expression Generation: facial expression is one of the primary channels through which human beings express emotions and sentiments
We propose a novel form of variational auto-encoder, the encapsulated variational auto-encoders (EVAE), tailored for learning data-driven dimensional affective representations of facial expressions
The EVAE automatically learns psychologically conceptualised Arousal-Valence (A-V) dimensional representations of human facial expressions
Summary
Facial expression is one of the primary channels through which human beings express emotions and sentiments. A novel form of VAE that allows the encoding of disentangled latent representations and derives two continuous scales corresponding to the arousal and valence dimensions is in demand. To this end, we propose a novel form of variational auto-encoder, the encapsulated variational auto-encoders (EVAE), tailored for learning data-driven dimensional affective representations of facial expressions. This trend calls for the release of large-scale dimensionally annotated affect datasets, e.g., Aff-Wild [12], SEMAINE [13], and AFFECTNET [14]. Relying on these datasets, several previous studies were able to revise the original discrete classification models into regression models that output continuous arousal and valence values [15, 16]. A continuous scale of is hypothesised here to represent the data-driven arousal dimension.
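The idea of generating expressions along two continuous scales can be illustrated with a minimal sketch. This is not the EVAE itself: `make_toy_decoder` is a hypothetical stand-in (a random linear map followed by a sigmoid) for a trained decoder, and the names `z_values`, `alpha_values`, the value ranges, and the image size are all assumptions made for illustration. The sketch only shows how sweeping a linearly sampled input and a tuneable hyper-parameter yields a two-dimensional grid of generated images; which scale corresponds to arousal and which to valence is not asserted here.

```python
import numpy as np

IMAGE_SHAPE = (28, 20)  # illustrative image size (assumption)


def make_toy_decoder(seed=0):
    """Return a hypothetical decoder; a real one would come from a trained model."""
    rng = np.random.default_rng(seed)
    n_pixels = IMAGE_SHAPE[0] * IMAGE_SHAPE[1]
    # Random weights mapping (z, alpha, bias) to pixel logits.
    weights = rng.normal(size=(n_pixels, 3))

    def decode(z, alpha):
        features = np.array([z, alpha, 1.0])
        logits = weights @ features
        # Sigmoid keeps pixel intensities in (0, 1).
        return (1.0 / (1.0 + np.exp(-logits))).reshape(IMAGE_SHAPE)

    return decode


def generate_grid(decode, z_values, alpha_values):
    """Decode one image per (alpha, z) pair: rows follow alpha, columns follow z."""
    return np.array([[decode(z, a) for z in z_values] for a in alpha_values])


decode = make_toy_decoder()
z_values = np.linspace(-2.0, 2.0, 7)      # linearly sampled inputs (assumed range)
alpha_values = np.linspace(0.0, 1.0, 5)   # tuneable hyper-parameter (assumed range)
grid = generate_grid(decode, z_values, alpha_values)
print(grid.shape)  # (5, 7, 28, 20)
```

With a trained decoder in place of the toy one, each row and column of `grid` could be inspected visually to judge whether expressions change smoothly along each scale.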