Abstract
Automatic affect analysis has attracted great interest in various contexts including the recognition of action units and basic or non-basic emotions. In spite of major efforts, there are several open questions on which cues are important for interpreting facial expressions and how to encode them. In this paper, we review the progress across a range of affect recognition applications to shed light on these fundamental questions. We analyse the state-of-the-art solutions by decomposing their pipelines into fundamental components, namely face registration, representation, dimensionality reduction and recognition. We discuss the role of these components and highlight the models and new trends that are followed in their design. Moreover, we provide a comprehensive analysis of facial representations by uncovering their advantages and limitations; we elaborate on the type of information they encode and discuss how they deal with the key challenges of illumination variations, registration errors, head-pose variations, occlusions, and identity bias. This survey allows us to identify open issues and to define future directions for designing real-world affect recognition systems.
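To make the decomposition concrete, the following is a minimal sketch of such a four-stage pipeline, assuming OpenCV for face detection and scikit-learn for dimensionality reduction and classification; the specific choices (a Haar-cascade detector, raw-pixel features, PCA and a linear SVM) are illustrative placeholders, not the configuration recommended by the survey.

```python
# Sketch of the four-stage affect recognition pipeline discussed in the abstract:
# registration -> representation -> dimensionality reduction -> recognition.
# All component choices below are illustrative placeholders.
import cv2
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def register_face(image_bgr, size=(64, 64)):
    """Registration: detect the face and crop/rescale it to a canonical frame."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest detection
    return cv2.resize(gray[y:y + h, x:x + w], size)

def represent(face):
    """Representation: here simply the vector of raw pixel intensities."""
    return face.astype(np.float32).ravel() / 255.0

def build_recogniser(X, y, n_components=50):
    """Dimensionality reduction (PCA) and recognition (linear SVM), trained on
    stacked representations X and expression labels y (e.g. AU or emotion codes)."""
    model = make_pipeline(PCA(n_components=n_components), LinearSVC())
    model.fit(X, y)
    return model
```

In practice each stage admits many alternatives, and the survey reviews these alternatives in turn.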
Highlights
The production, perception and interpretation of facial expressions have been analysed for a long time across various disciplines such as biology [32], psychology [38], neuroscience [40], sociology [164] and computer science [48].
The research efforts on creating affect-specific models are promising for affect recognition. To enable these models to focus on high-level semantics, such as the temporal dependencies among Action Units (AUs) or the inter-correlations between affect dimensions, the representations provided to them must generalise: the effects of illumination variations, registration errors, head-pose variations, occlusions and identity bias must be eliminated.
Illumination variations can be problematic for high-level representations that are extracted from raw pixel values.
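As a concrete illustration of this last point, the sketch below uses a Local Binary Pattern (LBP) histogram, one of the representations reviewed in the survey: because LBP encodes only the signs of differences between neighbouring pixels, it is largely unaffected by monotonic grey-level changes such as a global brightness shift, whereas raw pixel values change everywhere. The descriptor variant and parameters are illustrative, not the survey's prescription.

```python
# Illumination robustness of a pattern-based representation vs. raw pixels.
# LBP codes depend only on the ordering of neighbouring grey values, so they are
# (near-)invariant to monotonic illumination changes such as a brightness shift.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_descriptor(face, n_points=8, radius=1):
    """Normalised histogram of uniform LBP codes for a grey-level face patch."""
    codes = local_binary_pattern(face, n_points, radius, method="uniform")
    n_bins = n_points + 2  # uniform patterns plus one "non-uniform" bin
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist

face = np.random.randint(0, 100, size=(64, 64), dtype=np.uint8)
brighter = face + 50  # monotonic illumination change (global brightness shift)

print(np.abs(lbp_descriptor(face) - lbp_descriptor(brighter)).max())  # ~0: descriptor barely changes
print(np.abs(face.astype(int) - brighter.astype(int)).max())          # 50: raw pixels change everywhere
```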
Summary
The production, perception and interpretation of facial expressions have been analysed for a long time across various disciplines such as biology [32], psychology [38], neuroscience [40], sociology [164] and computer science [48]. While the cognitive sciences provide guidance on what to encode in facial representations, computer vision and machine learning influence how to encode this information. Ongoing research suggests that the human vision system has dedicated mechanisms to perceive facial expressions [18], [139], and focuses on three types of facial perception: holistic, componential and configural. Holistic perception treats the face as a whole, componential perception processes individual facial components (e.g. the eyes or the mouth) independently, and configural perception models the spatial relations among facial components (e.g. left eye-right eye, mouth-nose). All these perception models might be used when we perceive expressions [2], [28], [95], [96], and they are often considered complementary [16], [165], [183].
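As an illustrative computational analogue of configural information (an assumption made here for exposition, not the survey's formulation), spatial relations between facial components can be expressed as pairwise inter-landmark distances normalised by the inter-ocular distance, which removes face scale:

```python
# Hypothetical configural-style features: normalised pairwise distances between
# 2D facial landmarks. Landmark ordering and the example coordinates are made up.
import numpy as np
from itertools import combinations

def configural_features(landmarks, left_eye_idx=0, right_eye_idx=1):
    """landmarks: (N, 2) array of 2D landmark coordinates (pixels)."""
    landmarks = np.asarray(landmarks, dtype=float)
    iod = np.linalg.norm(landmarks[left_eye_idx] - landmarks[right_eye_idx])
    dists = [np.linalg.norm(landmarks[i] - landmarks[j]) / iod
             for i, j in combinations(range(len(landmarks)), 2)]
    return np.array(dists)

# Four hypothetical landmarks: left eye, right eye, nose tip, mouth centre.
points = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0], [50.0, 85.0]])
print(configural_features(points))  # normalised eye-eye, eye-nose, mouth-nose relations
```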