Abstract

The success of end-to-end supervised representation learning and regression in recent years has shifted the main focus of continuous emotion recognition approaches to learning visual representations directly from large-scale labeled datasets. Supervised representation learning is highly effective for learning tasks with unambiguous labels, but annotating dimensional affect data is inherently subjective, which more often than not leads to ambiguous labels. Relying only on such ambiguous labels for representation learning does not yield robust features with good generalization capacity, as we show in this work. To address this fundamental problem, we propose to apply a constrained representation learning method that encourages the latent features to be less sensitive to ambiguous emotion annotations by using a generic representation learning prior called ‘temporal coherency’ or ‘temporal smoothness’. This approach forces the latent features to be temporally coherent by adding a first-order temporal coherency regularization constraint to the supervised learning loss function. To assess the utility of temporally coherent representations, we trained both unconstrained and temporal-coherency-constrained models on the Aff-wild dataset. The temporal-coherency-constrained models outperformed the unconstrained models by a significant margin. This performance improvement was also observed in cross-dataset evaluation on the AFEW-VA database. Most notably, temporally coherent visual features produced state-of-the-art performance on the Aff-wild benchmark without using additional inputs such as facial landmarks.
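
As a rough illustration of the idea described above (not the authors' exact formulation), the sketch below adds a first-order temporal smoothness penalty on consecutive latent feature vectors to a supervised regression loss. The supervised loss (plain MSE here) and the weighting factor `lambda_tc` are placeholders chosen for the example, since the abstract does not specify the paper's actual regression loss or regularization weight.

```python
import torch
import torch.nn.functional as F


def temporal_coherency_penalty(latent_seq: torch.Tensor) -> torch.Tensor:
    """First-order temporal smoothness penalty.

    latent_seq: tensor of shape (T, D) holding the latent feature vector
    for each of T consecutive video frames.
    """
    # Penalize large frame-to-frame changes in the latent representation.
    first_order_diff = latent_seq[1:] - latent_seq[:-1]
    return first_order_diff.pow(2).sum(dim=1).mean()


def constrained_loss(predictions: torch.Tensor,
                     targets: torch.Tensor,
                     latent_seq: torch.Tensor,
                     lambda_tc: float = 0.1) -> torch.Tensor:
    # Supervised regression loss on the valence/arousal predictions
    # (MSE is an assumption; the paper's loss may differ) ...
    supervised = F.mse_loss(predictions, targets)
    # ... plus the temporal coherency regularizer on the latent features.
    return supervised + lambda_tc * temporal_coherency_penalty(latent_seq)
```

In this reading, the regularizer acts only on the latent features of temporally adjacent frames, so the emotion regressor is still trained against the (possibly ambiguous) annotations while the representation itself is encouraged to vary smoothly over time.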
