Abstract

Most practical applications of affective computing focus on changes in one particular complex human affect. Facial expression classification models, however, cannot represent the full range of human affects with a limited number of expression categories. Against this backdrop, this paper studies sequence-level affective level estimation (S-ALE), which is more relevant to real scenarios and can describe an individual's affective level in a continuous manner. We propose a spatio-temporal framework for S-ALE that consists of a Facial Expression Features Pyramid Network (FEFPN) and a Temporal Transformer Encoder (TTE). FEFPN extracts pyramidal facial expression features, while TTE effectively captures both coarse-grained and fine-grained temporal variations in facial sequences. The proposed model is evaluated on six public datasets across three typical S-ALE tasks (engagement prediction, fatigue detection, and pain assessment), and experimental results show that our method is comparable to or outperforms state-of-the-art algorithms.

