Abstract

During spontaneous conversations, both the articulation process and internal emotional states influence facial configurations. Inferring the conveyed emotions from the information presented in facial expressions therefore requires decoupling the linguistic and affective messages in the face. Normalizing and compensating for the underlying lexical content has been shown to improve facial expression recognition. However, this approach requires transcription and phoneme-alignment information, which is not available in a broad range of applications. This study uses an asymmetric bilinear factorization model to decouple linguistic and affective information when such annotations are not given. Emotion recognition evaluations on the IEMOCAP database show that the proposed approach can separate these factors in facial expressions, yielding statistically significant performance improvements. The achieved improvement is similar to the case in which the ground-truth phonetic transcription is known. Likewise, experiments on the SEMAINE database using image-based features demonstrate the effectiveness of the proposed technique in practical scenarios.
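Asymmetric bilinear factorization is a style/content separation technique in the spirit of Tenenbaum and Freeman's bilinear models. The sketch below is a minimal illustration of that general idea, assuming emotion classes play the role of "style" and lexical (viseme/phoneme) classes the role of "content"; the function name, dimensions, and SVD-based fitting shown here are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def fit_asymmetric_bilinear(Y, J):
    """Fit Y ~ A @ B with a truncated SVD.

    Y : (S*K, C) observation matrix of K-dim facial features,
        stacked row-wise by S style (emotion) classes, with one
        column per content (lexical) class.
    Returns A (S*K, J): style-specific bases (one K x J block per
    style), and B (J, C): content vectors shared across styles.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :J] * s[:J]   # absorb singular values into the style factor
    B = Vt[:J, :]          # content representation, style-independent
    return A, B

# Toy example: 2 emotion "styles", 5-dim features, 4 content classes.
rng = np.random.default_rng(0)
Y = rng.standard_normal((2 * 5, 4))
A, B = fit_asymmetric_bilinear(Y, J=3)
print(np.linalg.norm(A @ B - Y))  # residual of the rank-3 approximation
```

Under this factorization, the content vectors B capture variation tied to the lexical class regardless of emotion, so a downstream emotion recognizer can operate on the style factor with the linguistic variability largely factored out.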
