Abstract
Emotion is expressed in both speech and song. Previous work has found that although spoken and sung emotion recognition are different tasks, they are related, and classifiers that explicitly exploit this relatedness can outperform classifiers that do not. Further, research in speech emotion recognition has demonstrated that emotion is modeled more accurately when gender is taken into account. However, it is not yet clear how domain (speech or song) and gender can be jointly leveraged in emotion recognition systems, nor how systems leveraging this information perform in cross-corpus settings. In this paper, we explore a multi-task emotion recognition framework and compare performance across different classification models and output selection/fusion methods using cross-corpus evaluation. Our results show that classification accuracy is highest when information is shared only between closely related tasks and when the outputs of disparate models are fused.
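The sketch below is a minimal illustration (not the paper's implementation) of the multi-task setup the abstract describes: a shared representation feeds separate heads for each (domain, gender) task, and the head outputs can be combined by late fusion. The feature dimension, task names, number of emotion classes, and fusion-by-averaging are all assumptions made for illustration.

```python
# Illustrative sketch only: multi-task emotion classification with a shared
# trunk, per-(domain, gender) heads, and simple late fusion of head outputs.
import torch
import torch.nn as nn

TASKS = ["speech_female", "speech_male", "song_female", "song_male"]  # assumed task split

class MultiTaskEmotionNet(nn.Module):
    def __init__(self, feat_dim=88, hidden_dim=64, n_emotions=4):
        super().__init__()
        # Representation shared across all domain/gender tasks.
        self.shared = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
        # One output head per (domain, gender) task.
        self.heads = nn.ModuleDict(
            {t: nn.Linear(hidden_dim, n_emotions) for t in TASKS}
        )

    def forward(self, x, task=None):
        h = self.shared(x)
        if task is not None:
            return self.heads[task](h)  # task-specific logits
        # Late fusion: average posteriors from all task heads.
        probs = [self.heads[t](h).softmax(dim=-1) for t in TASKS]
        return torch.stack(probs, dim=0).mean(dim=0)

# Usage example with random 88-dimensional acoustic feature vectors
# (an eGeMAPS-style dimensionality is assumed here).
model = MultiTaskEmotionNet()
features = torch.randn(8, 88)
fused_posteriors = model(features)                 # fused across all heads
speech_f_logits = model(features, task="speech_female")  # single-task output
```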