Abstract

Speech Emotion Recognition (SER) makes it possible for machines to perceive affective information. Our previous research differed from conventional SER endeavours in that it focused on autonomously recognising unseen emotions in speech through machine learning, a step that would enable the automatic learning of unknown, emerging emotional states. That learning framework, however, still relied on manual annotations to obtain multiple samples of each emotion. To reduce this additional workload, we propose herein a zero-shot SER framework that employs a per-emotion semantic-embedding paradigm to describe emotions, instead of sample-wise descriptors. Aiming to optimise the relationships between emotions, prototypes, and speech samples, the framework includes two learning strategies: sample-wise learning and emotion-wise learning. These strategies apply a novel learning process to speech samples and emotions, respectively, via specifically designed semantic-embedding prototypes. We verify the utility of these approaches through an extensive experimental evaluation on two corpora, covering three aspects: the influence of the different learning strategies, emotion-pair comparisons, and the selection of semantic-embedding prototypes and paralinguistic features. The experimental results indicate that semantic-embedding prototypes are applicable to zero-shot emotion recognition in speech, although performance depends on the choice of strategy and prototype.
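To make the zero-shot idea concrete, the following is a minimal, hypothetical sketch of prototype-based zero-shot classification as described above: paralinguistic speech features are projected into a semantic space and matched against per-emotion prototype embeddings, so an emotion unseen during training can still be predicted from its prototype alone. All names, dimensions, the random projection, and the cosine-similarity scoring rule are illustrative assumptions for exposition, not the paper's actual model or training procedure.

```python
# Hypothetical sketch: zero-shot emotion prediction via semantic-embedding
# prototypes. Nothing here reproduces the paper's learned model; it only
# illustrates the prototype-matching inference step.
import numpy as np

rng = np.random.default_rng(0)

# Per-emotion semantic-embedding prototypes (e.g. derived from word
# embeddings of the emotion labels); "surprise" stands in for an unseen
# emotion that is known only through its prototype.
prototypes = {
    "happy":    rng.normal(size=300),
    "sad":      rng.normal(size=300),
    "surprise": rng.normal(size=300),  # unseen class, no training samples
}

# Stand-in for a learned projection from paralinguistic speech features
# (e.g. an 88-dim acoustic feature vector) into the 300-dim semantic space.
# In a real system this matrix would be learned, e.g. sample-wise or
# emotion-wise as the abstract distinguishes.
W = rng.normal(size=(300, 88))


def predict_emotion(speech_features: np.ndarray) -> str:
    """Project speech features and return the label of the most
    cosine-similar emotion prototype."""
    z = W @ speech_features
    scores = {
        label: float(z @ p / (np.linalg.norm(z) * np.linalg.norm(p)))
        for label, p in prototypes.items()
    }
    return max(scores, key=scores.get)


# Usage: classify one utterance's feature vector; the prediction may be
# an emotion that contributed no training samples.
utterance = rng.normal(size=88)
print(predict_emotion(utterance))
```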
