Speech emotion recognition using hidden Markov models

Albino Nogueiras, Asunción Moreno, José B. Mariño, Antonio Bonafonte

doi:10.21437/eurospeech.2001-627

Abstract

This paper introduces a first approach to emotion recognition using RAMSES, the UPC’s speech recognition system. The approach is based on standard speech recognition technology using semi-continuous hidden Markov models. Both the selection of low-level features and the design of the recognition system are addressed. Results are given on speaker-dependent emotion recognition using the Spanish corpus of the INTERFACE Emotional Speech Synthesis Database. The accuracy in recognising seven different emotions (the six defined in MPEG-4 plus a neutral style) exceeds 80% using the best combination of low-level features and HMM structure. This result is very similar to that obtained with the same database in subjective evaluation by human judges.

Dealing with the speaker’s emotion is one of the latest challenges in speech technologies. Three different aspects can be easily identified: speech recognition in the presence of emotional speech, synthesis of emotional speech, and emotion recognition. In this last case, the objective is to determine the emotional state of the speaker from the speech samples. Possible applications range from aiding psychiatric diagnosis to intelligent toys, and the topic is a subject of recent but rapidly growing interest [1].

This paper describes the TALP researchers’ first approach to emotion recognition. The work is carried out within the scope of the INTERFACE project [2]. The objective of this European Commission sponsored project is “to define new models and implement advanced tools for audio-video analysis, synthesis and representation in order to provide essential technologies for the implementation of large-scale virtual and augmented environments. The work is oriented to make man-machine interaction as natural as possible, based on everyday human communication by speech, facial expressions and body gestures.”

In the field of emotion recognition from speech, the main goal of the INTERFACE project will be the construction of a real-time, multi-lingual, speaker-independent emotion recogniser. For this purpose, large speech databases with recordings from many speakers and languages are needed. As these resources are not yet available, a reduced problem will be addressed first: emotion recognition in multi-speaker, language-dependent conditions. Specifically, this paper deals with the recognition of emotion for two Spanish speakers using standard hidden Markov model technology.
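The general idea behind HMM-based emotion recognition can be illustrated with a minimal sketch: one model per emotion, with an utterance assigned to the emotion whose model gives it the highest likelihood. This is not the RAMSES system. It assumes discrete HMMs over a quantised feature (the paper uses semi-continuous models), and every name, label and probability below is hypothetical, chosen only to make the example run.

```python
import math


def safe_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM, via the forward algorithm.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][k]  : probability of emitting symbol k in state s
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
            + safe_log(emit[s][symbol])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Return the label of the model that scores obs highest."""
    return max(models, key=lambda label: forward_log_likelihood(obs, *models[label]))


# Hypothetical toy models over a feature quantised to 0 (low), 1 (mid), 2 (high).
# Each entry is (start, trans, emit); the numbers are invented for illustration.
TOY_MODELS = {
    "neutral": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]],
    ),
    "anger": (
        [0.5, 0.5],
        [[0.5, 0.5], [0.5, 0.5]],
        [[0.1, 0.2, 0.7], [0.1, 0.3, 0.6]],
    ),
}

if __name__ == "__main__":
    print(classify([2, 2, 1, 2], TOY_MODELS))
    print(classify([0, 0, 0, 1], TOY_MODELS))
```

In a real recogniser the model parameters would be trained from labelled recordings, and the observations would be frame-level acoustic features, not a single hand-quantised symbol stream.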
