Abstract

Humans can communicate their emotions by modulating facial expressions or the tone of their voice. Although numerous applications enable machines to read facial emotions and to recognize the content of verbal messages, methods for speech emotion recognition are still in their infancy. Yet fast and reliable emotion recognition is the obvious next step for today's ‘intelligent personal assistants’, and may have countless applications in diagnostics, rehabilitation and research. Taking inspiration from the dynamics of human group decision-making, we devised a novel speech emotion recognition system that applies, for the first time, a semi-supervised prediction model based on consensus. Three tests were carried out to compare this algorithm with traditional approaches. Labeling performance on a public database of spontaneous speech is reported. The novel system appears to be fast, robust and less computationally demanding than traditional methods, allowing for easier implementation in portable voice analyzers (as used in rehabilitation, research, industry, etc.) and for applications in the research domain (such as real-time pairing of stimuli to participants’ emotional state, or selective/differential data collection based on emotional content).

Highlights

  • One of the most irritating features of virtual receptionists is that they are utterly impermeable to the emotional outbursts of callers, who feel more neglected and less satisfied than when interacting with human attendants

  • We describe a novel speech emotion recognition system that applies a semi-supervised prediction model based on consensus

  • This approach departs fundamentally from procedures such as active learning [10–40] and from self-training [20] (a minimal sketch of the consensus idea follows this list)
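
To make this contrast concrete, below is a minimal sketch of what consensus-based semi-supervised labeling can look like. It is offered only as an illustration, not as the authors' implementation: the base learners (scikit-learn's KNeighborsClassifier, SVC and DecisionTreeClassifier), the unanimity threshold and the function name consensus_label are all hypothetical choices. The core idea is that several classifiers trained on a small labeled set vote on each unlabeled sample, and a predicted label is accepted only when the ensemble reaches consensus.

    # Illustrative sketch of consensus-based semi-supervised labeling.
    # Learners, threshold and names are hypothetical, not the paper's method.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    def consensus_label(X_labeled, y_labeled, X_unlabeled, agreement=1.0):
        """Label unlabeled samples only when at least `agreement` of the
        base learners vote for the same class (1.0 = unanimity)."""
        learners = [
            KNeighborsClassifier(n_neighbors=3),
            SVC(),
            DecisionTreeClassifier(),
        ]
        for clf in learners:
            clf.fit(X_labeled, y_labeled)  # train each learner on the small labeled set

        # One row of predicted labels per learner: shape (n_learners, n_samples)
        votes = np.stack([clf.predict(X_unlabeled) for clf in learners])

        accepted_idx, accepted_lab = [], []
        for j in range(votes.shape[1]):
            labels, counts = np.unique(votes[:, j], return_counts=True)
            top = counts.argmax()
            if counts[top] / len(learners) >= agreement:  # consensus reached
                accepted_idx.append(j)
                accepted_lab.append(labels[top])
        return np.array(accepted_idx), np.array(accepted_lab)

In a full pipeline, the accepted samples would be appended to the labeled set and the procedure iterated. Unlike plain self-training, no single classifier can propagate its own errors unchecked, because a label is kept only when the group agrees.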

Introduction

One of the most irritating features of virtual receptionists is that they are utterly impermeable to the emotional outbursts of callers, who feel more neglected and less satisfied than when interacting with human attendants. Despite the complexity of the non-verbal signals conveyed by the voice, humans recognize them and react to them. Machines do not detect the emotional information embedded in the voice, and the human partner may become annoyed by this apparent lack of empathy. It is therefore not surprising that speech emotion recognition (SER) systems have recently become of interest in the domain of human-machine interfaces [1,2]; their application is also relevant to the treatment of psychiatric and neurological conditions affecting the emotional sphere (e.g., autism [3,4], Parkinson's disease [5,6,7], mood disorders [8]).
