Abstract
The problem of varying-length recordings is a well-known issue in paralinguistics. We investigated how to address this problem using the bag-of-audio-words (BoAW) feature extraction approach, whose steps involve preprocessing, clustering, quantization and normalization. The BoAW technique is competitive in speech emotion recognition, but the method has several parameters that must be tuned carefully for good performance. The main aim of our study was to analyse the effectiveness of the BoAW method and to find the best parameter values for emotion recognition. We optimized the parameters one by one, with each step building on the results of the previous ones. We extracted features with openSMILE, transformed them into fixed-size vectors with openXBOW, and finally trained SVM models evaluated with 10-fold cross-validation using unweighted average recall (UAR). In our experiments, we worked with a Hungarian emotion database. According to our results, emotion classification performance improves with the BoAW feature representation. Although we could not tune every BoAW parameter to its optimal setting, we can nevertheless give clear recommendations on how to set the BoAW parameters for emotion detection tasks.
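As an illustration of the final classification and evaluation step described above, the sketch below trains an SVM with 10-fold cross-validation scored by UAR, which corresponds to macro-averaged recall in scikit-learn. It assumes the fixed-size BoAW vectors have already been produced by openXBOW; the file name "boaw_features.csv", its layout, and the SVM settings are hypothetical placeholders, not the study's actual configuration.

```python
# Minimal sketch of SVM training with 10-fold cross-validation and UAR,
# assuming the BoAW vectors were already exported by openXBOW.
# "boaw_features.csv" (last column = emotion label) is a hypothetical file.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Load BoAW vectors: one row per recording, last column holds the label.
data = np.loadtxt("boaw_features.csv", delimiter=",", dtype=str)
X = data[:, :-1].astype(float)
y = data[:, -1]

# Linear SVM with feature standardization; the complexity parameter C
# would be one of the values tuned in such an experiment.
model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))

# 10-fold cross-validation scored by UAR (macro-averaged recall).
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv, scoring="recall_macro")
print(f"Mean UAR over 10 folds: {scores.mean():.3f}")
```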