Abstract

Detecting and understanding emotions is critical to our daily activities. As emotion recognition (ER) systems mature, attention is shifting toward more difficult cases than acted adult audio–visual speech. In this work, we investigate the automatic classification of children's audio–visual emotional speech, which presents several challenges, including the lack of publicly available annotated datasets and the low performance of state-of-the-art audio–visual ER systems. We first present a new corpus of children's audio–visual emotional speech that we collected. We then propose a neural network solution that makes better use of the temporal relationships between the audio and video modalities during cross-modal fusion for children's audio–visual emotion recognition. We select a state-of-the-art neural network architecture as a baseline and introduce several modifications aimed at deeper learning of cross-modal temporal relationships using attention. In experiments with our proposed approach and the selected baseline model, we observe a relative performance improvement of 2%. We conclude that a stronger focus on cross-modal temporal relationships may benefit ER systems for child–machine communication and for environments where qualified professionals work with children.
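To make the idea of attention-based cross-modal temporal fusion concrete, the following is a minimal sketch, not the paper's actual architecture: each modality's per-frame embeddings attend over the other modality's timeline before late fusion and classification. All dimensions, the number of emotion classes, and the pooling choice are illustrative assumptions.

```python
# Minimal sketch of cross-modal temporal attention fusion (illustrative only,
# not the authors' model). Assumes audio and video are already encoded into
# per-frame embeddings of a shared dimensionality.
import torch
import torch.nn as nn


class CrossModalAttentionFusion(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 4, num_classes: int = 7):
        super().__init__()
        # Audio frames query the video timeline, and vice versa.
        self.audio_to_video = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.video_to_audio = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_classes)  # num_classes is assumed

    def forward(self, audio: torch.Tensor, video: torch.Tensor) -> torch.Tensor:
        # audio: (batch, T_audio, dim), video: (batch, T_video, dim)
        a_attended, _ = self.audio_to_video(audio, video, video)
        v_attended, _ = self.video_to_audio(video, audio, audio)
        # Temporal mean pooling of each cross-attended stream, then late fusion.
        fused = torch.cat([a_attended.mean(dim=1), v_attended.mean(dim=1)], dim=-1)
        return self.classifier(fused)


if __name__ == "__main__":
    model = CrossModalAttentionFusion()
    audio = torch.randn(2, 100, 256)   # e.g. 100 audio frames per utterance
    video = torch.randn(2, 25, 256)    # e.g. 25 video frames per utterance
    logits = model(audio, video)       # (2, num_classes) emotion logits
```

The cross-attention lets every audio frame weight relevant video frames (and vice versa) across time, which is one plausible way to realize the "deeper learning of cross-modal temporal relationships" described above.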
