DV Emotion Net: An Integrated Multimodal Approach for Emotion Recognition

Dommeti D, Nallapati Srk, Alfaris R

doi:10.23880/oajcij-16000112

Abstract

This study introduces a novel approach to emotion recognition that combines information from two heterogeneous modalities, audio and video. For audio feature extraction, we used energy, zero-crossing rate, and Mel-Frequency Cepstral Coefficients (MFCC), which showed promising results. For video feature extraction, spatiotemporal Gaussian kernels were used to organize video frames within a linear scale space, and a Gaussian-weighted function was then applied to the second-moment matrix to extract further features. The Multimodal Feature Aggregation (MFA) fusion method was used to unify the audio and video features into a single comprehensive dataset. Evaluation with the Fusion of Emotion Recognition Convolutional Neural Network (FERCNN) model, run on a "TPU VM v3-8" accelerator (TPU stands for Tensor Processing Unit), showed notable performance improvements. On the RAVDESS dataset, accuracies of 94.66%, 95.82%, and 94.36% were achieved for the audio, video, and multimodal settings, respectively; on the CREMAD dataset, the corresponding accuracies were 79.45%, 96.62%, and 70.14%. These results surpass those of existing multimodal systems, underscoring the efficacy of the proposed approach. Emotion recognition, particularly through multimodal means, plays a critical role in various domains, including human-computer interfaces, healthcare, legal proceedings, and entertainment. Fusing audio and video modalities to improve human-computer interaction and intelligent system performance is essential for enhancing communication within these domains. The proposed model is termed "DualVision EmotionNet: DV EmotionNet".
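As a rough illustration of two of the audio features named in the abstract, the sketch below computes frame-level energy and zero-crossing rate using only the Python standard library. The frame length, hop size, function names, and the synthetic 440 Hz test tone are assumptions chosen for the example; the abstract does not state the authors' actual parameters or implementation. MFCC extraction is omitted here because it requires a filterbank and a cosine transform, which are normally taken from an audio library.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def audio_features(samples, frame_len=400, hop=160):
    """Return one (energy, zero-crossing rate) pair per frame.

    frame_len and hop are assumed values (25 ms and 10 ms at 16 kHz),
    not parameters taken from the paper.
    """
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop)]


# Hypothetical input: one second of a 440 Hz sine sampled at 16 kHz.
sr = 16000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
features = audio_features(tone)
```

Each entry in `features` is one frame's descriptor. In a pipeline like the one described, such per-frame values would typically be stacked with MFCCs before being passed to a fusion step, though the exact arrangement used by the authors is not given in the abstract.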

Published in: Open Access Journal of Criminology Investigation & Justice
