Abstract
In multi-modal emotion-aware frameworks, it is essential to estimate emotional features from each modality and then fuse them to varying degrees, typically following either a feature-level or a decision-level strategy. While features from several modalities may enhance classification performance, they can exhibit high dimensionality and complicate learning for the most widely used machine learning algorithms. To overcome the issues of feature extraction and multi-modal fusion, hybrid fuzzy-evolutionary computation methodologies are employed, demonstrating strong capabilities in feature learning and dimensionality reduction. This paper proposes a novel multi-modal emotion-aware system that fuses speech with EEG modalities. First, a mixed feature set of speaker-dependent and speaker-independent characteristics is estimated from the speech signal. EEG is then utilized as an inner channel complementing speech for more reliable recognition, by extracting multiple features in the time, frequency, and time-frequency domains. For classifying unimodal data of either speech or EEG, a hybrid fuzzy c-means-genetic algorithm-neural network (FCM-GA-NN) model is proposed, whose fitness function finds the optimal number of fuzzy clusters that minimizes the classification error. To fuse speech with EEG information, a separate classifier is used for each modality, and the final output is computed by integrating their posterior probabilities. Results show the superiority of the proposed model, with average accuracy rates of 98.06%, 97.28%, and 98.53% for EEG, speech, and multi-modal recognition, respectively. The proposed model is also applied to two public databases for speech and EEG, namely SAVEE and MAHNOB, achieving accuracies of 98.21% and 98.26%, respectively.
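The decision-level fusion described above (one classifier per modality, combined via posterior probabilities) can be sketched as follows. This is a minimal illustration, not the paper's exact integration rule: the class posteriors and the equal-weight averaging scheme are assumptions for demonstration.

```python
import numpy as np

# Hypothetical posteriors from two unimodal classifiers (speech, EEG)
# over the same four emotion classes; values are illustrative only.
p_speech = np.array([0.70, 0.10, 0.15, 0.05])
p_eeg = np.array([0.55, 0.25, 0.10, 0.10])

def fuse_posteriors(posteriors, weights=None):
    """Decision-level fusion: combine per-modality class posteriors
    into a single normalized class distribution."""
    posteriors = np.asarray(posteriors, dtype=float)
    if weights is None:
        # Equal weighting as a simple default; a tuned or learned
        # weighting per modality is equally possible.
        weights = np.ones(len(posteriors)) / len(posteriors)
    fused = np.average(posteriors, axis=0, weights=weights)
    return fused / fused.sum()

fused = fuse_posteriors([p_speech, p_eeg])
predicted_class = int(np.argmax(fused))  # index of the fused winning emotion
```

Because fusion happens on the posterior distributions rather than on concatenated feature vectors, each modality keeps its own classifier and feature space, avoiding the dimensionality blow-up mentioned above.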
Highlights
In human–computer interaction (HCI), comprehending and discriminating emotions has become a principal issue in constructing intelligent systems that can perform purposeful actions.
Electroencephalogram (EEG) signals have been shown to be a robust sole modality [2,3]. These bio-signals are controlled by the central nervous system and cannot be influenced intentionally, whereas actors can deliberately feign emotions with their facial expressions.
This paper suggests a new hybrid emotion recognition model based on fuzzy clustering and genetic search-based optimization.
Summary
In human–computer interaction (HCI), comprehending and discriminating emotions has become a principal issue in constructing intelligent systems that can perform purposeful actions. Emotions can be discriminated using distinct sole modalities, such as facial expressions, short phrases, speech, video, EEG signals, and long or short texts. Physiological information can be utilized to supplement the emotional information gained from facial expressions or speech to improve recognition rates. In this regard, several multi-modal emotion recognition approaches have made significant contributions [4,5,6]. A review of the literature on multi-modal emotion analysis shows that the majority of works have concentrated on the concatenation of feature vectors obtained from different modalities, which does not take into account the conflicting information that may be carried by the individual modalities. The proposed hybrid model is likewise compared to another developed hybrid fuzzy c-means-genetic algorithm-neural network model, namely FCM-GA-NN_fixed, which relies on a fixed number of fuzzy clusters.
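The contrast between the fixed-cluster baseline and the optimized model hinges on choosing the number of fuzzy clusters. The sketch below implements a minimal fuzzy c-means and scores candidate cluster counts with the partition coefficient; this simple exhaustive search stands in for the paper's genetic algorithm, and the fitness criterion (partition coefficient rather than classification error) is an assumption for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=50, seed=0):
    """Minimal fuzzy c-means: returns the membership matrix U (n x c)
    and the cluster centres (c x d)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)  # each sample's memberships sum to 1
    for _ in range(iters):
        Um = U ** m
        centres = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Distances from every sample to every centre (small epsilon
        # avoids division by zero when a sample coincides with a centre).
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-9
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centres

def partition_coefficient(U):
    """Bezdek's partition coefficient: closer to 1 means a crisper,
    better-separated fuzzy partition."""
    return float(np.mean(np.sum(U ** 2, axis=1)))

# Toy data: two well-separated blobs, so two clusters should win.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)),
               rng.normal(5.0, 0.3, (30, 2))])

# Score each candidate cluster count; a genetic search would explore
# this space stochastically instead of exhaustively.
scores = {c: partition_coefficient(fuzzy_c_means(X, c)[0]) for c in range(2, 6)}
best_c = max(scores, key=scores.get)
```

In the paper's full pipeline, the GA's fitness is tied to the downstream neural network's classification error, so the chosen cluster count directly reduces misclassification rather than only improving cluster compactness.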