Emotion Recognition With Audio, Video, EEG, and EMG: A Dataset and Baseline Approaches

Jin Chen, Tony Ro, Zhigang Zhu

doi:10.1109/access.2022.3146729

Abstract

This paper describes a new posed multimodal emotional dataset and compares human emotion classification based on four different modalities: audio, video, electromyography (EMG), and electroencephalography (EEG). The results are reported with several baseline approaches using various feature extraction techniques and machine-learning algorithms. First, we collected a dataset from 11 human subjects expressing six basic emotions and one neutral emotion. We then extracted features from each modality using principal component analysis (PCA), autoencoders, convolutional networks, and mel-frequency cepstral coefficients (MFCC), some of which are unique to individual modalities. A number of baseline models were applied to compare classification performance in emotion recognition, including k-nearest neighbors (KNN), support vector machines (SVM), random forest, multilayer perceptron (MLP), long short-term memory (LSTM), and convolutional neural network (CNN) models. Our results show that bootstrapping the biosensor signals (i.e., EMG and EEG) can greatly increase emotion classification performance by reducing noise. For these biosensor signals, the best classification results were obtained with a traditional KNN classifier, whereas audio and image sequences of human emotions were better classified using LSTM.
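The abstract does not say how the biosensor signals were bootstrapped. As a rough illustration of why resampling can reduce noise, the sketch below assumes one plausible reading: each new sample is the pointwise mean of several same-label trials drawn with replacement. The function names, the sine-wave template, the noise level, and the parameters `n_new` and `k` are all hypothetical and not taken from the paper.

```python
import math
import random


def bootstrap_average(trials, n_new, k, rng):
    """Build n_new synthetic trials, each the pointwise mean of k trials
    drawn with replacement from `trials` (assumed to share one label)."""
    length = len(trials[0])
    synthetic = []
    for _ in range(n_new):
        picked = [rng.choice(trials) for _ in range(k)]
        synthetic.append([sum(t[i] for t in picked) / k for i in range(length)])
    return synthetic


def noise_level(trials, template):
    """Root-mean-square deviation of the trials from a known clean template."""
    squared = [(x - s) ** 2 for t in trials for x, s in zip(t, template)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical data: one clean waveform plus independent Gaussian noise per trial.
rng = random.Random(0)
template = [math.sin(2 * math.pi * i / 25) for i in range(50)]
raw = [[s + rng.gauss(0.0, 1.0) for s in template] for _ in range(40)]

boot = bootstrap_average(raw, n_new=40, k=10, rng=rng)
raw_noise = noise_level(raw, template)
boot_noise = noise_level(boot, template)
print(f"raw noise: {raw_noise:.3f}, bootstrapped noise: {boot_noise:.3f}")
```

Averaging cancels the independent noise while keeping the shared waveform, so the bootstrapped trials sit closer to the template than the raw ones. Real EEG and EMG noise is not independent Gaussian, so the size of the effect here should not be read as a prediction for the dataset.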

Highlights

In daily life, emotions abound, and there are countless reasons for determining someone’s emotional state, including better communication and work efficiency
VGG16 features combined with the long short-term memory (LSTM) model achieve a mean accuracy of 67.20%, which is over 10% better than autoencoder features and 20% better than principal component analysis (PCA) features
We examined various feature extraction techniques (MFCC, PCA, autoencoder, and pre-trained convolutional neural network (CNN)) and machine learning models (KNN, support vector machines (SVM), random forest, multilayer perceptron (MLP), CNN, LSTM) for each modality
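Of the models listed above, KNN is the simplest to illustrate. The following is a minimal sketch of majority-vote KNN classification over feature vectors, not the authors' implementation. The two-dimensional features, the labels, and the function name `knn_predict` are hypothetical stand-ins for the extracted features and emotion classes.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors,
    using Euclidean distance."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors for two emotion classes.
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
train_y = ["neutral", "neutral", "neutral", "happy", "happy", "happy"]

pred_a = knn_predict(train_x, train_y, (0.1, 0.1))
pred_b = knn_predict(train_x, train_y, (2.1, 2.0))
print(pred_a, pred_b)
```

In practice the feature vectors would come from one of the extraction steps named above (for example PCA components or MFCCs), and `k` would be chosen by validation.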

Summary

Introduction

Emotions abound, and there are countless reasons for determining someone’s emotional state, including better communication and work efficiency. In product development, analyzing users’ emotional states during their user experience can help determine which product features and designs suit them best. Caregivers can provide better care if they know their patients’ emotional states in different situations. Many emotion classification studies use deep learning methods in combination with state-of-the-art statistics to optimize emotion detection accuracy, and attempt to integrate multiple modalities for better accuracy. With increasing attention to emotion recognition, which will be detailed in the Related Work section, many emotional datasets have been collected, including both nonphysiological signals (e.g., facial expressions and speech) and physiological signals (e.g., electroencephalogram (EEG), electromyogram (EMG), and electrooculogram (EOG)).

Journal: IEEE Access	Publication Date: Jan 1, 2022
Citations: 30	License type: CC BY 4.0

Similar Papers

Bimodal Anxiety State Assessment Based on Electromyography and Electroencephalogram
Diancong Zhang ... Jingxin Cao
15 Oct 2020

An Investigation of Deep Learning Models for EEG-Based Emotion Recognition
Yaqing Zhang ... Wenliang Che
Frontiers in Neuroscience | VOL. 14
23 Dec 2020

Music emotion recognition using convolutional long short term memory deep neural networks
Serhat Hizlisoy ... Zekeriya Tufekci
Engineering Science and Technology, an International Journal | VOL. 24
14 Nov 2020

When Old Meets New: Emotion Recognition from Speech Signals
Keith April Araño ... Carlo Vercellis
Cognitive Computation | VOL. 13
19 Apr 2021
