Abstract

Novel trends in affective computing rely on physiological signals such as the Electroencephalogram (EEG), Electrocardiogram (ECG), and Galvanic Skin Response (GSR) as reliable sources of emotional information. Using these signals poses the challenge of improving performance across a broader set of emotion classes in a less constrained, real-world environment. To address this challenge, we propose a computational framework consisting of a 2D Convolutional Neural Network (CNN) architecture for the arrangement of 14 EEG channels, and a combination of Long Short-Term Memory (LSTM) and 1D-CNN architectures for ECG and GSR. Our approach is subject-independent and uses two publicly available datasets, DREAMER and AMIGOS, recorded with low-cost, wearable sensors that extract physiological signals suitable for real-world environments. The results outperform state-of-the-art approaches for classification into four classes: High Valence-High Arousal (HVHA), High Valence-Low Arousal (HVLA), Low Valence-High Arousal (LVHA), and Low Valence-Low Arousal (LVLA). For AMIGOS, an average emotion elicitation accuracy of 76.65% is achieved with the EEG modality and 63.67% with the GSR modality; an average accuracy is also reported for the ECG right-channel modality. The highest overall accuracy, 99.0% for the AMIGOS dataset and 90.8% for the DREAMER dataset, is achieved with multi-modal fusion. A strong correlation between the spectral and hidden-layer feature analysis and the classification performance suggests that the proposed method extracts significant features and improves emotion elicitation performance in a broader context and in less constrained environments.
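
The four target classes combine a binary valence level with a binary arousal level: High Valence-High Arousal (HVHA), High Valence-Low Arousal (HVLA), Low Valence-High Arousal (LVHA), and Low Valence-Low Arousal (LVLA). A minimal sketch of such a labeling step is given below. The function name `quadrant_label`, the `threshold` parameter, and the rule that a rating strictly above the threshold counts as "high" are assumptions made for illustration; the text above does not state how ratings are binarized.

```python
def quadrant_label(valence: float, arousal: float, threshold: float) -> str:
    """Map a (valence, arousal) rating pair to one of the four class names.

    Assumption (not taken from the source): a rating strictly above
    `threshold` is treated as "high", anything else as "low".
    """
    valence_part = "HV" if valence > threshold else "LV"
    arousal_part = "HA" if arousal > threshold else "LA"
    return valence_part + arousal_part


if __name__ == "__main__":
    # Hypothetical ratings and a hypothetical threshold of 3.0.
    for v, a in [(4.0, 4.5), (4.0, 2.0), (1.5, 4.5), (1.5, 2.0)]:
        print(v, a, quadrant_label(v, a, threshold=3.0))
```

Keeping the threshold as an explicit argument lets the same function be reused for datasets whose rating scales differ.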

Highlights

  • Recent trends in the field of affective computing have shifted towards physiological signals [1,2,3,4], such as Electroencephalogram (EEG), Electrocardiogram (ECG), and Galvanic Skin Response (GSR), as a more reliable source because of their significance in human–computer interaction (HCI). Emotions can be distinctively expressed as a non-verbal form of everyday social interaction. These non-verbal cues are generally reflected through facial expressions and tone of voice. Involuntary physiological responses to emotional stimuli are more reliable than voluntary responses, because an involuntary response cannot be masked intentionally [4]

  • We propose a combination of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) architectures to improve recognition performance for four emotion classes (High Valence-High Arousal (HVHA), High Valence-Low Arousal (HVLA), Low Valence-High Arousal (LVHA), and Low Valence-Low Arousal (LVLA)) with multi-modal fusion, exploiting the significance of modalities such as ECG, EEG, and GSR

  • For the EEG approach, 13,642 images for AMIGOS and 7,452 images for DREAMER were randomly selected as test data, while the remaining images of each dataset were used as the training set
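
The random selection of test images described in the last highlight can be sketched as a shuffled index split. This is only an illustration under stated assumptions: the helper name `random_split`, the fixed seed, and the total of 100 items in the usage example are hypothetical, and the source does not give the total number of images per dataset.

```python
import random


def random_split(n_items: int, n_test: int, seed: int = 0) -> tuple[list[int], list[int]]:
    """Return (train_indices, test_indices) from a random shuffle.

    `n_test` indices are drawn at random as the test set; all remaining
    indices form the training set. The seed is an illustrative choice
    that makes the split reproducible.
    """
    if not 0 <= n_test <= n_items:
        raise ValueError("n_test must be between 0 and n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


if __name__ == "__main__":
    # Hypothetical sizes; the real totals are not stated in the text.
    train_idx, test_idx = random_split(n_items=100, n_test=20)
    print(len(train_idx), len(test_idx))
```

Splitting on indices rather than on the images themselves keeps the sketch independent of how the images are stored.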

Summary

Introduction

Involuntary physiological responses (such as EEG and ECG) to emotional stimuli are more reliable than voluntary responses (such as sound or facial expressions), because an involuntary response cannot be masked intentionally [4]; for example, a sad person may smile, which may be an indication of depression. External factors such as lighting conditions, accessories like. Biosensors help to monitor and collect physiological signals from the heart (ECG), brain (EEG), or skin (GSR), and prove to be the most significant for the detection of stress levels and emotions [6,7]. Applications of physiological signal-based emotion recognition encompass psychological health-care monitoring for hospitalized patients [8], real-time stress-level detection of drivers, emotion-inspired multimedia applications [9], various bio-inspired human–machine interfaces, and health-care applications [10].

