Abstract
Automated emotion recognition using physiological signals has gained significant attention in recent years due to its potential applications in human–computer interaction, healthcare, and psychology. Electrocardiogram (ECG) signals are widely used due to their non-invasiveness, high temporal resolution, and direct relationship with the autonomic nervous system. In this study, we propose a novel approach for ECG-based emotion recognition using time-series-to-image encoding techniques and texture-based features combined with machine learning algorithms and deep learning architectures. The ECG data used in this study were obtained from the Continuously Annotated Signals of Emotion (CASE) and Wearable Stress and Affect Detection (WESAD) datasets. We categorized emotional states based on valence and arousal annotations into four classes: High-Valence High-Arousal, High-Valence Low-Arousal, Low-Valence High-Arousal, and Low-Valence Low-Arousal. The ECG signals were segmented using 5- and 7-window approaches and transformed into 2D representations using the Gramian Angular Summation Field, Markov Transition Field, Recurrence Plot (RP), and the triple-channel fusion of all these images. A total of 85 textural features based on the Gray-Level Co-occurrence Matrix (GLCM), Gray-Level Run Length Matrix, Zernike's Moments, Hu's Moments, Fractal Dimension Texture Analysis (FDTA), and First-Order Statistics were extracted. Five classifiers, namely Random Forest, Support Vector Machine (SVM), eXtreme Gradient Boosting (XGB), a 1D Convolutional Neural Network (CNN), and a Multi-head Attention Network, were employed to classify the emotional states. Classification performance varied with the time-series-to-image encoding technique, the segmentation approach, and the classifier employed. We achieved the highest Weighted F-measure (F-m) of 94.91% (RP + XGB) and 86.78% (RP + SVM) using the 7-window and 5-window approaches, respectively. Our proposed 1D CNN architecture achieved the highest classification metrics (F-m = 92.52%, Balanced accuracy = 92.0%, Recall = 91.96%, and Precision = 93.16%) with RP images in the 7-window approach. GLCM and FDTA features made significant contributions to the classification of emotional states. Overall, our results suggest that the proposed method holds promise for developing more accurate and efficient emotion recognition systems.
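For illustration, the encoding-plus-texture stage described above can be sketched with the open-source pyts and scikit-image libraries. This is a minimal sketch under stated assumptions: the window length, image size, bin count, and GLCM settings below are illustrative placeholders, not the exact configuration used in the study.

    import numpy as np
    from pyts.image import GramianAngularField, MarkovTransitionField, RecurrencePlot
    from skimage.feature import graycomatrix, graycoprops

    # Hypothetical ECG window, already resampled to 128 points for illustration;
    # pyts expects input of shape (n_samples, n_timestamps).
    rng = np.random.default_rng(0)
    segment = rng.standard_normal((1, 128))

    # Encode the 1D segment into three 2D representations, each (1, 128, 128).
    gasf = GramianAngularField(method='summation').fit_transform(segment)
    mtf = MarkovTransitionField(n_bins=8).fit_transform(segment)
    rp = RecurrencePlot(threshold='point', percentage=20).fit_transform(segment)

    # Triple-channel fusion: stack the three encodings as image channels.
    fused = np.stack([gasf[0], mtf[0], rp[0]], axis=-1)  # shape (128, 128, 3)

    # Example GLCM texture descriptors computed from one channel.
    img = np.uint8(255 * (gasf[0] - gasf[0].min()) / np.ptp(gasf[0]))
    glcm = graycomatrix(img, distances=[1], angles=[0], symmetric=True, normed=True)
    for prop in ('contrast', 'homogeneity', 'energy', 'correlation'):
        print(prop, graycoprops(glcm, prop)[0, 0])

In practice, each GLCM property would be computed over several distances and angles, and combined with the run-length, moment, fractal, and first-order statistics to form the full 85-dimensional feature vector fed to the classifiers.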