Abstract

To address the limitations of existing databases in the field of emotion recognition and to support the trend toward integrating data from multiple sources, we have established a multi-modal emotion dataset based on drivers' spontaneous expressions. Emotion-induction materials were selected and used to induce target emotions before each driving task, and facial-expression videos and synchronized physiological signals were then collected while the participants drove. The dataset includes recordings of 64 participants under five emotions (neutral, happy, angry, sad, and fearful), together with each participant's emotional valence, arousal, and peak time for every driving task. To analyze the dataset, spatio-temporal convolutional neural networks were designed to process the different modalities, whose recordings vary in duration, and their emotion-recognition performance was investigated. The results demonstrate that fusing multi-modal data significantly improves driver emotion-recognition accuracy, with gains of 11.28% and 6.83% over using only facial video signals or physiological signals, respectively. The publication and analysis of multi-modal emotion data for driving scenarios is therefore crucial to supporting further research in multimodal perception and intelligent transportation engineering.
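
The abstract does not describe the network architecture in detail, so the following is only a minimal sketch, in PyTorch, of one plausible multi-modal fusion setup of the kind reported: a 3D-CNN branch for facial-expression video, a 1D-CNN branch for physiological signals, and feature-level fusion by concatenation before a five-class emotion classifier. All layer sizes, tensor shapes, and the number of physiological channels are illustrative assumptions, not details from the paper.

# Hypothetical two-branch spatio-temporal fusion model (not the authors' exact design).
import torch
import torch.nn as nn

class VideoBranch(nn.Module):
    """3D-CNN over facial-expression clips shaped (batch, 3, frames, H, W)."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # collapses time and space to one vector
        )
        self.fc = nn.Linear(32, out_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

class PhysioBranch(nn.Module):
    """1D-CNN over synchronized physiological signals shaped (batch, channels, T)."""
    def __init__(self, in_channels=4, out_dim=128):  # 4 channels is a placeholder
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # handles recordings of varying duration
        )
        self.fc = nn.Linear(64, out_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

class FusionClassifier(nn.Module):
    """Concatenate the two modality embeddings and predict the five emotion classes."""
    def __init__(self, num_classes=5):
        super().__init__()
        self.video = VideoBranch()
        self.physio = PhysioBranch()
        self.head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, num_classes))

    def forward(self, video_clip, physio_window):
        fused = torch.cat([self.video(video_clip), self.physio(physio_window)], dim=1)
        return self.head(fused)

# Example forward pass with dummy tensors (shapes are placeholders).
model = FusionClassifier()
logits = model(torch.randn(2, 3, 16, 64, 64), torch.randn(2, 4, 256))
print(logits.shape)  # torch.Size([2, 5])

The adaptive pooling layers are one simple way to accommodate the varying clip and signal durations mentioned in the abstract; the actual study may handle this differently.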
