Abstract

There has been growing focus on the use of artificial intelligence and machine learning for affective computing to further enhance the user experience through emotion recognition. Typically, machine learning models used for affective computing are trained using manually extracted features from biological signals. Such features may not generalize well to large datasets. One approach to address this issue is to use fully supervised deep learning methods to learn latent representations. However, such methods require human supervision to label the data, which may be unavailable. In this work, we propose an unsupervised framework for representation learning. The proposed framework uses two stacked convolutional autoencoders to learn latent representations from wearable electrocardiogram (ECG) and electrodermal activity (EDA) signals. The representations learned by this unsupervised framework are then used by a random forest model to classify arousal. To validate the framework, we create an aggregation of the AMIGOS, ASCERTAIN, CLEAS, and MAHNOB-HCI datasets. We compare our proposed method against other approaches, including convolutional neural networks and methods that rely on manually extracted features, and show that it outperforms current state-of-the-art results. These results demonstrate the broad applicability of stacked convolutional autoencoders for affective computing.
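As a rough illustration of the pipeline the abstract describes, the sketch below trains one 1-D convolutional autoencoder per modality (ECG and EDA) on unlabeled signal windows and feeds the concatenated latent codes to a random forest arousal classifier. This is not the paper's exact architecture: the window length, layer sizes, latent dimensionality, training schedule, and the placeholder data (`ecg_windows`, `eda_windows`, `arousal_labels`) are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier


class ConvAutoencoder1D(nn.Module):
    """Stacked 1-D convolutional autoencoder for one signal modality."""

    def __init__(self, latent_dim: int = 64, window: int = 2560):
        super().__init__()
        # Two stacked conv layers halve the temporal resolution twice,
        # then a linear layer maps to a compact latent code.
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=7, stride=2, padding=3),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * (window // 4), latent_dim),
        )
        # Mirror-image decoder reconstructs the input window.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * (window // 4)),
            nn.Unflatten(1, (32, window // 4)),
            nn.ConvTranspose1d(32, 16, kernel_size=7, stride=2,
                               padding=3, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=7, stride=2,
                               padding=3, output_padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z


def train_autoencoder(model, signals, epochs=20, lr=1e-3):
    """Unsupervised training: minimize reconstruction error, no labels used."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        recon, _ = model(signals)
        loss = nn.functional.mse_loss(recon, signals)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


# Hypothetical placeholder data: N windows of 1 channel x 2560 samples each
# (e.g., 10 s at 256 Hz). Real use would load preprocessed ECG/EDA windows.
N = 128
ecg_windows = torch.randn(N, 1, 2560)
eda_windows = torch.randn(N, 1, 2560)
arousal_labels = torch.randint(0, 2, (N,)).numpy()  # binary low/high arousal

# One autoencoder per modality, trained without supervision.
ecg_ae = train_autoencoder(ConvAutoencoder1D(), ecg_windows)
eda_ae = train_autoencoder(ConvAutoencoder1D(), eda_windows)

# Concatenated latent codes become features for a random forest classifier.
with torch.no_grad():
    features = torch.cat(
        [ecg_ae.encoder(ecg_windows), eda_ae.encoder(eda_windows)], dim=1
    ).numpy()
clf = RandomForestClassifier(n_estimators=100).fit(features, arousal_labels)
```

Keeping the autoencoders modality-specific, as sketched here, lets each encoder specialize in the morphology of its own signal before the classifier fuses the two latent spaces; whether the paper shares weights or fuses earlier is not stated in the abstract.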
