Correctly recognizing a person's emotions is an important step toward understanding their mental states and reactions. The complexity of human personality makes this a difficult task, so a method that classifies emotions with high accuracy would be useful. This work aims to provide such a method: we analyze eye-tracking data with deep learning, using deep neural networks to classify fixations recorded in different emotional states. We first collected eye-tracking data from 10 women and 7 men who viewed 79 emotional images. For each fixation point obtained from a subject, we selected an 80 x 80 pixel patch from the viewed image and used these patches as the input to the deep neural networks. The images shown to the subjects are characterized by three basic dimensions of emotion: valence, arousal, and dominance. Classification was performed by transfer learning with the Xception, InceptionV3, and DenseNet201 deep neural networks, whose weights were pretrained on the ImageNet dataset. We chose these three networks because they have comparable numbers of parameters, which makes their results easier to compare. Overall, the results show that patches taken at subjects' fixation locations on images characterized by the three basic dimensions of emotion can be used to classify emotions. We also show that changing the size of the patches has little effect on the test results, which suggests that deep neural networks are a powerful and accurate tool for processing eye fixation points.
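The patch selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `patch_box` and `crop`, the 1024 x 768 image size in the usage note, and the handling of fixations near the image border (the box is shifted inward so that every patch keeps its full size) are all assumptions made for illustration. The abstract states only that an 80 x 80 pixel patch was taken for each fixation point.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def patch_box(x: float, y: float, width: int, height: int, size: int = 80) -> Box:
    """Return a size x size crop box centred on the fixation point (x, y).

    Assumption (not stated in the source): a box that would cross the
    image border is shifted inward so the patch always has the full size.
    """
    if size > width or size > height:
        raise ValueError("patch is larger than the image")
    half = size // 2
    # Centre the box on the fixation, then clamp it to the image bounds.
    left = min(max(int(round(x)) - half, 0), width - size)
    top = min(max(int(round(y)) - half, 0), height - size)
    return left, top, left + size, top + size


def crop(image: Sequence[Sequence], box: Box) -> List[list]:
    """Crop a row-major image (a sequence of pixel rows) to the given box."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]
```

For a hypothetical 1024 x 768 image, `patch_box(400, 300, 1024, 768)` gives the box `(360, 260, 440, 340)`, while a fixation near a corner such as `(10, 10)` gives `(0, 0, 80, 80)` under the shift-inward assumption. The resulting patches would then be resized as required by the chosen pretrained network before classification.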