Abstract

Emotion recognition is an integral part of any Human–Machine Interaction (HMI) system. Accurate emotion recognition allows an HMI system to choose appropriate subsequent responses, given the context and the emotion expressed by the human(s). Deep learning with Deep Neural Networks (DNNs) has made great strides, matching and even exceeding human accuracy in image classification and face detection. Many papers report successful applications of DNNs such as Convolutional Neural Networks (CNNs), which have become the de facto algorithm for facial image classification because they combine feature extraction and classification in one mathematical model. They learn the desirable features directly from the input images and have been demonstrated to be robust to variations in facial image data. However, CNNs have one major bottleneck: models with good accuracy have many hidden layers, and such deep models require heavy computing power, memory and, of course, time to train. Our paper puts forward two experimental approaches that can benefit the domains of CNNs and HMI systems. In the first approach, we achieved very good accuracy in emotion classification using shallow CNNs with only three and four hidden layers. This was possible only because we passed the input images through a carefully designed pipeline of image preprocessing techniques before feeding them to the CNN for training. In the second approach, we developed the interpretation of the emotion landscape, that is, the distribution of emotion classes detected in static images or videos in which many people’s faces are visible. This is similar to group-level emotion classification studies, with a distinct difference in possible applications.
The rationale behind this integration is to advance the study of emotions expressed by people in a group setting and how they influence one another, to visualize how the emotion distribution changes with time and thereby form an emotional landscape in time, and to enhance the understanding of collective sentiments expressed non-verbally through facial emotions in gatherings of known social context.
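
The idea of an emotion landscape as a distribution of detected emotion classes over time can be illustrated with a minimal sketch. It assumes that some classifier has already assigned one label to each detected face in each frame; the label set `EMOTIONS` and the function names `emotion_distribution` and `emotion_landscape` are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical label set; the abstract does not list the emotion classes used.
EMOTIONS = ("angry", "happy", "neutral", "sad", "surprised")


def emotion_distribution(face_labels):
    """Return the fraction of detected faces per emotion class for one frame.

    Labels that are not in EMOTIONS are ignored.
    """
    counts = Counter(face_labels)
    total = sum(counts[e] for e in EMOTIONS)
    if total == 0:
        # No recognised faces in this frame: report an all-zero distribution.
        return {e: 0.0 for e in EMOTIONS}
    return {e: counts[e] / total for e in EMOTIONS}


def emotion_landscape(frames):
    """Map a sequence of frames (lists of per-face labels) to a time series
    of per-frame emotion distributions."""
    return [emotion_distribution(labels) for labels in frames]


# Illustrative input: two frames with four detected faces each.
frames = [
    ["happy", "happy", "neutral", "sad"],
    ["happy", "neutral", "neutral", "neutral"],
]
landscape = emotion_landscape(frames)
```

Plotting each class's fraction against the frame index would then show how the group's emotion distribution shifts over time.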
