Abstract
While deep learning has achieved state-of-the-art results on many computer vision tasks, it still faces challenges when interpreting human facial expressions: poor generalisation across datasets, failure to account for individual differences in similar emotional states, and inability to recognise compound facial expressions and low-intensity or subtle emotional states. This study analyses how the resolution of the face images input to various Convolutional Neural Network (CNN) models affects their ability to recognise compound and low-intensity emotions. Several high-resolution facial expression databases were combined to compile a simple dataset containing high-intensity emotions and a complex dataset consisting of compound and low-intensity emotions. In the experiments, standard pre-trained CNN models that were further fine-tuned achieved higher validation accuracies than CNN models trained from scratch on the simple dataset. However, when tested on the complex dataset, the models trained from scratch generalised better than the fine-tuned pre-trained models. Using output visualisation, we show how the high-resolution CNN models generalised to the complex data by exploiting small facial features that were previously undetectable.
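The two training regimes compared in the abstract can be made concrete with a minimal sketch. The paper does not name its framework, backbone architecture, or class count, so the PyTorch/torchvision code, the ResNet-18 backbone, and the seven-class head below are all illustrative assumptions; only the fine-tuned-versus-from-scratch contrast comes from the abstract itself.

```python
# Minimal sketch (assumptions: PyTorch, a ResNet-18 backbone, 7 expression
# classes) contrasting the two regimes described in the abstract:
# fine-tuning an ImageNet-pre-trained CNN vs. training the same CNN from scratch.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

NUM_CLASSES = 7  # hypothetical number of expression classes


def build_model(fine_tune: bool) -> nn.Module:
    # Pre-trained ImageNet weights for the fine-tuned variant; random
    # initialisation for the from-scratch variant.
    weights = ResNet18_Weights.IMAGENET1K_V1 if fine_tune else None
    model = resnet18(weights=weights)
    # Replace the 1000-way ImageNet classifier head with an expression head.
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
    return model


# Fine-tuned model: all layers trainable, starting from ImageNet features.
finetuned = build_model(fine_tune=True)
# Scratch model: identical architecture, randomly initialised weights.
scratch = build_model(fine_tune=False)

# Both variants would then be trained identically on the facial-expression
# data, e.g. with a standard classification objective:
optimizer = torch.optim.Adam(finetuned.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
```

Under this setup, the abstract's finding corresponds to the `finetuned` model reaching higher validation accuracy on the simple dataset, while the `scratch` model transfers better to the compound and low-intensity expressions of the complex dataset.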