Abstract
Recent deep learning-based approaches to facial expression recognition struggle to reach optimal solutions because they extract semantic expression features ineffectively and have difficulty perceiving the positional relationships among facial expression features. In this paper, we propose a novel framework, named Ventral-Dorsal Attention Capsule Network (VDACaps), to address these challenges. VDACaps adopts ResNet18 with a ventral channel attention and a dilated spatial attention mechanism to strengthen feature extraction. A novel attention layer containing multiple groups of ventral and dorsal attention is also designed to focus on the key features of facial expressions at the channel and spatial levels, exploring optimal receptive field sizes. Experiments on three well-known public expression datasets (CK+, JAFFE, and SFEW) demonstrate that VDACaps achieves better or competitive performance on facial expression recognition against the state of the art, with accuracies of 98.98%, 98.44%, and 54.07%, respectively.
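The channel-attention idea mentioned above (squeeze each channel to a descriptor, gate it, and rescale the channel) can be illustrated with a minimal, framework-free sketch. This is a toy pure-Python version under assumed simplifications, not the paper's implementation; `weights` stands in for learned gating parameters:

```python
import math

def channel_attention(feature_maps, weights):
    # feature_maps: list of C channels, each an H x W grid (list of lists)
    # weights: hypothetical learned per-channel gating parameters (length C)

    # 1. Global average pooling: squeeze each channel to one descriptor
    descriptors = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                   for ch in feature_maps]

    # 2. Excitation: sigmoid gate per channel from descriptor and weight
    gates = [1.0 / (1.0 + math.exp(-w * d))
             for w, d in zip(weights, descriptors)]

    # 3. Rescale each channel by its gate, emphasizing informative channels
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_maps, gates)]
```

With zero weights every gate is sigmoid(0) = 0.5, so all channels are uniformly halved; training would move the weights so that expression-relevant channels receive gates closer to 1.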