Abstract
The recognition of dynamic facial expressions has received increasing attention because image sequences reflect the temporal unfolding of emotion better than a single static image. However, factors such as subtle expression variations, pose, occlusion, and illumination make it challenging to obtain discriminative expression features for dynamic facial expression recognition. Traditional CNN-based deep networks lack global and temporal contextual understanding of expressions, which limits the final recognition of dynamic expressions. We therefore propose an enhanced spatial–temporal learning network (ESTLNet) for more robust dynamic facial expression recognition, which consists of a spatial fusion learning module (SFLM) and a temporal transformer enhancement module (TTEM). First, the SFLM obtains a more expressive spatial feature representation through dual-channel feature fusion learning. Then, the TTEM extracts more effective temporal contextual expression features from these spatial features using an encoder built by cascading a self-attention learning network with a gated feed-forward network. Finally, the jointly enhanced spatial–temporal model is evaluated on four widely used dynamic expression datasets (DFEW, AFEW, CK+, and Oulu-CASIA). Extensive experimental results demonstrate that our approach surpasses several existing state-of-the-art methods, yielding notable performance gains.
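To make the described temporal encoder concrete, the following is a minimal PyTorch sketch (not the authors' code) of one encoder block that cascades multi-head self-attention over the frame axis with a gated feed-forward sub-layer, as summarized above. The layer sizes, the GLU-style gating form, and the normalization placement are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class GatedFeedForward(nn.Module):
    """Feed-forward sub-layer with a multiplicative (GLU-style) gate (assumed form)."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.value = nn.Linear(dim, hidden)   # content branch
        self.gate = nn.Linear(dim, hidden)    # gating branch
        self.out = nn.Linear(hidden, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(self.value(x) * self.act(self.gate(x)))


class TemporalEncoderBlock(nn.Module):
    """Self-attention + gated feed-forward, each with a residual connection and LayerNorm."""

    def __init__(self, dim: int = 512, heads: int = 8, hidden: int = 2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = GatedFeedForward(dim, hidden)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_frames, dim) -- per-frame spatial features, e.g. the
        # output of a spatial module such as the SFLM described in the abstract.
        attn_out, _ = self.attn(x, x, x)       # temporal context mixing across frames
        x = self.norm1(x + attn_out)
        x = self.norm2(x + self.ffn(x))        # gated feed-forward refinement
        return x


if __name__ == "__main__":
    frames = torch.randn(2, 16, 512)           # 2 clips, 16 frames, 512-d features
    block = TemporalEncoderBlock()
    print(block(frames).shape)                 # torch.Size([2, 16, 512])
```

A sequence-level expression prediction would then typically pool the encoded frame features (e.g. by temporal averaging) before a classification head; that step is omitted here for brevity.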