Abstract

Facial expression recognition (FER) methods have recently achieved significant progress. However, FER is still challenged by factors such as uneven illumination and low-quality expression images, so exploiting the full potential of facial expression features is particularly important for robust FER. Inspired by the Transformer's strong performance in modeling long-range dependencies and in vision tasks, this paper proposes a Transformer Block Enhancement Module (TBEM) to enhance the feature representation of facial expressions. The proposed module contains a Channel Enhancement (CE) block and a Spatial Enhancement (SE) block. The CE block adaptively enhances expression features along the channel dimension by leveraging channel dependency information, while the SE block enhances expression features along the spatial dimension by integrating spatial dependency information. By combining CE and SE, TBEM outputs a more robust expression representation and substantially improves recognition accuracy on FER tasks. To further illustrate the application of TBEM in real-world FER engineering, three engineering problems are used for verification. Extensive experiments demonstrate that the proposed method improves FER performance by focusing on more discriminative decision features, and that it can be easily embedded into standard convolutional neural network models, improving their accuracy on FER tasks by about 2.64%–3.03%. The proposed method achieves accuracies of 90.57% on FERPlus, 89.41% on RAFDB basic, and 68.43% on RAFDB compound. It also provides a meaningful reference for further research on applying Transformers to other tasks.
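The CE/SE composition described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the actual TBEM uses Transformer blocks, and all function names, shapes, and the simple outer-product attention used here are hypothetical stand-ins for the channel- and spatial-dependency modeling the abstract describes.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_enhance(x):
    # Toy CE block: weight each channel by its dependency on the others.
    # x: feature map of shape (C, H, W).
    c = x.shape[0]
    d = x.reshape(c, -1).mean(axis=1)            # per-channel descriptor via global average pooling
    attn = softmax(np.outer(d, d) / np.sqrt(c))  # (C, C) channel-dependency weights
    gate = attn @ d                              # aggregated per-channel weight
    return x * gate[:, None, None]

def spatial_enhance(x):
    # Toy SE block: weight each spatial position by its dependency on the others.
    _, h, w = x.shape
    s = x.mean(axis=0).reshape(-1)               # per-position descriptor via channel-wise mean
    attn = softmax(np.outer(s, s) / np.sqrt(h * w))
    gate = (attn @ s).reshape(h, w)
    return x * gate[None, :, :]

def tbem(x):
    # Combine CE and SE sequentially, as in the abstract's description.
    return spatial_enhance(channel_enhance(x))

feat = np.random.rand(8, 4, 4).astype(np.float32)
out = tbem(feat)
print(out.shape)  # (8, 4, 4)
```

Because the output has the same shape as the input, such a module can be dropped between the convolutional stages of an existing CNN without changing the rest of the network, which is what makes the "easily embedded" claim plausible.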
