Abstract

Facial micro-expressions are often used to recognize the emotions of people in high-risk or high-pressure situations, because these low-intensity movements of facial action units may reflect genuine emotions. Current methods focus on locating the regions where emotional changes occur and cropping them for local feature extraction. However, overlapping cropped regions can introduce information redundancy. This paper proposes C3DBed, a novel model that embeds a three-dimensional convolutional neural network in a transformer. The model learns an attention weight for each local region of the micro-expression image, which lets it perceive detail changes in the facial image and extract robust local detail features. This addresses the model complexity and information redundancy caused by locating the low-intensity local regions of facial muscle movement. Experimental results show that the proposed C3DBed model achieves competitive performance, with accuracy rates of 78.04%, 77.64%, and 75.73% on the SMIC, CASME II, and SAMM datasets, respectively.
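
As a rough illustration of what learning an attention weight for each local region can look like, the following minimal sketch computes scaled dot-product self-attention over a handful of region feature vectors. It is not the C3DBed implementation: the function names, the toy feature values, and the use of raw features as queries, keys, and values (with no learned projections and no 3D convolution) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def region_attention(regions):
    """Scaled dot-product self-attention over region feature vectors.

    `regions` is a list of equal-length feature vectors, one per local
    facial region (hypothetical stand-ins for 3D-CNN patch embeddings).
    Queries, keys, and values are the raw features themselves; a real
    transformer layer would apply learned projections first.
    """
    dim = len(regions[0])
    scale = math.sqrt(dim)
    weights = []
    for q in regions:
        # Similarity of this region to every region, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in regions]
        weights.append(softmax(scores))
    # Each output vector is the attention-weighted sum of all region features.
    attended = [
        [sum(w * v[j] for w, v in zip(row, regions)) for j in range(dim)]
        for row in weights
    ]
    return weights, attended


# Toy features for three facial regions (made-up numbers, not real data).
toy_regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, attended = region_attention(toy_regions)
```

Each row of `weights` sums to 1 and describes how strongly one region attends to every other region, so regions with similar features reinforce each other instead of being cropped and processed separately.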
