Microexpressions are subtle facial movements that occur within an extremely brief time frame, often revealing suppressed emotions. These expressions hold significant importance across various fields, including security monitoring and human–computer interaction. However, the accuracy of microexpression recognition is severely constrained by how brief and subtle these expressions are. To address the low detection accuracy for the subtle features of facial action units in microexpressions, this paper proposes a microexpression action unit detection algorithm, Attention-embedded Dual Path and Shallow Three-stream Networks (ADP-DSTN), which combines an attention-embedded dual path network with a shallow three-stream network. First, an attention mechanism was embedded after each Bottleneck layer in the foundational Dual Path Networks to extract static features representing subtle texture variations that have significant weights in the action units. Subsequently, a shallow three-stream 3D convolutional neural network was employed to extract optical flow features that were particularly sensitive to temporal and discriminative characteristics specific to microexpression action units. Finally, the acquired static facial feature vectors and optical flow feature vectors were concatenated to form a fused feature vector that encompassed more effective information for recognition. Each facial action unit was then trained individually to address the issue of weak correlations among the facial action units, thereby facilitating the classification of microexpression emotions. The experimental results demonstrated that the proposed method performed well across several microexpression datasets. The unweighted average recall (UAR) values were 80.71%, 89.55%, 44.64%, 80.59%, and 88.32% for the SAMM, CASME II, CAS(ME)³, SMIC, and MEGC2019 datasets, respectively. The corresponding unweighted F1 scores (UF1) were 79.32%, 88.30%, 43.03%, 81.12%, and 88.95%.
Furthermore, compared to the benchmark model, the proposed model achieved better performance with lower computational complexity: 1087.350 M floating-point operations (FLOPs) and 6.356 × 10⁶ model parameters.
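The UAR and UF1 figures reported above are class-balanced metrics. As a minimal sketch of how they are commonly computed (assuming the usual definitions, in which per-class recall and per-class F1 are averaged with equal weight for every class), the following Python function may help in reading the numbers. The function name `uar_uf1` and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def uar_uf1(y_true, y_pred):
    """Return (UAR, UF1) for two equal-length label sequences.

    Assumed definitions (not confirmed by the abstract):
      UAR = mean over classes of TP_c / (TP_c + FN_c)
      UF1 = mean over classes of 2*TP_c / (2*TP_c + FP_c + FN_c)
    Classes are taken from y_true, so every class has at least one sample.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # the true class was missed
            fp[p] += 1  # the predicted class was a false alarm

    classes = sorted(set(y_true))
    recalls = [tp[c] / (tp[c] + fn[c]) for c in classes]
    f1s = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in classes]
    return sum(recalls) / len(classes), sum(f1s) / len(classes)


# Hypothetical three-class example (labels are illustrative only).
example_true = ["neg", "neg", "neg", "neg", "pos", "pos", "sur", "sur"]
example_pred = ["neg", "neg", "neg", "pos", "pos", "pos", "sur", "neg"]
example_uar, example_uf1 = uar_uf1(example_true, example_pred)
```

Because every class contributes equally regardless of its sample count, both metrics penalize a model that performs well only on the majority class, which matters for the small, imbalanced microexpression datasets listed above.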