Abstract

Micro-expression recognition is an interdisciplinary research topic spanning psychology and computer science, with a wide range of applications (e.g., psychological and clinical diagnosis, emotional analysis, and criminal investigation). However, the subtle and diverse changes in facial muscles make it difficult for existing methods to extract effective features, which limits micro-expression recognition accuracy. We therefore propose a multi-scale joint feature network based on optical flow images for micro-expression recognition. First, we generate an optical flow image that reflects subtle facial motion information. The optical flow image is then fed into the multi-scale joint network for feature extraction and classification. The proposed joint feature module (JFM) integrates features from different layers, which helps capture micro-expression features of different amplitudes. To improve the recognition ability of the model, we also fuse the prediction results of the three JFMs with that of the backbone network. Experimental results show that our method outperforms state-of-the-art methods on three benchmark datasets (SMIC, CASME II, and SAMM) and a combined dataset (3DB).
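
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so averaging the class probabilities of the four branches (three JFM heads plus the backbone head) is an assumption made here for illustration only. The function names and the example scores are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(branch_logits):
    """Average the class probabilities produced by every branch.

    branch_logits: list of per-branch score lists, e.g. three JFM heads
    plus the backbone head. Averaging is an assumed fusion rule; the
    abstract does not specify one.
    """
    probs = [softmax(logits) for logits in branch_logits]
    n_classes = len(probs[0])
    fused = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: fused[c])
    return fused, label


# Hypothetical scores for 3 classes from three JFM heads and the backbone.
jfm_1 = [2.0, 0.5, 0.1]
jfm_2 = [0.3, 1.2, 0.2]
jfm_3 = [1.5, 0.4, 0.3]
backbone = [1.0, 0.9, 0.2]
fused, label = fuse_predictions([jfm_1, jfm_2, jfm_3, backbone])
```

In this toy example the second JFM head disagrees with the other branches, but the fused prediction still follows the majority, which is the usual motivation for combining several prediction heads.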

Highlights

  • Micro-expressions are brief facial expressions that people unconsciously make when they try to hide a real emotion

  • Liu et al. [8] proposed the main directional mean optical-flow (MDMO) method based on regions of interest, which reduced the dimensionality of features and improved the robustness of micro-expression recognition

  • The color change is more pronounced in areas where the micro-expression appears. Section 3.2 (Multi-scale joint feature network) introduces the details of the proposed network structure for micro-expression recognition
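
The remark about color change in the optical flow image can be made concrete with a small sketch. It assumes a common color-coding convention in which hue encodes motion direction and brightness encodes motion magnitude. The source does not say which encoding the authors use, so this mapping and the name `flow_to_rgb` are illustrative assumptions, and the flow field itself is taken as given rather than computed.

```python
import colorsys
import math


def flow_to_rgb(flow):
    """Color-code a dense flow field.

    flow: 2-D grid (list of rows) of (dx, dy) displacement pairs.
    Hue encodes motion direction and brightness encodes magnitude, so
    still regions come out black and moving regions are vividly colored.
    This encoding is an assumption, not the paper's stated procedure.
    """
    max_mag = max(math.hypot(dx, dy) for row in flow for dx, dy in row)
    image = []
    for row in flow:
        out_row = []
        for dx, dy in row:
            # Map the direction angle from [-pi, pi] to a hue in [0, 1).
            hue = ((math.atan2(dy, dx) + math.pi) / (2 * math.pi)) % 1.0
            # Normalize magnitude by the largest displacement in the field.
            value = math.hypot(dx, dy) / max_mag if max_mag > 0 else 0.0
            r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
            out_row.append((round(r * 255), round(g * 255), round(b * 255)))
        image.append(out_row)
    return image


# Toy 2x2 field: only the bottom-right pixel moves.
toy_flow = [[(0.0, 0.0), (0.0, 0.0)],
            [(0.0, 0.0), (1.0, 0.0)]]
toy_image = flow_to_rgb(toy_flow)
```

With this encoding, pixels without motion stay black while the moving pixel receives a saturated color, which mirrors the observation that facial regions involved in a micro-expression stand out in the flow image.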


Summary

Introduction

Micro-expressions are brief facial expressions that people unconsciously make when they try to hide a real emotion. Ekman and Friesen [2] analyzed a video of a conversation between a psychiatrist and a depressed patient, and observed that painful expressions occasionally appeared during the patient's smile, which they called micro-expressions. This was the first time that the term micro-expression was used. Early deep learning models combined convolutional neural networks (CNN) and long short-term memory (LSTM) for micro-expression recognition. The complexity of such models can lead to overfitting during training when there are insufficient samples. Some research works [12, 13] use shallow multi-stream networks to improve model performance on small datasets and in class-imbalanced situations. We compare the proposed method with state-of-the-art methods; the results show that the performance of our model is competitive on three benchmark datasets and a cross-dataset.

Related work
Traditional methods
Deep learning methods
Method
Optical flow image
Backbone network
Joint feature module
Fusion strategy
Experiments
Datasets
Metrics
Implementation details
Ablation study
Comparisons
Findings
Conclusions