Abstract

Clinical investigations have demonstrated that the facial expression mimicry and cognitive capacities of patients with depression are substantially weakened, so their facial expressions carry great uncertainty. This uncertainty, together with the limited data available for depression patients, makes facial expression recognition (FER) for this population a difficult endeavor. In this paper, we propose the Depression Vision Transformer (Dep-ViT) to solve the above problems. Firstly, we invited 164 subjects to participate in the Voluntary Facial Expression Mimicry (VFEM) experiment. VFEM contains seven expressions: neutrality, anger, disgust, fear, happiness, sadness and surprise. We also employed Pearson correlation analysis to characterize the action units (AUs) in VFEM. Secondly, to limit the uncertainty, each small sample in Dep-ViT passes through a block composed of a Squeeze-and-Excitation (SE) module and a self-attention layer, which capture local attention information and sample importance; the labels of the samples that receive the least attention are re-labeled using rank regularization. Thirdly, in addition to the labels of VFEM itself, we manually labeled each expression image of VFEM again and used the manual labels to assist model training. The results show that Dep-ViT obtains excellent results, with an accuracy of 0.417 on VFEM.

Keywords: Dep-ViT, FER, SE, Manual label, VFEM
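The abstract does not give implementation details, but the mechanism it names (an SE module plus a self-attention layer producing per-sample importance scores, with the least-attended samples re-labeled in the spirit of rank regularization) can be sketched. The following PyTorch sketch is illustrative only: all module names, dimensions, the way attention is reduced to a per-sample score, and the re-labeling rule (re-label when the model's top prediction beats the given label by a margin) are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SEBlock(nn.Module):
    """Channel-wise Squeeze-and-Excitation over feature maps."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        squeezed = x.mean(dim=(2, 3))          # squeeze: global average pool
        weights = self.fc(squeezed)            # excitation: channel weights
        return x * weights[:, :, None, None]   # re-scale channels

class SampleAttention(nn.Module):
    """Self-attention over the batch to score each sample's importance."""
    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, dim); treat the batch as one sequence of samples
        seq = feats.unsqueeze(0)                   # (1, batch, dim)
        attended, _ = self.attn(seq, seq, seq)
        scores = torch.sigmoid(self.score(attended))
        return scores.squeeze(0).squeeze(-1)       # (batch,)

def rank_relabel(importance, logits, labels, low_ratio=0.3, margin=0.2):
    """Re-label the least-attended samples when the model's own prediction
    is clearly more confident than the given label (rank-regularization style;
    low_ratio and margin are assumed hyperparameters)."""
    probs = F.softmax(logits, dim=1)
    n_low = max(1, int(low_ratio * len(labels)))
    low_idx = torch.argsort(importance)[:n_low]    # lowest-importance samples
    new_labels = labels.clone()
    for i in low_idx:
        pred_prob, pred_cls = probs[i].max(dim=0)
        if pred_prob - probs[i, labels[i]] > margin:
            new_labels[i] = pred_cls
    return new_labels
```

In such a pipeline, the per-sample features fed to `SampleAttention` would typically be pooled SE outputs, e.g. `feats = se(x).mean(dim=(2, 3))`, and `rank_relabel` would be applied once per batch before computing the classification loss.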
