Abstract

Clinical investigations have demonstrated that the facial expression mimicry and cognitive capacities of patients with depression are substantially weakened, so their facial expressions carry great uncertainty. This uncertainty, together with the limited data available for depression patients, makes facial expression recognition (FER) for this population a difficult endeavor. In this paper, we propose the Depression Vision Transformer (Dep-ViT) to solve the above problems. Firstly, we invited 164 subjects to participate in the Voluntary Facial Expression Mimicry (VFEM) experiment. VFEM contains seven expressions: neutrality, anger, disgust, fear, happiness, sadness and surprise. We also employed Pearson correlation analysis to characterize the action units (AUs) in VFEM. Secondly, to limit the uncertainty, each small sample in Dep-ViT passes through a block composed of a Squeeze-and-Excitation (SE) module and a self-attention layer, which capture local attention information and sample importance; the labels of the samples that receive the least attention are re-labeled using rank regularization. Thirdly, in addition to the labels of VFEM itself, we manually labeled each expression image of VFEM again and used the manual labels to assist model training. The results show that Dep-ViT obtains excellent results, with an accuracy of 0.417 on VFEM.

Keywords: Dep-ViT, FER, SE, Manual label, VFEM
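The abstract does not give implementation details, but the mechanism it names (an SE module plus a self-attention layer producing per-sample importance scores, with the least-attended samples re-labeled in the spirit of rank regularization) can be sketched. The following PyTorch sketch is illustrative only: all module names, dimensions, the way attention is reduced to a per-sample score, and the re-labeling rule (re-label when the model's top prediction beats the given label by a margin) are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SEBlock(nn.Module):
    """Channel-wise Squeeze-and-Excitation over feature maps."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        squeezed = x.mean(dim=(2, 3))          # squeeze: global average pool
        weights = self.fc(squeezed)            # excitation: channel weights
        return x * weights[:, :, None, None]   # re-scale channels

class SampleAttention(nn.Module):
    """Self-attention over the batch to score each sample's importance."""
    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, dim); treat the batch as one sequence of samples
        seq = feats.unsqueeze(0)                   # (1, batch, dim)
        attended, _ = self.attn(seq, seq, seq)
        scores = torch.sigmoid(self.score(attended))
        return scores.squeeze(0).squeeze(-1)       # (batch,)

def rank_relabel(importance, logits, labels, low_ratio=0.3, margin=0.2):
    """Re-label the least-attended samples when the model's own prediction
    is clearly more confident than the given label (rank-regularization style;
    low_ratio and margin are assumed hyperparameters)."""
    probs = F.softmax(logits, dim=1)
    n_low = max(1, int(low_ratio * len(labels)))
    low_idx = torch.argsort(importance)[:n_low]    # lowest-importance samples
    new_labels = labels.clone()
    for i in low_idx:
        pred_prob, pred_cls = probs[i].max(dim=0)
        if pred_prob - probs[i, labels[i]] > margin:
            new_labels[i] = pred_cls
    return new_labels
```

In such a pipeline, the per-sample features fed to `SampleAttention` would typically be pooled SE outputs, e.g. `feats = se(x).mean(dim=(2, 3))`, and `rank_relabel` would be applied once per batch before computing the classification loss.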
