Abstract

For a long time, expressions have been something human beings are proud of, marking an essential difference between us and machines. With the development of computers, we are increasingly eager to enable communication between humans and machines, especially communication that involves emotion. The emotional growth of computers resembles the growth process of each of us: it begins with natural, intimate, and vivid interaction through observing and discerning emotions. Since the basic emotions (angry, disgusted, fearful, happy, neutral, sad, and surprised) were put forward, much research has been based on them, but little on compound emotions. In real life, however, people’s emotions are complex; single expressions cannot fully and accurately reflect people’s inner emotional changes, so exploring compound expression recognition is essential to daily life. In this paper, we propose a scheme that combines spatial- and frequency-domain transforms to implement end-to-end joint training based on ensembling models for appearance and geometric representation learning, for the recognition of compound expressions in the wild. We are mainly devoted to mining appearance and geometric information with deep learning models. For appearance feature acquisition, we adopt the idea of transfer learning, introducing a ResNet50 model pretrained on VGGFace2 for face recognition and fine-tuning it. Here, we try and compare two approaches: in the first, we fine-tune on two static expression databases, FER2013 and RAF Basic, for basic emotion recognition; in the second, we fine-tune on a three-channel input composed of images generated by the DWT2 and WAVEDEC2 wavelet transforms with the rbio3.1 and sym1 wavelet bases, respectively. For geometric feature acquisition, we first introduce a dense SIFT operator to extract facial key points and their histogram descriptors. We then introduce a deep SAE with a softmax function, a stacked LSTM, and a sequence-to-sequence model with stacked LSTMs, define their structures ourselves, feed the salient key points and their descriptors into the three models for training, and compare their performance. Once the models for appearance and geometric feature learning are trained, we combine them with the category labels for further end-to-end joint training, since ensembling models that describe different information can further improve recognition results. Finally, we validate the performance of the proposed framework on the RAF Compound database and achieve a recognition rate of 66.97%. Experiments show that integrating models that express different information and training them end to end can quickly and effectively improve recognition performance.
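
As a rough illustration of the frequency-domain input described above, the sketch below (ours, not the authors' released code) builds a three-channel image from a grayscale face by stacking the original image with the approximation bands of a single-level DWT2 (rbio3.1 basis) and a multilevel WAVEDEC2 (sym1 basis), using PyWavelets and OpenCV. The exact channel composition, normalization, and 224×224 target size are assumptions for illustration only.

```python
# Minimal sketch of a wavelet-based three-channel input (assumed composition:
# grayscale + rbio3.1 DWT2 approximation + sym1 WAVEDEC2 approximation).
import cv2
import numpy as np
import pywt


def wavelet_three_channel(gray_image, size=(224, 224)):
    """Stack a grayscale face with two wavelet approximation bands."""
    gray = gray_image.astype(np.float32)

    # Single-level 2-D DWT with the rbio3.1 basis; keep the approximation band.
    cA_rbio, _ = pywt.dwt2(gray, "rbio3.1")

    # Multilevel 2-D decomposition with the sym1 basis; coeffs[0] is the
    # coarsest approximation band.
    cA_sym = pywt.wavedec2(gray, "sym1", level=2)[0]

    def norm_resize(band):
        band = cv2.resize(band.astype(np.float32), size)
        band -= band.min()
        return band / (band.max() + 1e-8)

    channels = [norm_resize(b) for b in (gray, cA_rbio, cA_sym)]
    return np.stack(channels, axis=-1)  # H x W x 3, ready for a ResNet50 input


# Example usage (hypothetical file path):
# img = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)
# x = wavelet_three_channel(img)
```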

Highlights

  • Human language is divided into natural language and body language

  • Starting from these two aspects, we propose a scheme that combines spatial- and frequency-domain transforms to implement end-to-end joint training based on ensembling models for appearance and geometric representation learning, for the recognition of compound expressions in the wild (a sketch of such an ensemble follows this list)

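The following sketch (a minimal PyTorch illustration under our own assumptions, not the authors' implementation) shows one way such an ensemble could be wired for joint end-to-end training: a ResNet50 appearance branch and a stacked-LSTM geometric branch over dense-SIFT descriptors feeding a shared classifier. Note that torchvision ships ImageNet weights; the VGGFace2-pretrained weights used in the paper would have to be loaded separately, and the class count and feature sizes are assumptions.

```python
# Assumed sketch of ensembling an appearance branch and a geometric branch
# for joint end-to-end training on compound expression labels.
import torch
import torch.nn as nn
from torchvision import models


class CompoundExpressionEnsemble(nn.Module):
    def __init__(self, num_classes=11, sift_dim=128, lstm_hidden=256):
        super().__init__()
        # ImageNet weights as a stand-in; the paper uses VGGFace2 pretraining.
        backbone = models.resnet50(weights="IMAGENET1K_V1")
        backbone.fc = nn.Identity()              # expose 2048-d appearance features
        self.appearance = backbone
        # Stacked LSTM over a sequence of dense-SIFT descriptors per face.
        self.geometric = nn.LSTM(input_size=sift_dim, hidden_size=lstm_hidden,
                                 num_layers=2, batch_first=True)
        self.classifier = nn.Linear(2048 + lstm_hidden, num_classes)

    def forward(self, image, sift_seq):
        app = self.appearance(image)             # (B, 2048)
        _, (h, _) = self.geometric(sift_seq)     # h: (num_layers, B, hidden)
        geo = h[-1]                              # last layer's hidden state
        return self.classifier(torch.cat([app, geo], dim=1))


# Joint end-to-end training would optimize both branches together, e.g.:
# model = CompoundExpressionEnsemble()
# loss = nn.CrossEntropyLoss()(model(images, sift_sequences), compound_labels)
```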

Introduction

Human language is divided into natural language and body language, and facial expression is part of body language. As a non-linguistic signal of human beings [1], facial expressions carry rich personal and social-interaction information and convey information about people’s cognitive behavior, temperament, personality, authenticity, and psychology; almost none of this information can be replaced by other means of expression. When people see different faces, they can recognize the same expression on them, an ability referred to as facial expression recognition. The study of facial expression began in the 19th century: in 1872, Darwin elaborated on the connections and differences between human and animal facial expressions in his famous work [2], and in 1971, Ekman and Friesen did pioneering work on modern facial expression recognition [3].
