Abstract

Facial expression recognition in video has attracted growing attention in recent years. In this paper, we propose a new framework for emotion recognition in video based on multiple feature fusion. First, in preprocessing, coarse-to-fine auto-encoder networks (CFAN) are applied to the video database to progressively optimize and align the facial expression images. Second, in the feature extraction stage, two sets of global features, convolutional neural network (CNN) features and Gist features, are extracted from video clips, while local binary patterns (LBP) and local phase quantisation from three orthogonal planes (LPQ-TOP) features are extracted from local facial regions. Discriminative multiple canonical correlation analysis (DMCCA) is then used to fuse the global and local feature groups, and the kernel entropy component analysis (KECA) algorithm is applied to reduce the feature dimensionality. Finally, a support vector machine (SVM) classifies the facial expressions. Experiments on the RML and SAVEE facial expression video databases show that the proposed feature fusion effectively improves expression recognition accuracy.
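To make the local feature stage concrete, the sketch below computes a basic 8-neighbour LBP code map and its normalised 256-bin histogram with NumPy. This is an illustrative minimal version of the LBP descriptor named above, not the authors' implementation; the function name `lbp_histogram` and the neighbour ordering are our own assumptions.

```python
import numpy as np

def lbp_histogram(img):
    """Basic LBP: threshold each pixel's 8 neighbours against the centre,
    pack the results into an 8-bit code, and return the normalised
    256-bin histogram of codes. Illustrative sketch only."""
    c = img[1:-1, 1:-1]  # interior pixels (centres)
    # Offsets of the 8 neighbours, ordered clockwise from top-left;
    # each offset contributes one bit of the LBP code.
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy: img.shape[0] - 1 + dy,
                 1 + dx: img.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()  # normalised 256-dimensional descriptor

# Example: on a flat patch every neighbour equals the centre,
# so every pixel receives code 255 (all 8 bits set).
flat = np.full((8, 8), 10, dtype=np.uint8)
h = lbp_histogram(flat)
```

In the full pipeline, such per-region histograms would be concatenated and passed to the DMCCA fusion step alongside the global CNN and Gist features.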
