Expressions In Video Sequences Research Articles

Automatically recognizing facial expressions in video sequences is a challenging task because there is little direct correlation between facial features and subjective emotions in video. To overcome the problem, a video facial expression recognition method using spatiotemporal recurrent neural network and feature fusion is proposed. Firstly, the video is preprocessed. Then, the double-layer cascade structure is used to detect a face in a video image. In addition, two deep convolutional neural networks are used to extract the time-domain and airspace facial features in the video. The spatial convolutional neural network is used to extract the spatial information features from each frame of the static expression images in the video. The temporal convolutional neural network is used to extract the dynamic information features from the optical flow information from multiple frames of expression images in the video. A multiplication fusion is performed with the spatiotemporal features learned by the two deep convolutional neural networks. Finally, the fused features are input to the support vector machine to realize the facial expression classification task. The experimental results on cNTERFACE, RML, and AFEW6.0 datasets show that the recognition rates obtained by the proposed method are as high as 88.67%, 70.32%, and 63.84%, respectively. Comparative experiments show that the proposed method obtains higher recognition accuracy than other recently reported methods.

Read full abstract

Video-based facial expression recognition (FER) has recently received increased attention as a result of its widespread application. Using only one type of feature to describe facial expression in video sequences is often inadequate, because the information available is very complex. With the emergence of different features to represent different properties of facial expressions in videos, an appropriate combination of these features becomes an important, yet challenging, problem. Considering that the dimensionality of these features is usually high, we thus introduce multiview dimension reduction (MVDR) into video-based FER. In MVDR, it is critical to explore the relationships between and within different feature views. To achieve this goal, we propose a novel framework of MVDR by enforcing joint structured sparsity at both inter- and intraview levels. In this way, correlations on and between the feature spaces of different views tend to be well-exploited. In addition, a transformation matrix is learned for each view to discover the patterns contained in the original features, so that the different views are comparable in finding a common representation. The model can be not only performed in an unsupervised manner, but also easily extended to a semisupervised setting by incorporating some domain knowledge. An alternating algorithm is developed for problem optimization, and each subproblem can be efficiently solved. Experiments on two challenging video-based FER datasets demonstrate the effectiveness of the proposed framework.

Read full abstract

Expressions In Video Sequences Research Articles

Articles published on Expressions In Video Sequences

Clip-aware expressive feature learning for video-based facial expression recognition

Video Expression Recognition Method Based on Spatiotemporal Recurrent Neural Network and Feature Fusion

Learning the spatiotemporal variability in longitudinal shape data sets

Video-Based Facial Expression Recognition using Deep Temporal–Spatial Networks

Video facial emotion recognition based on local enhanced motion history image and CNN-CTSLSTM networks

A dynamic framework based on local Zernike moment and motion history image for facial expression recognition

Joint Structured Sparsity Regularized Multiview Dimension Reduction for Video-Based Facial Expression Recognition

Smooth adaptive fitting of 3D face model for the estimation of rigid and nonrigid facial motion in video sequences

Recognising facial expressions in video sequences

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Expressions In Video Sequences Research Articles

Articles published on Expressions In Video Sequences

Clip-aware expressive feature learning for video-based facial expression recognition

Video Expression Recognition Method Based on Spatiotemporal Recurrent Neural Network and Feature Fusion

Learning the spatiotemporal variability in longitudinal shape data sets

Video-Based Facial Expression Recognition using Deep Temporal–Spatial Networks

Video facial emotion recognition based on local enhanced motion history image and CNN-CTSLSTM networks

A dynamic framework based on local Zernike moment and motion history image for facial expression recognition

Joint Structured Sparsity Regularized Multiview Dimension Reduction for Video-Based Facial Expression Recognition

Smooth adaptive fitting of 3D face model for the estimation of rigid and nonrigid facial motion in video sequences

Recognising facial expressions in video sequences