Abstract

Facial expression recognition is an interesting research direction of pattern recognition and computer vision. It has been increasingly used in artificial intelligence, human–computer interaction and security monitoring. In recent years, Convolution Neural Network (CNN) as a deep learning technique and multiple classifier combination method has been applied to gain accurate results in classifying face expressions. In this paper, we propose a multimodal classification approach based on a local texture descriptor representation and a combination of CNN to recognize facial expression. Initially, in order to reduce the influence of redundant information, the preprocessing stage is performed using face detection, face image cropping and texture descriptors of Local Binary Pattern (LBP), Local Gradient Code (LGC), Local Directional Pattern (LDP) and Gradient Direction Pattern (GDP) calculation. Second, we construct a cascade CNN architecture using the multimodal data of each descriptor (CNNLBP, CNNLGC, CNNGDP and CNNLDP) to extract facial features and classify emotions. Finally, we apply aggregation techniques (sum and product rule) for each modality to combine the four multimodal outputs and thus obtain the final decision of our system. Experimental results using CK[Formula: see text] and JAFFE database show that the proposed multimodal classification system achieves superior recognition performance compared to the existing studies with classification accuracy of 97, 93% and 94, 45%, respectively.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call