Abstract

To address the shortcomings of traditional facial expression recognition (FER) methods, which rely on a single feature and achieve only modest recognition rates, a FER method based on the fusion of transformed multilevel features and an improved weighted voting SVM (FTMS) is proposed. The algorithm combines transformed traditional shallow features with convolutional neural network (CNN) deep semantic features and uses an improved weighted voting method to make a comprehensive decision over the outputs of four trained SVM classifiers to obtain the final recognition result. The shallow features include local Gabor features, LBP features, and the joint geometric features designed in this study, which are composed of distance and deformation characteristics. The CNN deep feature is the multilayer CNN feature fusion proposed in this study. This study also proposes replacing Softmax with a better-performing SVM classifier on top of the CNN, given the poor separability of facial expressions. Experiments on the FERPlus database show that the recognition rate of this method is 17.2% higher than that of the traditional CNN, which demonstrates the effectiveness of fusing multilayer convolutional features with an SVM. FTMS-based facial expression recognition experiments are carried out on the JAFFE and CK+ datasets. Experimental results show that, compared with any single feature, the proposed algorithm achieves a higher recognition rate and greater robustness and makes full use of the advantages and characteristics of different features.
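The decision step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each of the four per-feature SVM classifiers outputs one predicted label, and that votes are weighted by each classifier's validation accuracy; the paper's "improved" weighting scheme may differ in detail.

```python
def weighted_vote(predictions, accuracies):
    """Combine per-classifier predictions into one decision.

    predictions: list of predicted expression labels, one per classifier.
    accuracies:  matching list of weights (here, assumed to be each
                 classifier's validation accuracy).
    Returns the label with the largest total weighted vote.
    """
    scores = {}
    for label, weight in zip(predictions, accuracies):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Hypothetical example: classifiers trained on Gabor, LBP, geometric,
# and CNN features vote on one test image.
preds = ["happy", "happy", "surprise", "happy"]
accs = [0.78, 0.81, 0.69, 0.90]
print(weighted_vote(preds, accs))  # -> happy
```

Weighting by validation accuracy lets a stronger classifier (here, the CNN-feature SVM) outvote weaker ones without discarding their contribution entirely.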

Highlights

  • Facial expression recognition (FER) refers to the use of computers to analyze human facial expressions and judge human psychology and emotions through pattern recognition and machine learning algorithms, thereby achieving intelligent human-computer interaction [1]

  • After proving the effectiveness of our proposed fusion of multilayer convolutional features as convolutional neural network (CNN) deep features, in Section 4.2 the fusion of transformed multilevel features and improved weighted voting SVM (FTMS)-based expression recognition experiment was carried out on the JAFFE [41] and CK+ [42] databases, and the results were compared and analyzed

  • The well-trained SVM classifiers based on the four features were tested on the test set. The final recognition rates of the four features on the two databases are shown in Figures 17 and 18


Introduction

FER refers to the use of computers to analyze human facial expressions and judge human psychology and emotions through pattern recognition and machine learning algorithms, thereby achieving intelligent human-computer interaction [1]. Traditional FER methods generally include three steps: face detection, feature extraction, and expression recognition [2, 3]. The most important part is feature extraction, which directly affects the final recognition result. Texture features commonly used in FER include Gabor and LBP. The Gabor filter has the same characteristics as the receptive field of visual cells and can analyze subtle changes in images across multiple scales and directions [4]. LBP is a texture operator that can effectively describe the local information of grayscale images [5]. In order to reduce dimensionality, we often extract histogram features from the LBP feature map instead of directly using the feature map for classification [6]. Geometric features locate key feature points in important facial regions (such as the eyebrows, eyes, nose, and mouth) and calculate the distances and angles between them [7].
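The LBP-plus-histogram step mentioned above can be sketched in a few lines. This is a minimal sketch of the basic 3×3 LBP operator: each pixel is encoded by comparing its 8 neighbours against it, and a 256-bin histogram of the codes serves as the feature vector. Practical FER pipelines usually refine this (uniform patterns, multi-region histograms), so treat the details here as illustrative assumptions.

```python
def lbp_histogram(img):
    """Compute a 256-bin histogram of basic 3x3 LBP codes.

    img: 2-D list of grayscale values (rows of equal length).
    Border pixels are skipped since they lack a full neighbourhood.
    """
    h, w = len(img), len(img[0])
    hist = [0] * 256
    # 8 neighbour offsets, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright.
                if img[y + dy][x + dx] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist


# Tiny 3x3 example: a single interior pixel yields a single LBP code.
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(lbp_histogram(patch)[120])  # -> 1
```

Using the histogram rather than the raw LBP map reduces a face region of any size to a fixed-length vector, which is what the per-feature SVM classifiers consume.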

