Abstract
Recognizing facial expressions plays a crucial role in various multimedia applications, such as human–computer interactions and the functioning of autonomous vehicles. This paper introduces a hybrid feature extraction network model to bolster the discriminative capacity of emotional features for multimedia applications. The proposed model comprises a convolutional neural network (CNN) and deep belief network (DBN) series. First, a spatial CNN network processed static facial images, followed by a temporal CNN network. The CNNs were fine-tuned based on facial expression recognition (FER) datasets. A deep belief network (DBN) model was then applied to integrate the segment-level spatial and temporal features. Deep fusion networks were jointly used to learn spatiotemporal features for discrimination purposes. Due to its generalization capabilities, we used a multi-class support vector machine classifier to classify the seven basic emotions in the proposed model. The proposed model exhibited 98.14% recognition performance for the JaFFE database, 95.29% for the KDEF database, and 98.86% for the RaFD database. It is shown that the proposed method is effective for all three databases, compared with the previous schemes for JAFFE, KDEF, and RaFD databases.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.