Abstract

Facial expressions are the most common medium for expressing human emotions. Due to the wide range of real-world applications, facial expression understanding has received extensive attention from researchers. One of the most vital issues of facial expression recognition is the extraction and modeling of the temporal dynamics of facial emotions from videos. Additionally, the rapid growth of video data from various multimedia sources is becoming a serious concern. Therefore, to address these issues, in this paper, we introduce a novel approach on top of Spark for facial expression understanding from videos. First, we propose a new dynamic feature descriptor, namely, the local directional structural pattern from three orthogonal planes (LDSP-TOP), which analyzes the structural aspects of the local dynamic texture. Second, we design a 1-D convolutional neural network (CNN) to capture additional discriminative features. Third, a long short-term memory (LSTM) autoencoder is employed to learn the spatiotemporal features. Finally, an extensive experimental investigation is carried out to demonstrate the performance and scalability of the proposed framework.

Highlights

  • Facial expression understanding has gained more attention from computer vision researchers since it plays a significant role in numerous applications, such as sociable robots, mental disorder diagnosis, intelligent homes, driver fatigue monitoring, and entertainment [1]

  • We propose a novel framework on top of Spark for dynamic facial expression recognition that addresses the aforementioned concerns by effectively and efficiently processing videos

  • We propose a new dynamic texture descriptor, the local directional structural pattern from three orthogonal planes (LDSP-TOP), that provides a stable representation of facial emotions

Read more

Summary

INTRODUCTION

Facial expression understanding has gained more attention from computer vision researchers since it plays a significant role in numerous applications, such as sociable robots, mental disorder diagnosis, intelligent homes, driver fatigue monitoring, and entertainment [1]. Dynamic texture-based descriptors, e.g., volume local binary patterns (VLBP) [7], local binary patterns from three orthogonal planes (LBP-TOP) [7], and multiresolution-based LBP-TOP [8], have become very popular for facial expression recognition due to their computational efficiency and effectiveness. These works fail to extract a stable description of facial expressions. We propose a novel framework on top of Spark for dynamic facial expression recognition that addresses the aforementioned concerns by effectively and efficiently processing videos.

RELATED WORK
EXPERIMENTAL ANALYSIS
Findings
CONCLUSION
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.