Abstract

Facial expression detection predicts human emotions from facial images and is an active research topic with applications in human-robot interaction. Deep convolutional neural networks provide robust feature extractors but tend to be slow in real-time settings, often requiring large memory and graphics processing units (GPUs) for fast execution. In this article, an efficient CPU-based facial expression detector is proposed that uses a sequential attention network to improve baseline performance. The proposed attention network consists of three modules: a global representation module that captures global features, a channel representation module that focuses on channel attention, and a dimension representation module that uses spatial attention to discriminate local features. An efficient partial transfer module is also presented as a lightweight backbone for extracting facial features from an image. The entire model is trained and tested on several benchmarks to classify seven facial expressions. As a result, the proposed model reaches accuracies of 98.18%, 98.75%, 95.63%, and 74.17% on CK+, JAFFE, KDEF, and FER-2013, respectively, which is competitive with state-of-the-art methods. Finally, integrated with a face detector, it runs in real time at 69 frames per second on a CPU without hardware constraints.
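To make the channel- and spatial-attention idea concrete, below is a minimal PyTorch sketch of one common realization: squeeze-and-excitation style channel attention applied sequentially before a CBAM-style spatial attention step. The module designs, layer sizes, and feature dimensions here are illustrative assumptions, not the authors' implementation.

# Minimal sketch (not the authors' code) of sequential channel-then-spatial
# attention over backbone feature maps. All sizes are illustrative.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (assumed design)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pool per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight each channel by its learned importance

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention (assumed design)."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)   # average over channels
        mx, _ = x.max(dim=1, keepdim=True)  # max over channels
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w  # reweight each spatial position

# Sequential application, as the abstract's "sequential attention network" suggests:
feat = torch.randn(1, 64, 28, 28)  # dummy backbone feature map
out = SpatialAttention()(ChannelAttention(64)(feat))

In this formulation, channel attention discriminates which feature maps matter while spatial attention localizes where in the face they matter, which matches the abstract's split between channel-focused and spatially-focused local feature discrimination.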
