Abstract

Facial expression recognition (FER) is a promising but challenging area of computer vision (CV). Many researchers have devoted significant resources to FER in recent years, but an impediment remains: classifiers perform well on fine-resolution images yet struggle to recognize in-the-wild human emotional states. To address this issue, we introduce three novel designs and implement them in neural networks. Specifically, we build an asymmetric pyramidal network (APNet) that employs multi-scale kernels instead of identically sized kernels, and we replace each square kernel with a sequence of square, horizontal, and vertical convolutions. This structure increases the descriptive ability of convolutional neural networks (CNNs) and transfers multi-scale features between layers. Additionally, we train the CNN with stochastic gradient descent with gradient centralization (SGDGC), which centralizes gradients to have zero mean, making training more efficient and stable. To verify the effectiveness of APNet with SGDGC, we experiment on three of the most popular emotion datasets: FER-2013, CK+, and JAFFE. The results and comparisons with state-of-the-art designs demonstrate that our method outperforms all single-model methods and performs comparably to model-fusion methods.
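
As a concrete illustration of the kernel replacement described above, the following PyTorch sketch stacks a square, a vertical, and a horizontal convolution in sequence. It is a minimal sketch, not the exact APNet definition: the module name, channel counts, and padding choices are our assumptions.

```python
import torch
import torch.nn as nn

class AsymmetricSequence(nn.Module):
    """A k x k convolution followed by k x 1 (vertical) and 1 x k
    (horizontal) convolutions; padding keeps the spatial size unchanged.
    Illustrative sketch only, not the paper's exact layer definition."""

    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        p = k // 2  # "same" padding for odd kernel sizes
        self.square = nn.Conv2d(in_ch, out_ch, (k, k), padding=(p, p))
        self.vertical = nn.Conv2d(out_ch, out_ch, (k, 1), padding=(p, 0))
        self.horizontal = nn.Conv2d(out_ch, out_ch, (1, k), padding=(0, p))

    def forward(self, x):
        return self.horizontal(self.vertical(self.square(x)))

x = torch.randn(1, 1, 48, 48)  # a 48 x 48 grayscale face, as in FER-2013
print(AsymmetricSequence(1, 32)(x).shape)  # torch.Size([1, 32, 48, 48])
```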

Highlights

  • According to Mehrabian’s survey [1], verbal components only convey one-third of the information that humans want to express; the other two-thirds are conveyed through non-verbal components

  • In Section II, we review several prior studies by other researchers that focused on asymmetric convolutions and multi-scale blocks or networks, but none of them combined these two techniques and inserted them into a single network

  • When we examined the findings of prior studies [44], [48], [49] alongside Eq. 3, we observed that the only difference between the centralized gradient ∇_GC L(W) and the standard gradient ∇L(W) is a mean value, computed over each weight vector or weight matrix, that is deducted from the gradient (see the sketch after this list)
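
The centralization step is simple to express in code. The sketch below follows the usual gradient-centralization formulation: for any weight tensor with more than one dimension, subtract the mean of its gradient computed over all dimensions except the output-channel one. The hand-rolled SGD step and the learning rate are illustrative assumptions, not the paper's exact optimizer.

```python
import torch

def centralize(grad):
    """Return a zero-mean version of the gradient: for multi-dimensional
    weight tensors, subtract the mean over all dims except dim 0 (the
    output channels). 1-D tensors such as biases are left unchanged."""
    if grad.dim() > 1:
        return grad - grad.mean(dim=tuple(range(1, grad.dim())), keepdim=True)
    return grad

def sgd_gc_step(params, lr=0.01):
    """One plain SGD update using centralized gradients (illustrative)."""
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * centralize(p.grad)
```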


Summary

INTRODUCTION

According to Mehrabian’s survey [1], verbal components only convey one-third of the information that humans want to express; the other two-thirds are conveyed through non-verbal components. Using the asymmetric decomposition technique described above, the Inception-v3 model yielded remarkable results in several sub-fields of CV. Another experiment using asymmetric blocks was conducted by Ma et al. [24], who applied a creative kernel shape in CNNs and called the new architecture RotateConv. Xie et al. [16] found that grouped convolution could improve classification accuracy as well as reduce training time. Cognizant of these two advantages, we adopted a similar strategy to capture features at different levels of layers and applied grouped convolution in APNet. For each layer in APNet, we divided a 3 × 3 block sequence (square kernel, vertical kernel, and horizontal kernel) into 1 group, a 5 × 5 block sequence into 4 groups, and a 7 × 7 block sequence into 8 groups, as sketched below.
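
One such pyramid layer, under assumed channel counts, might look like the following PyTorch sketch. The concatenation used to fuse the three branches is our assumption; the exact fusion in APNet is described in the full text.

```python
import torch
import torch.nn as nn

def branch(ch, k, groups):
    """One pyramid branch: a grouped square, vertical, and horizontal
    convolution sequence with kernel size k (spatial size preserved)."""
    p = k // 2
    return nn.Sequential(
        nn.Conv2d(ch, ch, (k, k), padding=(p, p), groups=groups),
        nn.Conv2d(ch, ch, (k, 1), padding=(p, 0), groups=groups),
        nn.Conv2d(ch, ch, (1, k), padding=(0, p), groups=groups),
    )

class PyramidLayer(nn.Module):
    """Three parallel scales with the group counts quoted above:
    3x3 in 1 group, 5x5 in 4 groups, 7x7 in 8 groups. Channel count
    and branch fusion are assumptions of this sketch."""

    def __init__(self, ch=32):
        super().__init__()
        self.b3 = branch(ch, 3, groups=1)
        self.b5 = branch(ch, 5, groups=4)
        self.b7 = branch(ch, 7, groups=8)

    def forward(self, x):
        # Fusing by concatenation is an assumption, not the paper's design.
        return torch.cat([self.b3(x), self.b5(x), self.b7(x)], dim=1)

x = torch.randn(1, 32, 48, 48)
print(PyramidLayer(32)(x).shape)  # torch.Size([1, 96, 48, 48])
```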

EXPERIMENTS
CONCLUSION