Symmetric Multi-Scale Residual Network Ensemble with Weighted Evidence Fusion Strategy for Facial Expression Recognition

Juan Liu,Zhong Huang,Ying Wang,Julang Jiang,Min Hu

doi:10.3390/sym15061228

Abstract

To extract facial features with different receptive fields and improve the decision fusion performance of network ensemble, a symmetric multi-scale residual network (SMResNet) ensemble with a weighted evidence fusion (WEF) strategy for facial expression recognition (FER) was proposed. Firstly, aiming at the defect of connecting different filter groups of Res2Net only from one direction in a hierarchical residual-like style, a symmetric multi-scale residual (SMR) block, which can symmetrically extract the features from two directions, was improved. Secondly, to highlight the role of different facial regions, a network ensemble was constructed based on three networks of SMResNet to extract the decision-level semantic of the whole face, eyes, and mouth regions, respectively. Meanwhile, the decision-level semantics of three regions were regarded as different pieces of evidence for decision-level fusion based on the Dempster-Shafer (D-S) evidence theory. Finally, to fuse the different regional expression evidence of the network ensemble, which has ambiguity and uncertainty, a WEF strategy was introduced to overcome conflicts within evidence based on the support degree adjustment. The experimental results showed that the facial expression recognition rates achieved 88.73%, 88.46%, and 88.52% on FERPlus, RAF-DB, and CAER-S datasets, respectively. Compared with other state-of-the-art methods on three datasets, the proposed network ensemble, which not only focuses the decision-level semantics of key regions, but also addresses to the whole face for the absence of regional semantics under occlusion and posture variations, improved the performance of facial expression recognition in the wild.

Full Text