Abstract

Group-level emotion recognition (GER) is challenging because it relies jointly on individual facial expressions, complex group relationships, and contextual scene information. Owing to complicated emotion interactions and emotion bias among multiple emotion cues, current techniques still struggle to recognize complex group emotions. In this study, we propose a context-consistent cross-graph neural network (ConGNN) for accurate GER in the wild. It models multi-cue emotional relations and alleviates emotion bias among different cues, thus obtaining a robust and consistent group emotion representation. In ConGNN, we first extract facial, local object, and global scene features to form multi-cue emotion features. We then develop a cross-graph neural network (C-GNN) that models inter- and intra-branch emotion relations, yielding a comprehensive cross-branch emotion representation. To alleviate the effect of emotion bias during C-GNN training, we propose an emotion context-consistent learning mechanism with an emotion bias penalty, which encourages context-consistent group emotion representations and thereby achieves robust GER. Furthermore, we create a new, more realistic benchmark, SiteGroEmo, and use it to evaluate ConGNN. Extensive experiments on two challenging GER datasets (GroupEmoW and SiteGroEmo) demonstrate that ConGNN outperforms state-of-the-art techniques, with relative accuracy gains of 3.35% and 4.32%, respectively.
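The pipeline summarized above (multi-cue feature extraction, cross-branch graph message passing, and a bias penalty for context consistency) can be sketched in miniature as follows. This is an illustrative assumption of the overall data flow only, not the authors' implementation: node counts, feature dimensions, the fully connected adjacency, and the variance-based penalty form are all placeholders.

```python
import numpy as np

# Hedged sketch of a ConGNN-style pipeline (illustrative only; all
# dimensions and the penalty form are assumptions, not the paper's).
rng = np.random.default_rng(0)
d = 16                                 # assumed feature dimension
face = rng.standard_normal((4, d))     # facial features (one per face)
obj = rng.standard_normal((3, d))      # local object features
scene = rng.standard_normal((1, d))    # global scene feature

# Stack all cues into one graph whose nodes span the three branches.
nodes = np.concatenate([face, obj, scene], axis=0)   # (8, d)
n = nodes.shape[0]

# Fully connected cross-branch graph with strong self-loops, so one
# round of message passing mixes inter- and intra-branch relations
# while keeping each node's own cue dominant.
adj = np.ones((n, n)) + n * np.eye(n)
adj /= adj.sum(axis=1, keepdims=True)                # row-normalize

# One round of linear message passing, then mean-pool to obtain a
# group-level emotion representation.
nodes = adj @ nodes
group_repr = nodes.mean(axis=0)                      # (d,)

# Toy "emotion bias penalty": penalize disagreement between the
# per-branch mean representations, pushing the branches toward a
# context-consistent group emotion.
branch_means = np.stack(
    [nodes[:4].mean(axis=0), nodes[4:7].mean(axis=0), nodes[7:].mean(axis=0)]
)
bias_penalty = np.var(branch_means, axis=0).sum()

print(group_repr.shape, float(bias_penalty) >= 0.0)
```

In a real model the adjacency would be learned or attention-weighted and the penalty would enter the training loss; the sketch only shows how the three cue branches share one graph and how a consistency term can compare them.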
