This paper proposes a facial expression recognition network called the Lightweight Facial Network with Spatial Bias (LFNSB), which balances model complexity against recognition accuracy. It has two key components: a lightweight feature extraction network (LFN) and a Spatial Bias (SB) module for aggregating global information. The LFN combines channel operations with depth-wise convolution, reducing the number of parameters while enhancing feature representation capability. The Spatial Bias module enables the model to focus on local facial features while capturing the dependencies between different facial regions. Additionally, a new loss function called Cosine-Harmony Loss is designed. This loss optimizes the relative positions of feature vectors in high-dimensional space, resulting in better feature separation and clustering. Experimental results on the AffectNet and RAF-DB datasets demonstrate that the LFNSB model achieves competitive recognition accuracy, with 63.12% accuracy on AffectNet-8, 66.57% accuracy on AffectNet-7, and 91.07% accuracy on RAF-DB, while significantly reducing model complexity.
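
The abstract does not give the LFN layer configuration, so the following is only a generic sketch of why depth-wise convolution reduces parameter count. It compares the weight count of a standard convolution with that of a depth-wise convolution followed by a 1x1 point-wise convolution. The function names and the layer sizes (64 input channels, 128 output channels, 3x3 kernel) are hypothetical and are not taken from the paper.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a depth-wise k x k convolution followed by a
    1 x 1 point-wise convolution (bias omitted)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen for illustration, not from the paper.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {std}, depth-wise separable: {sep}, ratio: {std / sep:.2f}")
```

For these assumed sizes the separable form needs 8,768 weights against 73,728 for the standard convolution. The actual savings in LFNSB depend on its real architecture, which the abstract does not describe.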