Abstract To support the emotional well-being of elderly individuals living alone, home care robotic systems must be able to identify facial expressions precisely in complex domestic settings. Facial expression recognition (FER) in such environments faces significant challenges due to factors such as facial occlusions. To address these challenges, this paper proposes the Dual-Branch Attention and Multi-Scale Feature Fusion Network (DAMFF-Net). First, we extract features from facial images and feed the resulting feature maps into an improved dual-branch attention fusion module (DBAF) to capture long-range dependencies between different facial regions. In parallel, a residual multi-scale module of our own design extracts fine-grained multi-scale features, ensuring that both preceding and subsequent feature subsets contain rich scale information. Next, we globally fuse the feature maps from the feature extraction stage with those from the residual multi-scale module to improve recognition accuracy when some facial regions are occluded. Finally, we apply decision-level fusion to the classification results. Experiments were conducted on the RAF-DB, CK+ and AffectNet-7 datasets, and comparative results indicate that the proposed method improves facial expression recognition accuracy by 5.79%, 6.68% and 5.86%, respectively.
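The decision-level fusion step can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so the code below assumes one common choice: a weighted average of the class probabilities produced by each branch. The function names (`fuse_decisions`, `predict`), the weights, and the example logits are hypothetical and are not taken from DAMFF-Net.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_decisions(branch_logits, weights=None):
    """Weighted average of per-branch class probabilities.

    Assumption: this averaging rule is illustrative only; the paper's
    actual decision-level fusion may differ.
    """
    if weights is None:
        # Default to equal weight for every branch.
        weights = [1.0 / len(branch_logits)] * len(branch_logits)
    probs = [softmax(logits) for logits in branch_logits]
    n_classes = len(probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probs))
        for c in range(n_classes)
    ]


def predict(branch_logits, weights=None):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_decisions(branch_logits, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical logits for three expression classes from two branches.
attention_logits = [2.0, 0.5, 0.1]
multiscale_logits = [0.2, 0.3, 3.0]

equal_vote = predict([attention_logits, multiscale_logits])
attention_heavy = predict([attention_logits, multiscale_logits], [0.9, 0.1])
```

With equal weights the more confident multi-scale branch decides the outcome, while weighting the attention branch heavily flips the prediction. This shows how fusing at the decision level lets one branch compensate when the other is unreliable, for example under occlusion.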