Hand gesture recognition (HGR) based on surface electromyogram (sEMG) and accelerometer (ACC) signals has attracted increasing attention, and the design of an effective multimodal fusion strategy plays a vital role. Existing studies fuse sEMG and ACC either in a low-level data space or in a high-level decision space; such strategies cannot adequately represent intramodal specificity and intermodal association at the same time. Moreover, current fusion methods tend to neglect multiscale characteristics and hierarchical relationships, and rarely account for the low discriminability of similar gestures. To address these issues, we propose a novel and practical hybrid fusion (HyFusion) model with multiscale attention and metric learning to improve the performance of HGR based on sEMG and ACC. In HyFusion, a fusion framework with parallel intramodal and intermodal branches is first designed to jointly exploit intramodal specificity and intermodal association. Furthermore, multiscale hierarchical features in each branch are fused through a multiscale spatial attention (SA) module to generate more effective representations. Finally, under the joint supervision of a softmax loss and a metric-learning loss, HyFusion extracts features that maximize intraclass aggregation and interclass separation, which particularly facilitates distinguishing similar gestures. The proposed HyFusion is validated on three public datasets (Ninapro DB2, DB3, and DB7) involving transradial amputees and healthy subjects, and the results demonstrate that it achieves state-of-the-art performance.
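The abstract does not specify which metric-learning loss accompanies the softmax supervision; the following minimal PyTorch sketch illustrates one common instantiation of such joint supervision, pairing cross-entropy with a center-style loss that pulls same-class embeddings toward a learnable class center. The class name `JointLoss`, the center-loss formulation, and the weight `lam` are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class JointLoss(nn.Module):
    """Joint supervision sketch: softmax (cross-entropy) loss plus a
    center-style metric-learning term encouraging intraclass aggregation.
    The center-loss form and weighting are assumptions for illustration."""

    def __init__(self, num_classes: int, feat_dim: int, lam: float = 0.01):
        super().__init__()
        self.ce = nn.CrossEntropyLoss()
        # one learnable center per gesture class (assumed formulation)
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.lam = lam

    def forward(self, logits, features, labels):
        # softmax classification term drives interclass separation
        loss_ce = self.ce(logits, labels)
        # squared distance of each embedding to its class center
        # drives intraclass aggregation
        loss_center = (features - self.centers[labels]).pow(2).sum(dim=1).mean()
        return loss_ce + self.lam * loss_center
```

In use, the fused sEMG+ACC network would produce both the classification logits and the penultimate feature embedding, and the criterion would be called as `loss = criterion(logits, features, labels)` during training.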