Abstract

Gesture recognition based on surface electromyography (sEMG) has been widely used in the field of human-machine interaction (HMI). However, sEMG has limitations, such as a low signal-to-noise ratio and insensitivity to fine finger movements, so we consider adding A-mode ultrasound (AUS) to improve recognition performance. To explore the influence of multisource sensing data on gesture recognition and to better integrate the features of the different modalities, we propose a multimodal multilevel converged attention network (MMCANet) for multisource signals composed of sEMG and AUS. The proposed model extracts the hidden features of the AUS signal with a convolutional neural network (CNN). Meanwhile, a hybrid CNN-LSTM (long short-term memory network) structure extracts spatial-temporal features from the sEMG signal. The two sets of CNN features from AUS and sEMG are then concatenated and passed to a transformer encoder, which fuses the information and interacts with the sEMG features to produce hybrid features. Finally, fully connected layers output the classification results. Attention mechanisms are used to adjust the weights of the feature channels. We compared the feature extraction and classification performance of MMCANet with that of manually extracted sEMG-AUS features classified by four traditional machine-learning (ML) algorithms; recognition accuracy increased by at least 5.15%. In addition, we evaluated single-modality deep learning (DL) methods based on a CNN. The experimental results showed that the proposed model improves accuracy by 14.31% and 3.80% over the CNN method using sEMG alone and AUS alone, respectively. Compared with several state-of-the-art fusion techniques, our method also achieved better results.
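To make the described fusion pipeline concrete, the sketch below outlines one plausible arrangement of the components named in the abstract: a CNN branch for AUS, a CNN-LSTM branch for sEMG, channel attention, a transformer encoder for fusion, and fully connected classification layers. All layer sizes, channel counts, and module names are assumptions for illustration; this is not the authors' released implementation.

```python
# Minimal illustrative sketch of an MMCANet-style fusion model.
# All dimensions and hyperparameters are assumed, not taken from the paper.
import torch
import torch.nn as nn

class MMCANetSketch(nn.Module):
    def __init__(self, n_classes=10, aus_ch=4, semg_ch=8, d_model=128):
        super().__init__()
        # CNN branch: hidden features from A-mode ultrasound (AUS)
        self.aus_cnn = nn.Sequential(
            nn.Conv1d(aus_ch, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, d_model, kernel_size=5, padding=2), nn.ReLU(),
        )
        # CNN + LSTM branch: spatial-temporal features from sEMG
        self.semg_cnn = nn.Sequential(
            nn.Conv1d(semg_ch, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, d_model, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.semg_lstm = nn.LSTM(d_model, d_model, batch_first=True)
        # Channel attention: reweights the concatenated feature channels
        self.chan_attn = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(2 * d_model, 2 * d_model), nn.Sigmoid(),
        )
        # Transformer encoder: fuses the concatenated CNN features
        enc_layer = nn.TransformerEncoderLayer(
            d_model=2 * d_model, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Fully connected classification head on the hybrid features
        self.head = nn.Sequential(
            nn.Linear(3 * d_model, 256), nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, aus, semg):
        # aus: (B, aus_ch, T), semg: (B, semg_ch, T)
        a = self.aus_cnn(aus)                          # (B, d_model, T)
        s = self.semg_cnn(semg)                        # (B, d_model, T)
        s_seq, _ = self.semg_lstm(s.transpose(1, 2))   # temporal sEMG features
        fused = torch.cat([a, s], dim=1)               # concatenate CNN features
        w = self.chan_attn(fused).unsqueeze(-1)        # per-channel weights
        fused = self.fusion((fused * w).transpose(1, 2))  # transformer fusion
        # Pool over time and combine with the sEMG temporal features
        feat = torch.cat([fused.mean(dim=1), s_seq.mean(dim=1)], dim=1)
        return self.head(feat)

# Example usage with dummy windows of 200 samples
model = MMCANetSketch()
logits = model(torch.randn(2, 4, 200), torch.randn(2, 8, 200))
```

The interaction between the fused representation and the sEMG temporal features is shown here as a simple late concatenation; the paper's actual cross-modal interaction may differ.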
