Abstract

A silent speech recognition (SSR) system recognizes verbal expressions when speech signals are not accessible. Such systems have found applications both in assisting people with speech impediments and in human–computer interaction. In this paper, we propose an SSR system based on surface electromyography (sEMG) signals. We first design a Mandarin corpus containing 135 classes of utterances and collect 8-channel sEMG data from 12 subjects. After preprocessing the raw sEMG signals, we extract a time–frequency hybrid feature map for each sample. We then apply two data augmentation strategies to enlarge the dataset and improve model performance. Finally, we evaluate the proposed models with 5-fold cross-validation on two tasks: multi-subject mixed classification and single-subject classification. The GRU model performs best, achieving a recognition accuracy of 88.01% on the multi-subject mixed classification task and a maximum recognition accuracy of 97.19% on the single-subject classification task. Experimental results demonstrate the effectiveness of the proposed system.
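
The 5-fold cross-validation protocol mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the folds were drawn (for example, whether they were stratified by class or by subject), so the plain shuffled split below is an assumption, and the names `k_fold_indices`, `cross_validated_accuracy`, and `evaluate_fold` are hypothetical helpers rather than anything taken from the paper.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Assumption: a simple shuffled split; the paper may have used a
    different (e.g. stratified) scheme.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first (n_samples % k)
    # folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validated_accuracy(n_samples, evaluate_fold, k=5, seed=0):
    """Average the per-fold accuracy returned by `evaluate_fold`.

    `evaluate_fold(train_idx, test_idx)` is a hypothetical callback that
    trains a model on the training indices and returns its accuracy on
    the held-out indices.
    """
    scores = [evaluate_fold(train, test)
              for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

In this sketch every sample is held out exactly once, and the reported figure would be the mean of the five per-fold accuracies.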
