Abstract

Recent silent speech recognition (SSR) studies based on surface electromyography (sEMG) have mostly classified a finite number of words or phrases, capturing far less temporal semantic information than sequential decoding at a fine-grained syllable or phoneme level. This paper presents a syllable-level sequential decoding method using a transformer model for sEMG-based SSR. The proposed method consists of a transformer model and a language model. The input sEMG data were first translated into a sequence of syllable-level decisions by the transformer model. The language model then refined these syllable-level decisions into a final syllable sequence approximating natural language. To verify the effectiveness of the proposed method, experimental data were recorded with two high-density 64-channel electrode arrays from eight subjects while they subvocally read a corpus of 33 Chinese phrases generated from a dictionary of 82 syllables. The proposed method achieved the lowest character error rate (5.14 ± 3.28%) and the highest phrase recognition accuracy (96.37 ± 2.06%), significantly outperforming other common methods for sEMG-based SSR. These findings demonstrate the feasibility and usability of the proposed method for practical SSR applications.
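To make the two-stage pipeline concrete, the following is a minimal PyTorch sketch under assumptions not stated in the abstract: the sEMG recording is already segmented into feature frames, the transformer emits per-frame syllable logits decoded in CTC fashion, and the language-model stage is reduced to a placeholder rescoring hook. Names such as SyllableTransformer, greedy_ctc_decode, and lm_rescore are hypothetical and do not come from the paper.

```python
# Minimal sketch of a transformer-plus-language-model syllable decoder for
# sEMG-based SSR (illustrative only; not the authors' implementation).
import torch
import torch.nn as nn

NUM_SYLLABLES = 82          # syllable dictionary size from the abstract
BLANK = NUM_SYLLABLES       # extra CTC blank label (assumption)
NUM_CHANNELS = 128          # two 64-channel high-density electrode arrays


class SyllableTransformer(nn.Module):
    """Maps a sequence of sEMG feature frames to per-frame syllable logits."""

    def __init__(self, d_model=256, nhead=4, num_layers=4):
        super().__init__()
        self.proj = nn.Linear(NUM_CHANNELS, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, NUM_SYLLABLES + 1)  # +1 for CTC blank

    def forward(self, x):                   # x: (batch, time, channels)
        h = self.encoder(self.proj(x))
        return self.head(h)                 # (batch, time, syllables + blank)


def greedy_ctc_decode(logits):
    """Collapse repeated frame decisions and drop blanks to get syllable IDs."""
    ids = logits.argmax(dim=-1).tolist()
    out, prev = [], None
    for i in ids:
        if i != prev and i != BLANK:
            out.append(i)
        prev = i
    return out


def lm_rescore(syllable_ids):
    """Placeholder for the language-model stage that turns raw syllable
    decisions into a sequence approximating natural language."""
    return syllable_ids  # a real system would rescore n-best hypotheses


model = SyllableTransformer()
emg = torch.randn(1, 200, NUM_CHANNELS)      # fake 200-frame sEMG recording
with torch.no_grad():
    logits = model(emg)[0]
syllables = lm_rescore(greedy_ctc_decode(logits))
print(syllables)
```

In this sketch the transformer handles frame-to-syllable translation and the language-model hook stands in for the second stage; in practice that stage would rescore competing syllable hypotheses against a Chinese language model rather than pass them through unchanged.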
